How to select specific columns in an R data frame?

📅  Last modified: 2022-05-13 01:54:52.980000             🧑  Author: Mango

This article discusses how to select specific columns from a data frame in the R programming language.

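Method 1: Select specific columns by column name using base R

In this method, the user writes a character vector of the required column names inside square brackets after the name of the data frame. The result is a data frame that contains only those columns.

Syntax:

data_frame[c("column_name_1", "column_name_2", ...)]

Example: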

R
# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by column name using base R
gfg[c("b", "d", "e")]
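
As a small supplement to the example above, the sketch below illustrates two properties of name-based selection: the result follows the order in which the names are requested, and asking for a name that does not exist raises an error. The column name "z", the `wanted` vector, and the use of intersect() as a guard are illustrative assumptions, not part of the original example.

```r
# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Columns come back in the order requested, not in their original order
reordered <- gfg[c("e", "b")]
names(reordered)

# Requesting a column that does not exist raises an error;
# "z" is a hypothetical name used only for illustration
result <- tryCatch(gfg[c("b", "z")], error = function(err) "error")

# One possible guard: keep only the requested names that are present
wanted <- c("b", "z", "d")
safe <- gfg[intersect(wanted, names(gfg))]
names(safe)
```

Whether silently skipping a missing name is acceptable depends on the task; in many scripts, letting the error surface is the safer choice.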


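Method 2: Select specific columns by column index using base R

In this method, the user writes a vector of the required column indices inside square brackets after the name of the data frame. Column indices in R start at 1.

Syntax:

data_frame[c(column_index_1, column_index_2, ...)]

Example: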

R

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by column index using base R
gfg[c(2, 4, 5)]
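
Because indices count from 1, positions 2, 4 and 5 correspond to columns b, d and e in this data frame. The following sketch checks that correspondence in both directions; the variable names `by_index`, `by_name` and `positions` are introduced here for illustration only.

```r
# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Look up which column names sit at positions 2, 4 and 5
names(gfg)[c(2, 4, 5)]

# Selecting by index and by the matching names should give the same columns
by_index <- gfg[c(2, 4, 5)]
by_name  <- gfg[c("b", "d", "e")]
all.equal(by_index, by_name)

# The reverse lookup: find the positions of known column names
positions <- match(c("b", "d", "e"), names(gfg))
positions
```

Index-based selection is compact, but it breaks silently if the column order changes, so name-based selection is often the more robust choice.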

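Method 3: Select specific columns by subsetting the data by column name

In this method, the user specifies a character vector containing the names of the columns to extract and places it after a comma inside the square brackets that follow the data frame name. Leaving the row position before the comma empty keeps all rows.

Syntax:

data_frame[, c("column_name_1", "column_name_2", ...)]

Example: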

R

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by subsetting the data by column name
gfg[, c("b", "d", "e")]
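
One practical difference between the comma form and the bracket-only form of Methods 1 and 2 appears when a single column is requested: by default the comma form simplifies the result to a plain vector. The sketch below shows this and the `drop = FALSE` argument that prevents it; the variable names are illustrative.

```r
# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# With the comma form, a single column is simplified to a plain vector
as_vector <- gfg[, "b"]
is.data.frame(as_vector)

# drop = FALSE keeps the result as a one-column data frame
as_frame <- gfg[, "b", drop = FALSE]
is.data.frame(as_frame)

# The form without a comma returns a data frame even for one column
no_comma <- gfg["b"]
is.data.frame(no_comma)
```

When the number of selected columns is not known in advance, adding `drop = FALSE` keeps the return type consistent.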

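Method 4: Select specific columns by subsetting the data by column index

In this method, the user specifies an integer vector containing the indices of the columns to extract and places it after a comma inside the square brackets that follow the data frame name.

Syntax:

data_frame[, c(column_index_1, column_index_2, ...)]

Example: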

R

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by subsetting the data by column index
gfg[, c(2, 4, 5)]

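Method 5: Select specific columns using the select argument of the subset() function

subset() function: returns the subset of a data frame that meets the given conditions. Its select argument specifies which columns to keep.

Example: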

R

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns with the select argument of subset()
subset(gfg, select = c("b", "d", "e"))
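
The select argument is more flexible than the quoted-name form used above. The following sketch shows a few variations that subset() accepts: unquoted names, a range of adjacent columns, exclusion with a minus sign, and a row condition combined with column selection. The condition `a > 5` and the variable names are chosen only for illustration.

```r
# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# select also accepts unquoted column names
unquoted <- subset(gfg, select = c(b, d, e))

# A range of adjacent columns, here b through d
ranged <- subset(gfg, select = b:d)

# A minus sign drops a column instead of keeping it
without_a <- subset(gfg, select = -a)

# Row condition and column selection combined:
# keep rows where a is greater than 5, and only columns b, d and e
filtered <- subset(gfg, a > 5, select = c(b, d, e))
filtered
```

The R documentation describes subset() as a convenience function for interactive use; inside functions and packages, bracket indexing is usually the more predictable option.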

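Method 6: Select specific columns by column name using the dplyr package

In this method, the user first installs the dplyr package and loads it in the R session, then calls the select() function and passes the names of the required columns as arguments.

Syntax:

data_frame %>% select(column_name_1, column_name_2, ...)

Example: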

R

# Load the dplyr package
library("dplyr")

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by column name using dplyr
gfg %>% select(b, d, e)
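
The example above passes bare column names to select(). When the names are instead stored in a character vector, one common approach is to wrap the vector in all_of(), which dplyr makes available. The sketch below assumes dplyr is installed; the vector name `cols` is illustrative.

```r
# install.packages("dplyr")  # run once if the package is not installed
library(dplyr)

# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Column names held in a character vector
cols <- c("b", "d", "e")

# all_of() tells select() to treat the vector's contents as column names
picked <- gfg %>% select(all_of(cols))
names(picked)
```

all_of() raises an error if any name in the vector is missing from the data frame, which helps catch typos early.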

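Method 7: Select specific columns by column index using the dplyr package

In this method, the user first installs the dplyr package and loads it in the R session, then calls the select() function and passes the indices of the required columns as arguments.

Syntax:

data_frame %>% select(column_index_1, column_index_2, ...)

Example: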

R

# Load the dplyr package
library("dplyr")

# Create the data frame
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Select specific columns by column index using dplyr
gfg %>% select(2, 4, 5)
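
As a quick check that the index form and the name form of select() agree on this data frame, the sketch below compares the two results and also shows a range of positions. It assumes dplyr is installed; the variable names are illustrative.

```r
library(dplyr)

# Same data frame as in the example above
gfg <- data.frame(a = c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                  b = c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                  c = c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                  d = c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                  e = c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))

# Positions 2, 4 and 5 correspond to columns b, d and e
by_index <- gfg %>% select(2, 4, 5)
by_name  <- gfg %>% select(b, d, e)
all.equal(by_index, by_name)

# A range of positions selects adjacent columns, here b through d
ranged <- gfg %>% select(2:4)
names(ranged)
```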
