How to Conditionally Remove Rows from a Data Frame in R?

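In this article, we discuss how to conditionally remove rows from a data frame in the R programming language. Preparing data for analysis often means dropping some rows based on their values. To do this, we write a logical condition and keep only the rows that satisfy it; every row that does not satisfy the condition is removed.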

Method 1: Removing rows by a single condition

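To remove rows from a data frame based on a single condition, index the data frame with square brackets [ ] and place the condition before the comma. This subsets the data frame, keeping only the rows for which the condition is TRUE; all other rows are dropped. The comma matters: with a single index and no comma, R treats the index as a column selection rather than a row selection.

Syntax:

df[ conditional-statement, ]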

where,

df: determines the dataframe to be used.
conditional-statement: determines the condition for filtering data.

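Example:

In this example, all data points whose x value is less than or equal to zero are removed, so only the rows with x > 0 are kept.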

R

# create sample data (random, so the values differ on each run)
sample_data <- data.frame(x = rnorm(10),
                          y = rnorm(10, 20))

# print data
print("Sample Data:")
sample_data

# filter data: keep only the rows where x is greater than 0
new_data <- sample_data[sample_data$x > 0, ]

# print data
print("Filtered Data:")
new_data

Output:

Sample Data:
           x        y
1   1.0356175 19.36691
2  -0.2071733 21.38060
3  -1.3449463 19.56191
4  -0.5313073 19.49135
5   1.7880192 19.52463
6  -0.7151556 19.93802
7   1.5074344 20.82541
8  -1.0754972 20.59427
9  -0.2483219 19.21103
10 -0.8892829 18.93114
Filtered Data:
        x        y
1 1.035617 19.36691
5 1.788019 19.52463
7 1.507434 20.82541
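
Because the example above uses random data, its output changes on every run. As a minimal sketch, the same single-condition filter can be tried on a small fixed data frame. The data frame df and its values are made up for illustration and are not part of the original example. The sketch also shows one caveat: when the condition evaluates to NA for a row, [ ] returns a row filled with NA, and wrapping the condition in which() is one common way to avoid that.

```r
# Hypothetical fixed data so the result is reproducible (not from the article)
df <- data.frame(x = c(1.5, -0.2, NA, 0.7, -1.1),
                 y = c(19.4, 21.4, 19.6, 20.8, 18.9))

# Keep rows where x > 0; the trailing comma selects rows and keeps all columns
kept <- df[df$x > 0, ]

# The comparison NA > 0 is NA, so the result contains an extra all-NA row
print(kept)

# which() returns only the positions where the condition is TRUE,
# so rows with an NA condition are dropped
kept_clean <- df[which(df$x > 0), ]
print(kept_clean)
```

Whether rows with a missing x should be dropped or kept depends on the analysis; the which() form simply makes the choice explicit.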

Method 2: Removing rows by multiple conditions

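To remove rows from a data frame based on several conditions, index the data frame with square brackets [ ] and combine the conditions before the comma with the AND (&) or OR (|) operator. This subsets the data frame, keeping only the rows for which the combined condition is TRUE.

Syntax:

df[ conditional-statement & conditional-statement, ]
df[ conditional-statement | conditional-statement, ]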

where,

df: determines the dataframe to be used.
conditional-statement: determines the condition for filtering data.

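Example:

In this example, only the rows where x is greater than 0 and y is greater than 0.4 are kept; a row is removed if it fails either condition. Because y is drawn from a normal distribution with mean 20, the y condition is met by every row in this sample, so in effect only the rows with x less than or equal to zero are removed.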

R

# create sample data (random, so the values differ on each run)
sample_data <- data.frame(x = rnorm(10),
                          y = rnorm(10, 20))

# print data
print("Sample Data:")
sample_data

# filter data: keep only the rows where x > 0 and y > 0.4
new_data <- sample_data[sample_data$x > 0 & sample_data$y > 0.4, ]

# print data
print("Filtered Data:")
new_data

Output:

Sample Data:
              x        y
1  -1.091923406 21.14056
2   0.870826346 20.83627
3   0.285727039 20.89009
4  -0.224661613 20.04137
5   0.653407459 19.01530
6   0.001760769 18.36436
7  -0.572623161 19.72691
8  -0.092852143 19.58567
9  -0.423781311 19.99482
10 -1.332091619 19.36539
Filtered Data:
            x        y
2 0.870826346 20.83627
3 0.285727039 20.89009
5 0.653407459 19.01530
6 0.001760769 18.36436
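
The example above only combines conditions with &. As a rough sketch of the other combinations, the block below applies AND, OR, and a negated condition to a small fixed data frame. The data frame df, its values, and the thresholds are assumptions chosen for illustration. The negated form is handy when the rule is stated as "delete the rows where ..." rather than "keep the rows where ...".

```r
# Hypothetical fixed data (not from the article)
df <- data.frame(x = c(-1.0, 0.9, 0.3, -0.2, 0.6),
                 y = c(21.1, 20.8, 18.5, 18.0, 19.2))

# AND: keep rows that satisfy both conditions
both <- df[df$x > 0 & df$y > 19, ]

# OR: keep rows that satisfy at least one of the conditions
either <- df[df$x > 0 | df$y > 19, ]

# Negation: delete the rows where x < 0 and y < 19 both hold,
# keeping everything else
dropped <- df[!(df$x < 0 & df$y < 19), ]

print(both)
print(either)
print(dropped)
```

Note that & and | work element by element on whole columns; the && and || operators are meant for single values and are not suitable for row filtering.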

Method 3: Removing rows with the subset() function

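The subset() function creates a subset of a given data frame based on a condition. It can be used to remove or select rows with a single condition or with several combined conditions. subset() is part of base R, so no third-party package needs to be loaded.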

Syntax:

subset( df, conditional-statement )

where,

df: determines the dataframe to be used.
conditional-statement: determines the condition for filtering data.

Example:

In this example, subset() keeps only the rows where x is greater than 19 and y is less than 49; all other rows are removed.

R

# create sample data (random, so the values differ on each run)
sample_data <- data.frame(x = rnorm(10, 20),
                          y = rnorm(10, 50))

# print data
print("Sample Data:")
sample_data

# filter data: inside subset(), columns can be referenced by name
new_data <- subset(sample_data, x > 19 & y < 49)

# print data
print("Filtered Data:")
new_data

Output:

Sample Data:
          x        y
1  20.38324 51.02714
2  20.36595 50.64125
3  20.44204 52.28653
4  20.34413 50.08981
5  20.51478 49.53950
6  20.35667 48.88035
7  19.89415 49.78139
8  21.61003 49.43653
9  20.66579 49.14877
10 20.70246 50.06486
Filtered Data:
         x        y
6 20.35667 48.88035
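
As a small reproducible sketch of subset(), the block below runs the same kind of filter on fixed data that includes a missing value. The data frame df and its values are hypothetical. It illustrates two documented behaviors of subset(): rows where the condition is NA are treated as not matching and are dropped, and the optional select argument restricts which columns are returned.

```r
# Hypothetical fixed data (not from the article)
df <- data.frame(x = c(20.4, 18.7, NA, 20.3, 19.9),
                 y = c(51.0, 48.2, 48.5, 48.9, 49.8))

# Columns are referenced by name, without the df$ prefix
kept <- subset(df, x > 19 & y < 49)

# The row with x = NA is dropped instead of coming back as an all-NA row
print(kept)

# select limits the returned columns; the result is still a data frame
kept_x <- subset(df, x > 19 & y < 49, select = x)
print(kept_x)
```

Compared with square-bracket indexing, this form is shorter to type, at the cost of relying on non-standard evaluation of the column names, which makes it better suited to interactive work than to code inside functions.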