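Find columns and rows with NA in an R DataFrame
A data frame is made up of cells, called data elements, arranged in a table of rows and columns. A data frame can hold data elements of different data types as well as missing values, which are denoted by NA.
Approach
- Declare the data frame
- Use functions to locate the NA values
- Store the positions
- Display the result
The following built-in R functions can be used together to find the row and column pairs of NA values in a data frame. The is.na() function returns a logical result of TRUE and FALSE values (a logical matrix when applied to a data frame) indicating which elements are NA. The which() function is then applied to report the positions of the TRUE elements. The following snippet finds these index positions.
Syntax: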
which(is.na(dataframe), arr.ind=TRUE)
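To see what each step of this expression contributes, the sketch below runs is.na() on its own before handing the result to which(). The data frame `df`, its column names `x` and `y`, and the variables `na_mask` and `na_pos` are made up for illustration only.

```r
# Hypothetical two-row data frame used only for illustration
df <- data.frame(x = c(1, NA), y = c(NA, "b"))

# is.na() on a data frame returns a logical matrix of the same shape
na_mask <- is.na(df)
print(na_mask)

# which() with arr.ind = TRUE turns the TRUE cells into (row, col) pairs
na_pos <- which(na_mask, arr.ind = TRUE)
print(na_pos)
```

Keeping the logical matrix in a variable makes it easy to inspect before the positions are extracted.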
Example:
R
# declaring data frame
data_frame = data.frame(
col1 = c(1,NA),
col2 = c(7:8),
col3 = c(NA,NA))
# printing original data frame
print ("Original Data Frame")
print(data_frame)
# extracting positions of NA values
print ("Row and Col positions of NA values")
which(is.na(data_frame), arr.ind=TRUE)
Output
[1] "Original Data Frame"
  col1 col2 col3
1    1    7   NA
2   NA    8   NA
[1] "Row and Col positions of NA values"
     row col
[1,]   2   1
[2,]   1   3
[3,]   2   3
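When the goal is a list of the affected rows and columns rather than individual cells, the position matrix can be summarised further. The following is a minimal sketch, assuming the same data frame as in the example above. The variable names `na_pos`, `na_rows` and `na_cols` are illustrative and do not come from the original example.

```r
# Same data frame as in the example above
data_frame <- data.frame(
  col1 = c(1, NA),
  col2 = c(7:8),
  col3 = c(NA, NA))

# (row, col) index pairs of every NA cell
na_pos <- which(is.na(data_frame), arr.ind = TRUE)

# Distinct row numbers that contain at least one NA
na_rows <- sort(unique(na_pos[, "row"]))

# Names of the columns that contain at least one NA
na_cols <- names(data_frame)[sort(unique(na_pos[, "col"]))]

print(na_rows)
print(na_cols)
```

Here both rows contain an NA, and the affected columns are col1 and col3.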
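If arr.ind=TRUE is not specified, which() returns a single element number for each NA value instead of a row and column pair. The elements are counted column by column: all of column 1 from top to bottom first, then column 2, and so on.
Example:
R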
# declaring data frame
data_frame = data.frame(
col1 = c("A",NA,"B"),
col2 = c(100:102),
col3 = c(NA,NA,9))
# printing original data frame
print ("Original Data Frame")
print(data_frame)
# finding NA values as single element numbers, counted down
# each column: row1/col1 is element 1, row2/col1 is element 2.
print ("Row and Col positions of NA values")
which(is.na(data_frame))
Output
[1] "Original Data Frame"
  col1 col2 col3
1    A  100   NA
2 <NA>  101   NA
3    B  102    9
[1] "Row and Col positions of NA values"
[1] 2 7 8
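The element numbers 2, 7 and 8 can be mapped back to row and column pairs, which also makes the column-by-column counting visible. The sketch below uses base R's arrayInd() for the conversion. The variable names `na_mask`, `linear_idx` and `pairs` are illustrative assumptions.

```r
# Same data frame as in the example above
data_frame <- data.frame(
  col1 = c("A", NA, "B"),
  col2 = c(100:102),
  col3 = c(NA, NA, 9))

na_mask <- is.na(data_frame)

# Element numbers of the NA cells, counted down each column in turn
linear_idx <- which(na_mask)

# Convert the element numbers back to (row, col) pairs
pairs <- arrayInd(linear_idx, dim(na_mask))
print(pairs)
```

Element 2 is row 2 of column 1, while elements 7 and 8 are rows 1 and 2 of column 3, since each column of this data frame contributes three elements.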
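Missing values can also be located within a single column by passing dataframe$colname as the argument in the snippet above. If the column contains no NA values, integer(0) is returned as the output.
Example:
R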
# declaring data frame
data_frame = data.frame(
col1 = c("A",NA,"B"),
col2 = c(100:102),
col3 = c(NA,NA,9))
# printing original data frame
print ("Original Data Frame")
print(data_frame)
# extracting positions of NA values in column 1
# (arr.ind has no effect here because a column is a plain vector)
print ("NA values in column 1")
which(is.na(data_frame$col1), arr.ind=TRUE)
# extracting positions of NA values in column 2
print ("NA values in column 2")
which(is.na(data_frame$col2), arr.ind=TRUE)
Output
[1] "Original Data Frame"
  col1 col2 col3
1    A  100   NA
2 <NA>  101   NA
3    B  102    9
[1] "NA values in column 1"
[1] 2
[1] "NA values in column 2"
integer(0)
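Checking columns one at a time becomes tedious for wider data frames. As a possible alternative that is not part of the original example, the sketch below counts the NA values of every column in one call using base R's colSums() on the logical matrix from is.na(). The variable name `na_counts` is illustrative.

```r
# Same data frame as in the example above
data_frame <- data.frame(
  col1 = c("A", NA, "B"),
  col2 = c(100:102),
  col3 = c(NA, NA, 9))

# Number of NA values in each column (TRUE counts as 1)
na_counts <- colSums(is.na(data_frame))
print(na_counts)

# Names of the columns that contain at least one NA
print(names(na_counts)[na_counts > 0])
```

A column with a count of 0, such as col2 here, corresponds to the integer(0) result shown above.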