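How to Fix: invalid factor level, NA generated in R
In this article, we will walk through examples of how to fix the warning "invalid factor level, NA generated" in R.
R produces this warning when a program tries to assign a value to a factor variable and that value is not among the factor's predefined levels. The complete warning message looks like this: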
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
invalid factor level, NA generated
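The same behavior can be reproduced on a bare factor vector, without a data frame. The following is a minimal sketch; the variable name `grades` and its values are made up for illustration and are not part of the original example.

```r
# Hypothetical factor; its set of levels is fixed when it is created
grades <- factor(c('A', 'B', 'A'))
levels(grades)        # "A" "B"

# 'C' is not one of the levels, so R emits the warning
# and stores NA in that position instead of 'C'
grades[2] <- 'C'
is.na(grades[2])      # TRUE
```

The assignment does not stop the script; it only warns, which is why the resulting NA is easy to overlook.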
When this warning might occur
Let's create a data frame.
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
# Display the data frame
dataframe
# Display the structure of the data frame
str(dataframe)
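Output:
In this example, the team variable takes only three values: "Alpha", "Beta", and "Charlie". Now we will try to insert an additional row at the end of the data frame with the team name "Gamma".
Example: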
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
# Add a new row to the end of the data frame; 'Gamma' is not a level of team.
# list() keeps points numeric (c() would coerce 99 to a string)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)
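Output:
R produces a warning message. This is because the value "Gamma" is not one of the existing levels of the team column. Note that this is only a warning: R still appends a new row to the end of the data frame, but the team cell in that row holds NA instead of "Gamma".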
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
# Add a new row to the end of the data frame (triggers the warning).
# list() keeps points numeric (c() would coerce 99 to a string)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)
# Display the data frame; team is NA in the new row
dataframe
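Output:
How to avoid the warning:
We can avoid this warning by first converting the factor variable to a character variable, appending the new row, and then converting the variable back to a factor.
Example: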
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
# Convert the team variable to character
dataframe$team <- as.character(dataframe$team)
# Insert a new row at the end of the data frame.
# list() keeps points numeric (c() would coerce 99 to a string)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)
# Convert the team variable back to a factor
dataframe$team <- as.factor(dataframe$team)
# Display the data frame
dataframe
Output:
As you can see in the output, both the warning and the NA value are gone. Now let's display the structure of the modified data frame:
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
# Convert the team variable to character
dataframe$team <- as.character(dataframe$team)
# Insert a new row at the end of the data frame.
# list() keeps points numeric (c() would coerce 99 to a string)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)
# Convert the team variable back to a factor
dataframe$team <- as.factor(dataframe$team)
# Display the structure of the data frame
str(dataframe)
Output:
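As an alternative to the round trip through character, the new level can be registered on the factor before the row is inserted. This is a sketch of a different technique from the one used in this article, shown on a shortened three-row version of the data frame; it relies only on base R's `levels()`.

```r
# Shortened version of the example data frame (illustrative values)
dataframe <- data.frame(team = factor(c('Alpha', 'Beta', 'Charlie')),
                        points = c(96, 86, 93))

# Register 'Gamma' as a valid level before using it
levels(dataframe$team) <- c(levels(dataframe$team), 'Gamma')

# Insert the new row; list() keeps each column's type
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)

# team stays a factor (now with 4 levels) and points stays numeric
str(dataframe)
```

This keeps the column a factor throughout, which may be preferable when the order of the existing levels matters, since `as.factor()` on a character vector re-sorts the levels alphabetically.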