📜  How to Fix: invalid factor level, NA generated in R

📅  Last modified: 2022-05-13 01:54:33.253000             🧑  Author: Mango

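In this article, we walk through an example of how to fix the R warning "invalid factor level, NA generated".

R issues this warning when you try to assign a value to a factor variable and that value is not one of the factor's predefined levels. The complete warning message is: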

Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated 
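
The same warning can be reproduced on a bare factor vector, outside any data frame. The following is a minimal sketch; the variable name `sizes` and its values are illustrative assumptions, not part of the original example.

```r
# Minimal reproduction on a bare factor (the name `sizes` is illustrative)
sizes <- factor(c("A", "B", "A"))

# The levels are the only values this factor accepts
levels(sizes)

# "C" is not an existing level, so this assignment raises the warning
sizes[2] <- "C"

# The element is stored as NA, and the set of levels is unchanged
is.na(sizes[2])
levels(sizes)
```

Because the assignment only warns, a script keeps running with an NA silently stored in the factor.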

When the warning can occur

Let's create a data frame.

R
# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                       'Beta', 'Beta',
                                       'Charlie', 'Charlie',
                                       'Charlie')),
                         points=c(96, 91, 86, 89, 93, 87, 91))
  
# Display the data frame
dataframe
  
# Display the structure of the data frame
str(dataframe)


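Output: a data frame with seven rows; str() reports team as a factor with 3 levels.

In this example, the team variable takes only three values: "Alpha", "Beta", and "Charlie". Now we will try to append a row to the end of the data frame whose team name is "Gamma".

Example: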

R

# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha', 
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                 points=c(96, 91, 86, 89, 93, 87, 91))
  
# Add a new row to the end of the data frame.
# list() keeps points numeric; c('Gamma', 99) would coerce 99 to the string "99".
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)

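Output:

R issues the "invalid factor level, NA generated" warning, because "Gamma" is not one of the existing levels of the team column. Note that this is only a warning, not an error: R still appends the new row to the data frame, but the team cell in that row holds NA instead of "Gamma".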

R

# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                 points=c(96, 91, 86, 89, 93, 87, 91))
  
# Add a new row to the end of the data frame.
# list() keeps points numeric; c('Gamma', 99) would coerce 99 to the string "99".
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
  
# Display the dataframe
dataframe

Output: the data frame now has eight rows, and the team value in the new row is NA.
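
To confirm programmatically that the row was added but the team value was lost, one can inspect the new row directly. This is a small sketch on a shortened, three-row version of the data frame; `new_row` is an illustrative name, and `suppressWarnings()` is used only to silence the warning that is expected here.

```r
# Shortened version of the example data frame (three rows instead of seven)
dataframe <- data.frame(team = factor(c("Alpha", "Beta", "Charlie")),
                        points = c(96, 86, 93))

# Index of the row about to be appended (illustrative helper variable)
new_row <- nrow(dataframe) + 1

# Append the row; the warning is expected, so it is suppressed here
suppressWarnings(dataframe[new_row, ] <- list("Gamma", 99))

# The row exists, but its team value is NA
nrow(dataframe)
is.na(dataframe$team[new_row])

# "Gamma" was never added to the factor levels
"Gamma" %in% levels(dataframe$team)
```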

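How to avoid the warning:

We can avoid this warning by first converting the factor variable to a character variable, adding the new row, and then converting the variable back to a factor.

Example: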

R

# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta', 
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                 points=c(96, 91, 86, 89, 93, 87, 91))
  
# Convert team variable to character
dataframe$team <- as.character(dataframe$team)
  
# Insert a new row at the end of the data frame.
# list() keeps points numeric; c('Gamma', 99) would coerce 99 to the string "99".
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
  
# Transform team variable back to factor
dataframe$team <- as.factor(dataframe$team)
  
# Display the data frame
dataframe

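Output: the new row shows Gamma in the team column, and no warning is raised.

As the output shows, both the warning and the NA value are gone. Now let's display the structure of the modified data frame: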

R

# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha', 
                                      'Beta', 'Beta', 
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                 points=c(96, 91, 86, 89, 93, 87, 91))
  
# Convert team variable to character
dataframe$team <- as.character(dataframe$team)
  
# Insert a new row at the end of the data frame.
# list() keeps points numeric; c('Gamma', 99) would coerce 99 to the string "99".
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
  
# Transform team variable back to factor
dataframe$team <- as.factor(dataframe$team)
  
# Display the structure of the data frame
str(dataframe)

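Output: str() reports eight observations, with team as a factor that now has 4 levels, including "Gamma".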
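
As an alternative to the character round trip, the new level can be registered on the factor before the row is added. This is a different technique from the one used in the article, sketched here on a shortened, three-row version of the data frame; it relies only on base R's `levels()` replacement function.

```r
# Shortened version of the example data frame (three rows instead of seven)
dataframe <- data.frame(team = factor(c("Alpha", "Beta", "Charlie")),
                        points = c(96, 86, 93))

# Register "Gamma" as a valid level before assigning it
levels(dataframe$team) <- c(levels(dataframe$team), "Gamma")

# Append the row; list() keeps team and points in their own types
dataframe[nrow(dataframe) + 1, ] <- list("Gamma", 99)

# team is still a factor, now with four levels, and no NA was generated
levels(dataframe$team)
anyNA(dataframe$team)
```

This keeps the column a factor throughout and preserves the existing level order, whereas converting back with as.factor() rebuilds the levels in sorted order.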