Replacing Specific Values in a Column Using Regular Expressions in R

Last modified: 2022-05-13 01:54:48.162000    Author: Mango

In this article, we discuss how to replace specific values in a data frame column in the R programming language.

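Method 1: Using the sub() function

The sub() function in R replaces the first match of a pattern in each string with a replacement string. It is vectorized, so it can be applied to a character vector or to a data frame column in a single call, which makes it convenient for large datasets. The pattern can match a single character, a word, or a phrase of several words.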
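As a minimal sketch of the call shape, the snippet below applies sub() to a plain character vector. The vector `fruits` and the replacement text are made up for illustration and do not come from the examples in this article.

```r
# Hypothetical sample vector, used only for illustration
fruits <- c("apple pie", "banana split", "apple and apple")

# sub(pattern, replacement, x): replace the first match in each element
replaced <- sub("apple", "cherry", fruits)

print(replaced)
# [1] "cherry pie"       "banana split"     "cherry and apple"
```

Elements that do not match the pattern are returned unchanged, and the third element keeps its second "apple" because only the first match is replaced.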

Example 1:

R
# declaring dataframe
data_frame <- data.frame(col1 = c("geeks","for","geek","friends"))
  
print ("Original DataFrame")
print (data_frame)
  
# Replace every value that starts with "ge" ("^ge.*" spans the whole value)
data_frame$col1 <- sub("^ge.*", "new_String", data_frame$col1)
  
print ("Modified DataFrame")
print (data_frame)


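Output

[1] "Original DataFrame"
     col1
1   geeks
2     for
3    geek
4 friends
[1] "Modified DataFrame"
        col1
1 new_String
2        for
3 new_String
4    friends

sub() replaces only the first occurrence of the pattern in each value; later occurrences in the same value are left unchanged.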

Example 2:

R

# declaring dataframe
data_frame <- data.frame(col1 = c("geeks for geeks interviews",
                                  "suitable 4 placements",
                                  "interviews placements interviews"))
  
print ("Original DataFrame")
print (data_frame)
  
# Only the first "interviews" in each value is replaced
data_frame$col1 <- sub("interviews", "programming", data_frame$col1)
  
print ("Modified DataFrame")
print (data_frame)

Output

[1] "Original DataFrame"
                              col1
1       geeks for geeks interviews
2            suitable 4 placements
3 interviews placements interviews
[1] "Modified DataFrame"
                               col1
1       geeks for geeks programming
2             suitable 4 placements
3 programming placements interviews

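Method 2: Using the gsub() function

The gsub() function takes the same arguments as sub() and also interprets the pattern as a regular expression. The difference is that gsub() replaces every match in each value, not just the first.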
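The difference between the two functions is easiest to see side by side. The following is a small sketch on a made-up vector `x` (hypothetical data, not part of the article's examples):

```r
# Hypothetical sample strings, used only for illustration
x <- c("a-b-c", "no dashes", "1-2")

first_only  <- sub("-", "+", x)   # first match in each element
all_matches <- gsub("-", "+", x)  # every match in each element

print(first_only)
# [1] "a+b-c"     "no dashes" "1+2"
print(all_matches)
# [1] "a+b+c"     "no dashes" "1+2"
```

When a value contains at most one match, as in the second and third elements, both functions return the same result.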



Example 1:

R

# declaring dataframe
data_frame <- data.frame(col1 = c("geeks","for","friends","gap","geek"))
  
print ("Original DataFrame")
print (data_frame)
  
# Replace every value that starts with "ge"
data_frame$col1 <- gsub("^ge.*", "new_String", data_frame$col1)
  
print ("Modified DataFrame")
print (data_frame)

Output

[1] "Original DataFrame"
     col1
1   geeks
2     for
3 friends
4     gap
5    geek
[1] "Modified DataFrame"
        col1
1 new_String
2        for
3    friends
4        gap
5 new_String

gsub() can also be used to modify every value in a column, for example by adding a prefix to each one.

Example 2:

R

# declaring dataframe
data_frame <- data.frame(col1 = c("geeks","for","friends","gap","geek"))
  
print ("Original DataFrame")
print (data_frame)
  
# "^" matches the empty position at the start of each value, so this adds a prefix
data_frame$col1 <- gsub("^", "GFG ", data_frame$col1)
  
print ("Modified DataFrame")
print (data_frame)

Output:

[1] "Original DataFrame"
     col1
1   geeks
2     for
3 friends
4     gap
5    geek
[1] "Modified DataFrame"
         col1
1   GFG geeks
2     GFG for
3 GFG friends
4     GFG gap
5    GFG geek
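Adding a prefix does not strictly require a regular expression. As an alternative sketch, the snippet below compares an anchor-based replacement with plain concatenation using paste0(). Note that paste0() is swapped in here for comparison and is not the method used in the example above; the variable names `with_regex` and `with_paste` are illustrative.

```r
data_frame <- data.frame(col1 = c("geeks", "for", "friends", "gap", "geek"))

# Anchor-based prefix: "^" matches the empty position at the start of each value
with_regex <- sub("^", "GFG ", data_frame$col1)

# Plain concatenation produces the same strings without a regex
with_paste <- paste0("GFG ", data_frame$col1)

print(with_regex)
print(identical(with_regex, with_paste))
```

The regex form is handy when the prefix is one step in a longer chain of pattern-based edits; for a simple prefix, concatenation may be easier to read.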

It can also be used to remove digits from string values.

Example 3:

R

# declaring dataframe
data_frame <- data.frame(col1 = c("geeks12 is good","suitable 4 placements",
                                  "love you 2 much"))
  
print ("Original DataFrame")
print (data_frame)
  
# "[0-9]+" matches each run of one or more digits
data_frame$col1 <- gsub("[0-9]+", "", data_frame$col1)
  
print ("Modified DataFrame")
print (data_frame)

Output:

[1] "Original DataFrame"
                   col1
1       geeks12 is good
2 suitable 4 placements
3       love you 2 much
[1] "Modified DataFrame"
                  col1
1        geeks is good
2 suitable  placements
3       love you  much
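Removing a standalone number leaves two consecutive spaces behind, as in "suitable  placements" above. One possible follow-up, sketched here as an assumption about what a reader may want and not as part of the original example, is a second gsub() pass that collapses repeated spaces, followed by trimws() to strip any leading or trailing whitespace. The names `no_digits` and `cleaned` are illustrative.

```r
data_frame <- data.frame(col1 = c("geeks12 is good",
                                  "suitable 4 placements",
                                  "love you 2 much"))

# Step 1: drop every run of digits
no_digits <- gsub("[0-9]+", "", data_frame$col1)

# Step 2: collapse runs of spaces left behind, then trim both ends
cleaned <- trimws(gsub(" +", " ", no_digits))

print(cleaned)
# [1] "geeks is good"       "suitable placements" "love you much"
```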