Python | Pandas Series.str.extractall()

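Series.str gives access to the values of a Series as strings and lets you apply string methods to them. The pandas Series.str.extractall() function extracts the capture groups of the regular expression pat as columns of a DataFrame. For each subject string in the Series, it extracts the groups from every match of pat, not just the first one. When each subject string in the Series has exactly one match, extractall(pat).xs(0, level='match') is the same as extract(pat).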
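The equivalence with extract(pat) can be checked with a minimal sketch like the one below. The Series contents, labels, and pattern are hypothetical and chosen so that every string contains exactly one match.

```python
import pandas as pd

# Hypothetical data: every string contains exactly one letter-digit pair
s = pd.Series(['a1', 'b2', 'c3'], index=['x', 'y', 'z'])
pat = r'([a-z])(\d)'

# extract() keeps only the first match of each string
first_only = s.str.extract(pat)

# extractall() adds a 'match' index level; selecting match 0 drops it again
all_matches = s.str.extractall(pat)
match_zero = all_matches.xs(0, level='match')

# Compare the two frames
print(first_only.equals(match_zero))
```

If some strings had several matches, extract(pat) would still return one row per string, while extractall(pat) would return one row per match.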

Syntax: Series.str.extractall(pat, flags=0)

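Parameters :
pat : Regular expression pattern with capturing groups.
flags : Flags from the re module, for example re.IGNORECASE. The default, 0, means no flags.

Returns : DataFrame with one row per match and one column per capture group. Its index is a MultiIndex whose last level is named 'match'.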
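To illustrate the flags parameter, here is a small sketch that runs the same pattern with and without re.IGNORECASE. The sample strings are made up for this illustration.

```python
import re
import pandas as pd

# Hypothetical data mixing upper-case and lower-case vowels
s = pd.Series(['Apple', 'ORANGE'])

# Without flags, [aeiou] only matches lower-case vowels
case_sensitive = s.str.extractall(r'([aeiou])')

# With re.IGNORECASE, upper-case vowels are matched as well
case_insensitive = s.str.extractall(r'([aeiou])', flags=re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```

The second call should return more rows, since 'ORANGE' has no lower-case vowel and only contributes matches once case is ignored.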


Example #1: Use the Series.str.extractall() function to extract all groups from the strings in the underlying data of the given Series object.

# importing pandas as pd
import pandas as pd
  
# importing re for regular expressions
import re
  
# Creating the Series
sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'])
  
# Creating the index
idx = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']
  
# set the index
sr.index = idx
  
# Print the series
print(sr)

Output:

Now we will use the Series.str.extractall() function to extract all groups from the strings in the given Series object.

# extract every lowercase vowel followed by
# any single character
result = sr.str.extractall(pat=r'([aeiou].)')
  
# print the result
print(result)

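Output:

As the output shows, Series.str.extractall() returned a DataFrame with one column per capture group and one row per match. 'New_York', for instance, contributes two rows ('ew' and 'or'), while 'Tokyo' contributes only 'ok' because its final 'o' has no character after it.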
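Because the result is indexed by the original label and a match counter, the matches can be regrouped per city. The following is one possible sketch; the variable name matches_per_city is introduced here for illustration only.

```python
import pandas as pd

sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'],
               index=['City 1', 'City 2', 'City 3', 'City 4', 'City 5'])
result = sr.str.extractall(r'([aeiou].)')

# The result index has two levels: the original label and a 'match' counter
print(result.index.names)

# Collect the matches of capture group 0 into one list per original label
matches_per_city = result[0].groupby(level=0).apply(list)
print(matches_per_city)
```

Grouping on level 0 is just one way to reshape the result; result.unstack('match') would instead spread the matches across columns.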

Example #2: Use the Series.str.extractall() function to extract all groups from the strings in the underlying data of the given Series object.

# importing pandas as pd
import pandas as pd
  
# importing re for regular expressions
import re
  
# Creating the Series
sr = pd.Series(['Mike', 'Alessa', 'Nick', 'Kim', 'Britney'])
  
# Creating the index
idx = ['Name 1', 'Name 2', 'Name 3', 'Name 4', 'Name 5']
  
# set the index
sr.index = idx
  
# Print the series
print(sr)

Output:

Now we will use the Series.str.extractall() function to extract all groups from the strings in the given Series object.

# extract every capital letter followed by
# 'i' and any one other character
result = sr.str.extractall(pat=r'([A-Z]i.)')
  
# print the result
print(result)

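Output:

As the output shows, Series.str.extractall() returned a DataFrame with a column for the extracted group. Only 'Mike', 'Nick' and 'Kim' contain a match ('Mik', 'Nic' and 'Kim'); 'Alessa' and 'Britney' have none, so they do not appear in the result.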
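Since strings without a match are dropped from the result, it can be useful to see how many matches each original label produced. The sketch below is one way to do that, assuming the same Series as in this example; the variable name counts is introduced here for illustration.

```python
import pandas as pd

sr = pd.Series(['Mike', 'Alessa', 'Nick', 'Kim', 'Britney'],
               index=['Name 1', 'Name 2', 'Name 3', 'Name 4', 'Name 5'])
result = sr.str.extractall(r'([A-Z]i.)')

# Count matches per original label; labels with no match are missing here
counts = result.groupby(level=0).size()

# Reindex against the original Series so unmatched labels show up as 0
counts = counts.reindex(sr.index, fill_value=0)
print(counts)
```

Reindexing against sr.index restores the labels that extractall() left out, which makes the unmatched strings easy to spot.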