Python | Pandas Series.str.findall()

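Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.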

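The Pandas str.findall() method searches each string in a Series for a substring or regular expression pattern. It differs from the str.find() method: instead of returning an index, it returns a list of the matched substrings, so the length of each list equals the number of times the pattern occurs in that string.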
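
The difference between the two methods is easiest to see side by side. The following is a minimal sketch; the three names are sample values assumed for illustration, not rows loaded from the CSV.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only
names = pd.Series(["Avery Bradley", "Jae Crowder", "John Holland"])

# str.find() returns the lowest index of the substring, or -1 if it is absent
positions = names.str.find("r")

# str.findall() returns a list holding every match found in each string
matches = names.str.findall("r")

print(positions.tolist())  # [3, 5, -1]
print(matches.tolist())    # [['r', 'r'], ['r', 'r'], []]
```

A string with no match yields an empty list rather than -1.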

Syntax: Series.str.findall(pat, flags=0)

Parameters:
pat: Substring or regular expression pattern to search for
flags: Regex flags from the re module (re.A, re.S, re.L, re.M, re.I, re.X). The default is 0, which means no flags. The re module must be imported to use them.

Return Type: Series of lists of strings.
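
Because pat is interpreted as a regular expression, special characters such as "." have their regex meaning unless escaped. The sketch below illustrates this on assumed sample values (including one missing entry); it is not part of the original examples.

```python
import re
import pandas as pd

# Hypothetical sample values, including one missing entry
text = pd.Series(["Avery Bradley", "R.J. Hunter", None])

# pat is a regular expression: [A-Z] matches any uppercase ASCII letter
capitals = text.str.findall(r"[A-Z]")

# A bare "." would match every character, so escape it to match literal dots
dots = text.str.findall(re.escape("."))

print(capitals.tolist())
print(dots.tolist())
```

Here the missing entry stays missing in both results instead of raising an error.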

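The CSV used in the code can be downloaded from the URL passed to pd.read_csv() in the examples. The data frame used in the following examples contains data on some NBA players.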

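Example #1: Searching for a character in a string

In this example, the str.findall() method searches the Name column for 'r', and the output is stored in a new column. Before any operation, rows with null values are removed with .dropna() to avoid errors.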

# import the pandas module
import pandas as pd

# build the data frame from the CSV
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")

# remove rows with null values to avoid errors
data.dropna(inplace=True)

# string to search for
search = 'r'

# store the matches in a new column
data["Findall(name)"] = data["Name"].str.findall(search)

# display the first 10 rows
data.head(10)

Output:
As the output shows, the number of 'r' entries returned for each name equals the number of times 'r' occurs in that name.
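
One way to check this claim without inspecting the output by eye is to compare the length of each returned list with the result of str.count(). This is a small sketch on assumed sample names rather than the full CSV.

```python
import pandas as pd

# Hypothetical sample names, assumed for illustration only
names = pd.Series(["Avery Bradley", "Jae Crowder", "John Holland"])

# List of every 'r' found in each name
matches = names.str.findall("r")

# Length of each list, i.e. how many matches were returned
match_counts = matches.map(len)

# str.count() counts occurrences of the same pattern directly
direct_counts = names.str.count("r")

print(match_counts.tolist())   # [2, 2, 0]
print(direct_counts.tolist())  # [2, 2, 0]
```

If only the count is needed, str.count() avoids building the intermediate lists.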

Example #2: Searching for a character and passing the IGNORECASE flag

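In this example, the Name column is searched for 'a' with the IGNORECASE flag passed. The re module must be imported for this. The Series returned by str.findall() is stored in a new column.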

# import the pandas module
import pandas as pd

# import the regex module
import re

# build the data frame from the CSV
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")

# remove rows with null values to avoid errors
data.dropna(inplace=True)

# string to search for
search = 'a'

# store the case-insensitive matches in a new column
data["Findall(name)"] = data["Name"].str.findall(search, flags=re.I)

# display the first 10 rows
data.head(10)

Output:
As the output shows, the first row alone returns both 'A' and 'a', because the IGNORECASE flag (re.I) was passed.
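
The effect of the flag can also be seen by running the same search with and without it. The sketch below uses assumed sample names instead of the CSV, and additionally shows the inline (?i) form, which is standard regex syntax rather than something used in the original example.

```python
import re
import pandas as pd

# Hypothetical sample names, assumed for illustration only
names = pd.Series(["Avery Bradley", "Jae Crowder", "Amir Johnson"])

# Default search is case-sensitive: only lowercase 'a' matches
case_sensitive = names.str.findall("a")

# With re.IGNORECASE (alias re.I), 'A' and 'a' both match
ignore_case = names.str.findall("a", flags=re.IGNORECASE)

# The inline (?i) flag inside the pattern has the same effect
inline_flag = names.str.findall("(?i)a")

print(case_sensitive.tolist())  # [['a'], ['a'], []]
print(ignore_case.tolist())     # [['A', 'a'], ['a'], ['A']]
print(inline_flag.tolist())     # [['A', 'a'], ['a'], ['A']]
```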