Filtering by String Contains in pandas - Python (1) - 芒果文档


📅 Last modified: 2023-12-03 15:18:15.590000 🧑 Author: Mango

Filtering by String Contains in pandas - Python

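In data processing, you often need to filter for the rows that contain a particular string. pandas handles this with its string methods, which mirror Python's built-in string methods and can be used to operate on and filter the strings in a pandas DataFrame.

Filtering rows that contain a specific string

To filter for the rows of a pandas DataFrame whose values in a given column contain a specific string, use the .str.contains() method.

Here, .str is the accessor that exposes string methods on a string column (a Series), and .contains() tests whether each value contains a given substring, returning a boolean Series that can be used to index the DataFrame.

For example, suppose we have the following DataFrame df: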

| name           |
|----------------|
| John Appleseed |
| Apple          |
| Jane Doe       |
| John Doe       |

To select the rows whose name column contains the substring "John" or "Doe", you can write:

df[df['name'].str.contains("John|Doe")]

The result is:

| name           |
|----------------|
| John Appleseed |
| Jane Doe       |
| John Doe       |

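Here, | means "or". By default, .str.contains() treats the pattern as a regular expression, so every row that contains either "John" or "Doe" is kept.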
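The following is a minimal, self-contained sketch of this example. The DataFrame is rebuilt by hand from the table, and the variable names `mask` and `result` are illustrative. The `na=False` argument is an addition not in the original example: it makes missing values count as non-matches, so the mask stays purely boolean.

```python
import pandas as pd

# Rebuild the example DataFrame from the table.
df = pd.DataFrame({"name": ["John Appleseed", "Apple", "Jane Doe", "John Doe"]})

# str.contains() returns a boolean Series. The pattern is a regular
# expression by default, so "John|Doe" matches either "John" or "Doe".
# na=False (an addition in this sketch) treats missing values as non-matches.
mask = df["name"].str.contains("John|Doe", na=False)

# Index the DataFrame with the mask to keep only the matching rows.
result = df[mask]
print(result)
```

If the search string should be matched literally (for instance because it contains characters such as `|` or `.`), passing `regex=False` turns off regular-expression interpretation.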

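Case-insensitive filtering

Sometimes you need to ignore case when filtering. For that, use the case parameter of .str.contains().

For example, suppose we have the following DataFrame df: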

| name           |
|----------------|
| John Appleseed |
| APPLE          |
| Jane Doe       |
| John Doe       |

To select the rows whose name column contains the substring "apple" regardless of case, you can write:

df[df['name'].str.contains("apple", case=False)]

The result is:

| name           |
|----------------|
| John Appleseed |
| APPLE          |

Here, case=False makes the match case-insensitive.
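As a runnable sketch of the case-insensitive filter, the block below rebuilds the example DataFrame by hand. The second variant, using `flags=re.IGNORECASE`, is not part of the original article; it is shown only as an alternative spelling that should give the same rows when the pattern is treated as a regular expression.

```python
import re

import pandas as pd

# Rebuild the example DataFrame from the table.
df = pd.DataFrame({"name": ["John Appleseed", "APPLE", "Jane Doe", "John Doe"]})

# case=False matches "apple" regardless of letter case.
result = df[df["name"].str.contains("apple", case=False)]
print(result)

# Alternative (not in the original): pass a regex flag instead of case=False.
same = df[df["name"].str.contains("apple", flags=re.IGNORECASE)]
```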

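Filtering by column name

Sometimes you only want the columns whose names contain a specific string. For that, use the .filter() method.

For example, suppose we have the following DataFrame df: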

| name  | city            |
|-------|-----------------|
| Alice | San Francisco   |
| Bob   | New York City   |
| Carol | Los Angeles, CA |

To select the columns whose names contain "city", you can write:

df.filter(like='city')

The result is:

| city            |
|-----------------|
| San Francisco   |
| New York City   |
| Los Angeles, CA |

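Here, like='city' keeps only the columns whose names contain "city". Note that .filter() looks at the labels, not at the cell values, so all rows are retained.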
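A short sketch of column-name filtering, assuming the DataFrame from the table is rebuilt by hand. The list-comprehension variant and the names `city_cols` and `same_cols` are illustrative additions, included only to show what `like` is doing to the column labels.

```python
import pandas as pd

# Rebuild the example DataFrame from the table.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "city": ["San Francisco", "New York City", "Los Angeles, CA"],
})

# like="city" keeps every column whose label contains the substring "city".
city_cols = df.filter(like="city")
print(city_cols)

# Plain-Python equivalent (illustrative): test each column label directly.
same_cols = df[[col for col in df.columns if "city" in col]]
```

For more flexible label matching, .filter() also accepts a `regex` argument in place of `like`.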