📜  How to Use the "NOT IN" Filter in Pandas?

📅  Last modified: 2022-05-13 01:55:21.893000             🧑  Author: Mango


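In this article, we discuss how to apply a NOT IN filter in Pandas. Pandas has no dedicated NOT IN operator. The filter is built by negating a membership test: isin() returns True for every value that appears in a given list, and the ~ operator inverts that result, so the final mask is True where the value is absent and False where it is present.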
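As a minimal sketch of that idea, the snippet below contrasts Python's own not in, which tests a single value, with the element-wise Pandas equivalent. The names excluded and subjects are illustrative and not part of the article's examples.

```python
import pandas as pd

# illustrative values, not taken from the article's dataframe code
excluded = ['R', 'php']

# plain Python: `not in` tests one value at a time
single_check = 'python' not in excluded
print(single_check)  # True

# pandas: isin() tests every element, and ~ negates the result
subjects = pd.Series(['python', 'R', 'php'])
not_in = ~subjects.isin(excluded)
print(not_in.tolist())  # [True, False, False]
```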

Let's create a sample dataframe.

Python3
# import pandas module
import pandas as pd
  
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
  
# display (print so that it also works outside a notebook)
print(data1)



Output:

Sample dataframe

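Method 1: Using the NOT IN filter on one column

The isin() method checks each value of a column against a list and returns a boolean Series that is True where the value is found. Prefixing that Series with ~ inverts it, so indexing the dataframe with the result keeps only the rows whose column value is not in the list.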

Python3

# import pandas module
import pandas as pd
  
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
  
# consider a list
list1 = ['harsha', 'jyothika']
  
# filter in name column
print(data1[~data1['name'].isin(list1)])
print("============")
  
# consider a list
list2 = ['R']
  
  
# filter in subject1 column
print(data1[~data1['subject1'].isin(list2)])
print("============")
  
# consider a list
list3 = [96, 89]
  
# filter in marks column
print(data1[~data1['marks'].isin(list3)])

Output:

NOT IN filter on a single column
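To make the single-column filter easier to follow, here is a sketch that splits the one-line expression into its three steps. The variable names in_mask, not_in_mask and result are illustrative. The data is the article's sample dataframe.

```python
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

list1 = ['harsha', 'jyothika']

# step 1: True where the name IS in the list
in_mask = data1['name'].isin(list1)

# step 2: invert the mask, True where the name is NOT in the list
not_in_mask = ~in_mask

# step 3: keep only the rows where the inverted mask is True
result = data1[not_in_mask]

print(in_mask.tolist())      # [False, True, True]
print(not_in_mask.tolist())  # [True, False, False]
print(result)
```

Only the row for sravan remains, because it is the only name missing from list1.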

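Method 2: Using the NOT IN filter on multiple columns

To apply the filter across several columns, select them with double brackets, [[...]], separated by commas, and call isin() on the resulting dataframe. any(axis=1) then returns True for each row in which at least one of the selected columns holds a listed value, and ~ drops those rows.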

Python3

# import pandas module
import pandas as pd
  
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
  
# consider a list
list1 = ['harsha', 'jyothika', 96]
  
# filter in name and marks column
print(data1[~data1[['name', 'marks']].isin(list1).any(axis=1)])
print("============")
  
# consider a list
list2 = ['R', 'sravan']
  
# filter in name and subject1 column
print(data1[~data1[['subject1', 'name']].isin(list2).any(axis=1)])

Output:

NOT IN filter on multiple columns
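The choice of any(axis=1) matters: it removes a row as soon as one selected column matches. As a hedged comparison, the sketch below places it next to all(axis=1), which removes a row only when every selected column matches. The list values is hypothetical, chosen so that the two variants give different results.

```python
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

# hypothetical list, chosen so that any() and all() differ
values = ['sravan', 'harsha', 96]

# one boolean per cell: True where the cell value is in the list
cell_mask = data1[['name', 'marks']].isin(values)

# any(axis=1): drop a row if at least one selected column matches
drop_if_any = data1[~cell_mask.any(axis=1)]

# all(axis=1): drop a row only if every selected column matches
drop_if_all = data1[~cell_mask.all(axis=1)]

print(drop_if_any['name'].tolist())  # ['jyothika']
print(drop_if_all['name'].tolist())  # ['harsha', 'jyothika']
```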

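Method 3: Using numpy with the NOT IN filter

This works like the methods above, except that the membership test is done by NumPy: np.isin() takes the column and the list and returns a boolean array, which is negated with ~ and used to index the dataframe.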

Python3

# import numpy and pandas modules
import numpy as np
import pandas as pd
  
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
  
# consider a list
list1 = ['harsha', 'jyothika', 96]
  
# filter in name column (print so that it also works outside a notebook)
print(data1[~np.isin(data1['name'], list1)])

Output:

numpy with the NOT IN filter
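np.isin() also accepts an invert parameter, which can replace the explicit ~. The following sketch, with illustrative variable names, checks that both spellings give the same mask on the sample dataframe.

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

list1 = ['harsha', 'jyothika']

# variant used in the article: negate the result with ~
mask_negated = ~np.isin(data1['name'], list1)

# alternative: let np.isin() do the inversion itself
mask_inverted = np.isin(data1['name'], list1, invert=True)

print(mask_negated.tolist())                        # [True, False, False]
print(np.array_equal(mask_negated, mask_inverted))  # True
print(data1[mask_inverted])
```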