Last modified: 2020-10-29 01:56:16             Author: Mango
The drop_duplicates() method handles a common data-cleaning task: removing duplicate rows from a DataFrame.
DataFrame.drop_duplicates(subset=None, keep='first', inplace=False)
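subset
: A column label or a list of column labels. Only these columns are considered when identifying duplicates. The default is None, which means all columns are compared.
keep
: Which occurrence of a duplicate to retain. 'first' (the default) keeps the first occurrence, 'last' keeps the last, and False drops every duplicated row.
inplace
: A Boolean, False by default. If True, the duplicate rows are removed from the DataFrame itself instead of from a copy.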
Depending on the arguments passed, it returns a new DataFrame with the duplicate rows removed, or None when inplace=True.
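The difference between the two return behaviours can be sketched as follows. This is a minimal illustration reusing the sample data from this article; the variable names deduped and result are only for demonstration.

```python
import pandas as pd

info = pd.DataFrame({"Name": ["Parker", "Smith", "William", "Parker"],
                     "Age": [21, 32, 29, 21]})

# Default call: the original frame is left untouched and a new
# DataFrame without the duplicate row is returned.
deduped = info.drop_duplicates()
print(len(info), len(deduped))   # 4 3

# inplace=True: the frame itself is modified and the call returns None.
result = info.drop_duplicates(inplace=True)
print(result)                    # None
print(len(info))                 # 3
```

Because the in-place call returns None, assigning its result back to info would discard the DataFrame, so use either the assignment form or inplace=True, not both.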
import pandas as pd
emp = {"Name": ["Parker", "Smith", "William", "Parker"],
"Age": [21, 32, 29, 21]}
info = pd.DataFrame(emp)
print(info)
Output
Name Age
0 Parker 21
1 Smith 32
2 William 29
3 Parker 21
import pandas as pd
emp = {"Name": ["Parker", "Smith", "William", "Parker"],
"Age": [21, 32, 29, 21]}
info = pd.DataFrame(emp)
info = info.drop_duplicates()
print(info)
Output
Name Age
0 Parker 21
1 Smith 32
2 William 29
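The subset and keep parameters from the signature can be tried out with a short sketch like the one below. The staff data is hypothetical (the second "Parker" is given a different age so that the rows are not full duplicates), and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical data: "Parker" appears twice, but with different ages,
# so no two rows are identical across all columns.
staff = pd.DataFrame({"Name": ["Parker", "Smith", "William", "Parker"],
                      "Age": [21, 32, 29, 40]})

# Comparing all columns (the default) removes nothing here.
all_cols = staff.drop_duplicates()

# Comparing only "Name" treats rows 0 and 3 as duplicates;
# keep='first' (the default) retains row 0.
by_name_first = staff.drop_duplicates(subset="Name")

# keep='last' retains row 3 instead.
by_name_last = staff.drop_duplicates(subset=["Name"], keep="last")

# keep=False drops every row whose "Name" occurs more than once.
no_repeats = staff.drop_duplicates(subset="Name", keep=False)

print(list(all_cols.index))       # [0, 1, 2, 3]
print(list(by_name_first.index))  # [0, 1, 2]
print(list(by_name_last.index))   # [1, 2, 3]
print(list(no_repeats.index))     # [1, 2]
```

Note that the surviving rows keep their original index labels; call reset_index(drop=True) on the result if a fresh 0..n-1 index is wanted.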