Python | Pandas Index.drop_duplicates()

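Python is a great language for data analysis, largely because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.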

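The Pandas Index.drop_duplicates() function returns an Index with duplicate values removed. Its keep parameter controls which occurrences survive: we can keep the first or the last occurrence of each duplicated value, or drop every value that occurs more than once.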

Syntax: Index.drop_duplicates(keep='first')

Parameters :
keep : {'first', 'last', False}, default 'first'
-> 'first' : Drop duplicates except for the first occurrence.
-> 'last' : Drop duplicates except for the last occurrence.
-> False : Drop all duplicates.

Returns : deduplicated : Index
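
As a rough illustration of the three keep options described above, the following sketch applies each one to the same Index. The variable names (keep_first, keep_last, keep_none) are chosen here for illustration only, and the values are converted to plain lists so the comparison does not depend on how a given pandas version prints an Index.

```python
import pandas as pd

# Same sample values used in the examples of this article
idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])

# keep='first': each repeated label survives at its first position
keep_first = idx.drop_duplicates(keep="first")

# keep='last': each repeated label survives at its last position
keep_last = idx.drop_duplicates(keep="last")

# keep=False: any label that occurs more than once is dropped entirely
keep_none = idx.drop_duplicates(keep=False)

print(keep_first.tolist())  # [10, 11, 5, 22, 3]
print(keep_last.tolist())   # [10, 22, 5, 3, 11]
print(keep_none.tolist())   # [10, 22, 3]
```

Note that the order of the surviving labels follows their position in the original Index, which is why 'first' and 'last' give the same set of labels in a different order.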


Example #1: Use the Index.drop_duplicates() function to drop all duplicate values except the first occurrence.

# importing pandas as pd
import pandas as pd
  
# Creating the Index
idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])
  
# Print the Index
print(idx)

Output:

Now let's drop every duplicate value in the index except its first occurrence.

# Drop duplicate labels, keeping only the
# first occurrence of each
print(idx.drop_duplicates(keep='first'))

Output:
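
The result of this call can also be checked programmatically. The sketch below is a minimal check, assuming the same sample values as above; the name deduped is introduced here for illustration. It also shows that drop_duplicates() returns a new Index and leaves the original one untouched.

```python
import pandas as pd

idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])

# drop_duplicates() returns a new Index; idx itself is not modified
deduped = idx.drop_duplicates(keep="first")

print(deduped.tolist())   # [10, 11, 5, 22, 3]
print(deduped.is_unique)  # True
print(len(idx))           # 8
```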

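As the output shows, the Index.drop_duplicates() function removed the repeated labels from the index, keeping only the first occurrence of each.

Example #2: Use the Index.drop_duplicates() function to drop every label that occurs more than once, keeping none of the duplicated values in the index.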

# importing pandas as pd
import pandas as pd
  
# Creating the Index
idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])
  
# Print the Index
print(idx)

Output:

Now let's drop every value in the index that occurs more than once.

# Drop every label that occurs more than once
print(idx.drop_duplicates(keep=False))

Output:

As the output shows, every value that occurred more than once has been removed from the index, leaving only 10, 22 and 3.
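
One way to see why only those three labels remain is to look at the boolean mask returned by Index.duplicated(), which accepts the same keep argument. The sketch below is an illustration rather than a description of how pandas implements drop_duplicates() internally; the names mask, via_mask and via_method are introduced here.

```python
import pandas as pd

idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])

# With keep=False, every occurrence of a repeated label is marked True
mask = idx.duplicated(keep=False)

# Selecting the unmarked positions keeps only labels that occur once
via_mask = idx[~mask]
via_method = idx.drop_duplicates(keep=False)

print(mask.tolist())                # [False, True, True, True, False, True, False, True]
print(via_mask.tolist())            # [10, 22, 3]
print(via_mask.equals(via_method))  # True
```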