📜  Concatenate Pandas DataFrames Without Duplicates

📅  Last modified: 2022-05-13 01:55:37.473000             🧑  Author: Mango

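In this article, we will concatenate two DataFrames using the pandas module.

To do this, we call pandas.concat() to stack the two DataFrames and then chain DataFrame.drop_duplicates() on the result, written as pandas.concat().drop_duplicates().

Step-by-step approach:

  • Import the module.
  • Load two sample DataFrames into variables.
  • Concatenate the DataFrames using pandas.concat().drop_duplicates().
  • Display the resulting DataFrame.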
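
The steps above can be sketched with the two calls written separately, which makes the intermediate result visible. The frames `left` and `right` and their column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Two small hypothetical frames that share one identical row: (2, "b")
left = pd.DataFrame({"key": [1, 2], "value": ["a", "b"]})
right = pd.DataFrame({"key": [2, 3], "value": ["b", "c"]})

# Step 1: stack the rows of both frames; the shared row now appears twice
stacked = pd.concat([left, right])

# Step 2: drop every row that repeats an earlier row in all columns
deduplicated = stacked.drop_duplicates()

print(len(stacked))       # 4 rows before removing duplicates
print(len(deduplicated))  # 3 rows afterwards
```

By default, drop_duplicates() compares all columns and keeps the first occurrence of each repeated row.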

Below are some examples that show how to concatenate two DataFrames without duplicates using the pandas module:

Example 1:

Python3
# Importing pandas library
import pandas as pd
 
# loading dataframes
dataframe1 = pd.DataFrame({'columnA': [20, 30, 40],
                           'columnB': [200, 300, 400]})
 
dataframe2 = pd.DataFrame({'columnA': [50, 20, 60],
                           'columnB': [500, 200, 600]})
 
# Concatenating dataframes without duplicates
new_dataframe = pd.concat([dataframe1, dataframe2]).drop_duplicates()
 
# Display concatenated dataframe
new_dataframe


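Output:

Here, pandas.concat() stacks the two DataFrames and drop_duplicates() removes the second copy of the row (20, 200), which appears in both. The index is not reset, so each remaining row keeps the label it had in its original DataFrame.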
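
To make the effect on the index concrete, the following sketch reruns Example 1 and prints the index labels and the values of columnA. It uses only the data from the example above.

```python
import pandas as pd

dataframe1 = pd.DataFrame({'columnA': [20, 30, 40],
                           'columnB': [200, 300, 400]})
dataframe2 = pd.DataFrame({'columnA': [50, 20, 60],
                           'columnB': [500, 200, 600]})

new_dataframe = pd.concat([dataframe1, dataframe2]).drop_duplicates()

# Labels come from the original frames, so 0 and 2 appear twice and 1 only once
print(new_dataframe.index.tolist())       # [0, 1, 2, 0, 2]
print(new_dataframe['columnA'].tolist())  # [20, 30, 40, 50, 60]
```

Repeated index labels are why the later examples call reset_index(drop=True).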

Example 2:

Python3

# Importing pandas library
import pandas as pd
 
# loading dataframes
dataframe1 = pd.DataFrame({'name': ['rahul', 'anjali', 'kajal'],
                           'age': [23, 28, 30]})
 
dataframe2 = pd.DataFrame({'name': ['devesh', 'rashi', 'anjali'],
                           'age': [20, 15, 28]})
 
# Concatenating two dataframes without duplicates
new_dataframe = pd.concat([dataframe1, dataframe2]).drop_duplicates()
 
# Resetting index
new_dataframe = new_dataframe.reset_index(drop=True)
 
# Display dataframe generated
new_dataframe

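Output:

As the output shows, the two DataFrames are concatenated with the duplicate row ('anjali', 28) removed, and the index is reset so that it runs from 0 to 4.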
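
As a possible shortcut, the index can be reset in the same call that removes the duplicates. This sketch assumes pandas 1.0 or newer, where drop_duplicates() accepts the ignore_index parameter. The variable names `compact` and `two_step` are introduced here for illustration.

```python
import pandas as pd

dataframe1 = pd.DataFrame({'name': ['rahul', 'anjali', 'kajal'],
                           'age': [23, 28, 30]})
dataframe2 = pd.DataFrame({'name': ['devesh', 'rashi', 'anjali'],
                           'age': [20, 15, 28]})

# Assumption: pandas >= 1.0, where drop_duplicates supports ignore_index
compact = pd.concat([dataframe1, dataframe2]).drop_duplicates(ignore_index=True)

# The two-step version used in Example 2, for comparison
two_step = (pd.concat([dataframe1, dataframe2])
            .drop_duplicates()
            .reset_index(drop=True))

print(compact.equals(two_step))  # True
```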

Example 3:

Python3

# Importing pandas library
import pandas as pd
 
# Loading dataframes
dataframe1 = pd.DataFrame({'empname': ['rohan', 'hina', 'alisa', ],
                           'department': ['IT', 'admin', 'finance', ],
                           'designation': ['Sr.developer', 'administrator', 'executive', ]})
 
dataframe2 = pd.DataFrame({'empname': ['rishi', 'huma', 'alisa', ],
                           'department': ['cyber security', 'HR', 'finance', ],
                           'designation': ['penetration tester', 'HR executive', 'executive', ]})
 
# Concatenating two dataframes without duplicates
new_dataframe = pd.concat([dataframe1, dataframe2]).drop_duplicates()
 
# Resetting index
new_dataframe = new_dataframe.reset_index(drop=True)
 
# Display dataframe generated
new_dataframe

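Output:

This is another example of concatenating two DataFrames without duplicates: the row for 'alisa' is identical in both DataFrames, so it is kept only once.

Figure: output dataset of Example 3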
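
In the examples above a row counts as a duplicate only when it matches an earlier row in every column. If rows should instead be treated as duplicates when they share a key column, the subset parameter of drop_duplicates() can be used. The sketch below uses hypothetical data in which 'alisa' appears with two different departments. It illustrates the parameter and is not part of the original examples.

```python
import pandas as pd

# Hypothetical data: 'alisa' appears in both frames with a different department
dataframe1 = pd.DataFrame({'empname': ['rohan', 'alisa'],
                           'department': ['IT', 'finance']})
dataframe2 = pd.DataFrame({'empname': ['rishi', 'alisa'],
                           'department': ['cyber security', 'HR']})

combined = pd.concat([dataframe1, dataframe2])

# Default: rows must match in every column, so both 'alisa' rows survive
full_row = combined.drop_duplicates().reset_index(drop=True)

# subset: compare only 'empname'; the first occurrence is kept by default
by_name = combined.drop_duplicates(subset=['empname']).reset_index(drop=True)

print(len(full_row))                # 4
print(by_name['empname'].tolist())  # ['rohan', 'alisa', 'rishi']
```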