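How to find the difference between two rows in Pandas?
In this article, we will see how to find the difference between two rows in Pandas.
A Pandas DataFrame is a two-dimensional, tabular data structure with labeled axes. During data analysis, you may need to compute the difference between two rows in order to compare them. This can be done with the pandas.DataFrame.diff() function, which subtracts from each element the element in an earlier row (by default, the row immediately before it).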
Syntax: pandas.DataFrame.diff(periods=1, axis=0)
Parameters:
- periods: Integer number of positions to shift when computing the difference. A negative value compares each row with a later row. Default value is 1
- axis: Whether the difference is taken over rows or columns. Accepts {0 or 'index': rows, 1 or 'columns': columns}. Default value is 0
Returns: A DataFrame of the same shape containing the differences
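As a minimal sketch of how the two parameters behave, the snippet below uses a small hypothetical frame (the name `demo` and its values are illustrative only and are not part of the examples in this article). It takes the difference across columns with axis=1 and compares each row with the following row using a negative periods value.

```python
import pandas as pd

# Hypothetical two-row frame used only for illustration
demo = pd.DataFrame({'x': [1, 4], 'y': [10, 20], 'z': [100, 50]})

# axis=1: within every row, each column minus the column to its left
col_diff = demo.diff(axis=1)
print(col_diff)

# periods=-1: each row minus the row that follows it
next_diff = demo.diff(periods=-1)
print(next_diff)
```

With axis=1 the first column has nothing to its left, so it is filled with NaN. With periods=-1 the last row has no following row, so it is the one filled with NaN.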
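Example 1: To test this function, we create a dummy DataFrame with 3 columns and 6 rows. The diff() function finds the difference between each row and the row before it, because periods defaults to 1. The row at index 0 has no previous row, so its values are NaN.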
Python3
# Importing Pandas Library
import pandas as pd
# Creating dummy DataFrame for testing
df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': [8, 18, 27, 20, 33, 49],
                   'c': [2, 24, 6, 16, 20, 52]})
# Printing DataFrame before applying diff function
print(df)
# Printing DataFrame after applying diff function
print("Difference: ")
print(df.diff())
Output:
a b c
0 1 8 2
1 2 18 24
2 3 27 6
3 4 20 16
4 5 33 20
5 6 49 52
Difference:
a b c
0 NaN NaN NaN
1 1.0 10.0 22.0
2 1.0 9.0 -18.0
3 1.0 -7.0 10.0
4 1.0 13.0 4.0
5 1.0 16.0 32.0
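One way to read the result above is that diff() behaves like subtracting a shifted copy of the frame from the original. The following sketch checks that reading on the same data. The variable name `manual` is introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': [8, 18, 27, 20, 33, 49],
                   'c': [2, 24, 6, 16, 20, 52]})

# shift(1) moves every row down by one position and leaves NaN in the first row;
# subtracting the shifted frame from the original gives row-minus-previous-row
manual = df - df.shift(1)

# Compare the manual result with the built-in diff()
print(manual.equals(df.diff()))
```

The same idea extends to other periods values by changing the argument passed to shift().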
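Example 2: Here periods is set to 2, so each row is compared with the row two positions before it. The first two rows have no such row, so their values are NaN.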
Python3
# Importing Pandas Library
import pandas as pd
# Creating dummy DataFrame for testing
df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': [8, 18, 27, 20, 33, 49],
                   'c': [2, 24, 6, 16, 20, 52]})
# Printing DataFrame before applying diff function
print(df)
# Printing DataFrame after applying diff function
print("Difference: ")
print(df.diff(periods=2))
Output:
a b c
0 1 8 2
1 2 18 24
2 3 27 6
3 4 20 16
4 5 33 20
5 6 49 52
Difference:
a b c
0 NaN NaN NaN
1 NaN NaN NaN
2 2.0 19.0 4.0
3 2.0 2.0 -8.0
4 2.0 6.0 14.0
5 2.0 29.0 36.0
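diff() always compares rows that are a fixed number of positions apart. When the two rows of interest are picked by label instead, one possible approach (a sketch that goes beyond the examples above, with the labels 4 and 1 chosen arbitrarily) is to select each row with loc and subtract them directly.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': [8, 18, 27, 20, 33, 49],
                   'c': [2, 24, 6, 16, 20, 52]})

# Each df.loc[label] is a Series indexed by column name, so the
# subtraction is done column by column: row 4 minus row 1
row_diff = df.loc[4] - df.loc[1]
print(row_diff)
```

The result is a Series with one entry per column rather than a full DataFrame, which is convenient when only a single pair of rows needs to be compared.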