Python | Pandas dataframe.reindex_axis()
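Python is a great language for doing data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.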
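Pandas dataframe.reindex_axis() function conforms the input object to a new index. It places NaN at locations that had no value in the previous index, and it also provides a way to fill those missing values. A new object is produced unless the new index is equivalent to the current one and copy=False.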
Syntax: DataFrame.reindex_axis(labels, axis=0, method=None, level=None, copy=True, limit=None, fill_value=nan)
Parameters :
labels : New labels / index to conform to. Preferably an Index object to avoid duplicating data
axis : {0 or 'index', 1 or 'columns'}
method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
copy : Return a new object, even if the passed indexes are the same
level : Broadcast across a level, matching Index values on the passed MultiIndex level
limit : Maximum number of consecutive elements to forward or backward fill
tolerance : Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.
Returns : reindexed : DataFrame
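The effect of the filling parameters listed above can be sketched as follows. This is a minimal illustration, not the article's own example: the frame and its integer index are made up, and the sketch calls DataFrame.reindex rather than reindex_axis, because reindex_axis was deprecated in pandas 0.21 and is no longer available from pandas 1.0. The method, limit, tolerance and fill_value arguments are accepted by reindex as well.

```python
import pandas as pd

# Hypothetical frame with an integer index so the fill direction is easy to follow.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=[0, 10, 20])

# 4 and 16 fall between existing labels; 40 and 50 lie beyond the last one.
new_index = [0, 4, 16, 40, 50]

# No method: labels absent from the old index become NaN.
plain = df.reindex(new_index)

# fill_value: use a constant instead of NaN for the absent labels.
filled_const = df.reindex(new_index, fill_value=0.0)

# method='ffill' takes the row of the previous existing label; limit=1 allows
# each existing row to be carried onto at most one consecutive new label.
ffill_limited = df.reindex(new_index, method="ffill", limit=1)

# method='nearest' with tolerance=5: a new label is matched only if an
# existing label lies within a distance of 5.
nearest_tol = df.reindex(new_index, method="nearest", tolerance=5)

print(plain, filled_const, ffill_limited, nearest_tol, sep="\n\n")
```

With limit=1, label 40 still receives the row of label 20, but label 50 stays NaN because it would be the second consecutive fill from the same row.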
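Example #1: Use reindex_axis() function to reindex the dataframe along the index axis. By default, labels in the new index that have no corresponding record in the dataframe are assigned NaN.
Note: We can fill in the missing values using the 'ffill' method.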
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]},
                   index =["A1", "A2", "A3", "A4", "A5"])
# Print the dataframe
df
Let's use the dataframe.reindex_axis() function to reindex the dataframe along the index axis.
# reindexing with new index values
df.reindex_axis(["A1", "A2", "A4", "A7", "A8"], axis = 0)
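Output :
Notice in the output that the rows for the new index labels are filled with NaN values. We can fill in the missing values using the 'ffill' method.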
# filling the missing values using ffill method
df.reindex_axis(["A1", "A2", "A4", "A7", "A8"],
                axis = 0, method ='ffill')
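Output :
Notice in the output that the new index labels "A7" and "A8" have been filled using the values of the "A5" row.
Example #2: Use reindex_axis() function to reindex the column axis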
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]},
                   index =["A1", "A2", "A3", "A4", "A5"])
# reindexing the column axis with
# old and new index values
df.reindex_axis(["A", "B", "D", "E"], axis = 1)
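Output :
Notice that the new column "E" holds NaN values after reindexing. We can handle the missing values while reindexing: with the ffill method, they are forward-filled from the preceding column label.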
# reindex the columns
# we fill the missing values using ffill method
df.reindex_axis(["A", "B", "D", "E"], axis = 1, method ='ffill')
Output :
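Because reindex_axis was deprecated in pandas 0.21 and is no longer available from pandas 1.0, the calls above raise an AttributeError on current releases. The following is a hedged sketch of how both examples might be reproduced with DataFrame.reindex on the same data. The variable names rows, rows_ffill, cols and cols_ffill are introduced here only for illustration.

```python
import pandas as pd

# Same data as in the two examples.
df = pd.DataFrame({"A": [1, 5, 3, 4, 2],
                   "B": [3, 2, 4, 3, 4],
                   "C": [2, 2, 7, 3, 4],
                   "D": [4, 3, 6, 12, 7]},
                  index=["A1", "A2", "A3", "A4", "A5"])

# Example #1 equivalent: reindex the rows (the first argument is the row index).
rows = df.reindex(["A1", "A2", "A4", "A7", "A8"])
rows_ffill = df.reindex(["A1", "A2", "A4", "A7", "A8"], method="ffill")

# Example #2 equivalent: reindex the columns via the columns keyword.
cols = df.reindex(columns=["A", "B", "D", "E"])
cols_ffill = df.reindex(columns=["A", "B", "D", "E"], method="ffill")

print(rows, rows_ffill, cols, cols_ffill, sep="\n\n")
```

In rows, the "A7" and "A8" rows are all NaN, while in rows_ffill they repeat the "A5" row. In cols, column "E" is all NaN, while in cols_ffill it repeats column "D". Filling with a method requires the existing labels to be sorted (monotonic), which holds for this data.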