Python | Pandas dataframe.reindex_axis()

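Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.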

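The Pandas dataframe.reindex_axis() function conforms the input object to a new index. It places NaN in locations that had no value in the previous index, and it can optionally fill those missing values using the logic selected by the method parameter. A new object is produced unless the new index is equivalent to the current one and copy=False.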
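As a minimal sketch of the NaN-filling behavior described above: reindex_axis() was deprecated in pandas 0.21.0 and removed in pandas 1.0, so this example assumes a recent pandas and uses DataFrame.reindex(), which conforms a frame to new labels in the same way. The small frame and its labels are illustrative only.

```python
import pandas as pd

# Illustrative frame; these values are not taken from the article.
df = pd.DataFrame({"A": [1, 5, 3]}, index=["A1", "A2", "A3"])

# "A9" does not exist in the old index, so its row is filled with NaN.
# Labels that are left out ("A2") simply do not appear in the result.
reindexed = df.reindex(["A1", "A3", "A9"])
print(reindexed)
```

The original df is left unchanged; reindexing returns a new object.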

Syntax: DataFrame.reindex_axis(labels, axis=0, method=None, level=None, copy=True, limit=None, fill_value=nan)

Parameters :
labels : New labels / index to conform to. Preferably an Index object to avoid duplicating data
axis : {0 or ‘index’, 1 or ‘columns’}
method : {None, ‘backfill’/‘bfill’, ‘pad’/‘ffill’, ‘nearest’}, optional. Method used to fill holes in the reindexed DataFrame
copy : Return a new object, even if the passed indexes are the same
level : Broadcast across a level, matching Index values on the passed MultiIndex level
limit : Maximum number of consecutive elements to forward or backward fill
tolerance : Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.

Returns : reindexed : DataFrame
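The effect of fill_value, limit, and tolerance is easier to see on a small case. The following sketch assumes a recent pandas, where reindex_axis() is no longer available, and therefore calls DataFrame.reindex(), which accepts the same parameters. The integer-indexed frame is hypothetical, chosen so that the distance between labels is meaningful.

```python
import pandas as pd

# Hypothetical frame with an integer index (not from the article).
df = pd.DataFrame({"A": [1, 5, 3]}, index=[10, 20, 30])

# fill_value replaces the default NaN for labels absent from the old index.
with_zero = df.reindex([10, 20, 40, 50], fill_value=0)

# limit=1 forward-fills only the first of the two consecutive new labels.
limited = df.reindex([10, 20, 40, 50], method="ffill", limit=1)

# tolerance=5 rejects nearest-label matches that are more than 5 units away.
nearest = df.reindex([12, 29, 45], method="nearest", tolerance=5)

print(with_zero, limited, nearest, sep="\n\n")
```

Here label 45 stays NaN in the last result because its closest original label, 30, is 15 units away.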


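Example #1: Use the reindex_axis() function to reindex the dataframe along the index axis. By default, labels in the new index that have no corresponding record in the dataframe are assigned NaN.
Note: We can fill in the missing values using the 'ffill' method.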

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe 
df = pd.DataFrame({"A":[1, 5, 3, 4, 2], 
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]},
                   index =["A1", "A2", "A3", "A4", "A5"])
  
# Print the dataframe
df

Now let's use the dataframe.reindex_axis() function to reindex the dataframe along the index axis.

# reindexing with new index values
df.reindex_axis(["A1", "A2", "A4", "A7", "A8"], axis = 0)

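Output:

Notice in the output that the rows for the new index labels are filled with NaN values. We can fill in these missing values using the 'ffill' method.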
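For readers on pandas 1.0 or later, where reindex_axis() has been removed, the same comparison can be sketched with DataFrame.reindex(). This is an equivalent under that assumption, not the article's original call. Note that a fill method only works when the existing index is monotonically increasing or decreasing, which holds for "A1" through "A5".

```python
import pandas as pd

# Same data as the article's example frame.
df = pd.DataFrame({"A": [1, 5, 3, 4, 2],
                   "B": [3, 2, 4, 3, 4],
                   "C": [2, 2, 7, 3, 4],
                   "D": [4, 3, 6, 12, 7]},
                  index=["A1", "A2", "A3", "A4", "A5"])

new_labels = ["A1", "A2", "A4", "A7", "A8"]

# Without a fill method, the unknown labels "A7" and "A8" become NaN rows.
plain = df.reindex(new_labels)

# With ffill, each unknown label takes the last preceding row, here "A5".
filled = df.reindex(new_labels, method="ffill")

print(plain)
print(filled)
```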

# filling the missing values using ffill method
df.reindex_axis(["A1", "A2", "A4", "A7", "A8"], 
                     axis = 0, method ='ffill')

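Output:

Notice in the output that the rows for the new index labels have been filled with the values of the "A5" row.

Example #2: Use the reindex_axis() function to reindex the column axis.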

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe 
df = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                   "B":[3, 2, 4, 3, 4],
                   "C":[2, 2, 7, 3, 4],
                   "D":[4, 3, 6, 12, 7]}, 
                   index =["A1", "A2", "A3", "A4", "A5"])
  
# reindexing the column axis with
# old and new index values
df.reindex_axis(["A", "B", "D", "E"], axis = 1)

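Output:

Notice that the new column contains NaN values after reindexing. We can handle these missing values while reindexing: with the ffill method, the missing values are forward-filled from the preceding column.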

# reindex the columns
# we fill the missing values using ffill method
df.reindex_axis(["A", "B", "D", "E"], axis = 1, method ='ffill')

Output:
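Since reindex_axis() is not available in pandas 1.0 and later, the column-axis example can be reproduced there with DataFrame.reindex() and axis="columns". This is a sketch under that assumption: with ffill, the new column "E" is filled from "D", the last existing column that sorts before it.

```python
import pandas as pd

# Same data as the article's example frame.
df = pd.DataFrame({"A": [1, 5, 3, 4, 2],
                   "B": [3, 2, 4, 3, 4],
                   "C": [2, 2, 7, 3, 4],
                   "D": [4, 3, 6, 12, 7]},
                  index=["A1", "A2", "A3", "A4", "A5"])

new_columns = ["A", "B", "D", "E"]

# Without a fill method, the new column "E" is all NaN.
plain = df.reindex(new_columns, axis="columns")

# With ffill, "E" copies the values of "D".
filled = df.reindex(new_columns, axis="columns", method="ffill")

print(plain)
print(filled)
```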