Reindexing in Pandas DataFrame

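Reindexing in Pandas changes the row and column labels of a DataFrame. It conforms the data to a new set of labels: existing labels keep their values, labels can be reordered or dropped, and labels that were not present before are filled with missing values. The same mechanism applies to the index objects attached to a pandas Series or a pandas DataFrame. Let's see how to reindex the columns and rows of a Pandas DataFrame.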

Reindexing the Rows

One or more rows can be reindexed with the reindex() method. Labels in the new index that do not exist in the DataFrame are assigned NaN by default.

Example #1:

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

print(df1)

# reorder the rows; every label already exists, so no NaN appears
print('\n\nDataframe after reindexing rows: \n',
      df1.reindex(['B', 'D', 'A', 'C', 'E']))

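Output (the values are random and differ on every run): the original DataFrame, followed by the same rows in the order B, D, A, C, E.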
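
Because the example above uses random values, a deterministic variant may make the reordering easier to verify. The following is only a sketch: the names `df_small` and `reordered` and the integer data are assumptions introduced for illustration, not part of the original example. It also shows that reindex() returns a new DataFrame and leaves the original unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 frame with predictable values 0..5
df_small = pd.DataFrame(np.arange(6).reshape(3, 2),
                        columns=['a', 'b'], index=['A', 'B', 'C'])

# Reorder the rows; every label already exists, so no NaN is introduced
reordered = df_small.reindex(['C', 'A', 'B'])

print(reordered)

# The original frame keeps its row order
print(df_small.index.tolist())
```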

Example #2:

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# new row index: 'U' and 'Z' do not exist in df1, 'D' and 'E' are left out
new_index = ['U', 'A', 'B', 'C', 'Z']

print(df1.reindex(new_index))

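Output (the values are random and differ on every run): rows A, B and C keep their values, the new rows U and Z contain only NaN, and rows D and E are dropped.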
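
The NaN behaviour described in this section can be checked with fixed data. This is a minimal sketch under stated assumptions: `df_int` and `result` are hypothetical names, and the integer values are chosen only so the result is predictable. It also illustrates a side effect: integer columns are converted to float so that they can hold NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical integer frame: A -> [0, 1], B -> [2, 3], C -> [4, 5]
df_int = pd.DataFrame(np.arange(6).reshape(3, 2),
                      columns=['a', 'b'], index=['A', 'B', 'C'])

# 'U' and 'Z' are not in the original index, so their rows are all NaN;
# 'C' is omitted from the new index, so that row is dropped
result = df_int.reindex(['U', 'A', 'B', 'Z'])

print(result)
print(result.isna().all(axis=1))  # flags the rows that were added
print(result.dtypes)              # columns are now float to hold NaN
```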

Reindexing the Columns Using the axis Keyword

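One or more columns can be reindexed by calling the reindex() method and specifying the axis to reindex. Labels in the new column index that do not exist in the DataFrame are assigned NaN by default.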

Example #1:

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# new column order; every label already exists in df1
column = ['e', 'a', 'b', 'c', 'd']

# reindex along the column axis
print(df1.reindex(column, axis='columns'))

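Output (the values are random and differ on every run): the same data with the columns in the order e, a, b, c, d.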

Example #2:

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# new column labels: 'g' and 'h' do not exist in df1, 'd' and 'e' are left out
column = ['a', 'b', 'c', 'g', 'h']

# reindex along the column axis
print(df1.reindex(column, axis='columns'))

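Output (the values are random and differ on every run): columns a, b and c keep their values, the new columns g and h contain only NaN, and columns d and e are dropped.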
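
Besides the axis keyword, reindex() also accepts the labels through the index= and columns= keywords. The sketch below compares the two spellings on fixed data; the names `df_kw`, `new_cols`, `by_axis`, `by_keyword` and `both` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: A -> [0, 1, 2], B -> [3, 4, 5]
df_kw = pd.DataFrame(np.arange(6).reshape(2, 3),
                     columns=['a', 'b', 'c'], index=['A', 'B'])

new_cols = ['c', 'a', 'x']  # 'x' is new, 'b' is left out

# Two spellings of the same column reindex
by_axis = df_kw.reindex(new_cols, axis='columns')
by_keyword = df_kw.reindex(columns=new_cols)

print(by_axis.equals(by_keyword))

# index= and columns= can be combined to reindex both axes in one call
both = df_kw.reindex(index=['B', 'A'], columns=['b', 'a'])
print(both)
```

The keyword form can be easier to read when both axes change at once, since each list is named after the axis it applies to.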

Replacing the Missing Values

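Code #1: The missing values introduced by reindexing can be filled by passing a value to the fill_value keyword. That value is used in place of NaN for the newly added labels; NaN values that were already in the data are not changed.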

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# new column labels: 'g' and 'h' do not exist in df1
column = ['a', 'b', 'c', 'g', 'h']

# reindex the columns and fill the new ones with 1.5 instead of NaN
print(df1.reindex(column, axis='columns', fill_value=1.5))

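Output (the values are random and differ on every run): columns a, b and c keep their values, and the new columns g and h contain 1.5 in every row.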

Code #2: Replacing the missing data with a string.

Python3

# import the pandas and numpy modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame from a 5x5 array of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# new column labels: 'g' and 'h' do not exist in df1
column = ['a', 'b', 'c', 'g', 'h']

# reindex the columns and fill the new ones with a string instead of NaN
print(df1.reindex(column, axis='columns', fill_value='data missing'))

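Output (the values are random and differ on every run): columns a, b and c keep their values, and the new columns g and h contain the string 'data missing' in every row.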
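
One detail worth checking is which values fill_value actually touches. The following sketch, using a hypothetical frame `df_nan` that already contains a NaN, suggests that fill_value applies only to the labels added by reindexing; the names and data are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a NaN that exists before reindexing
df_nan = pd.DataFrame({'a': [1.0, np.nan], 'b': [3.0, 4.0]},
                      index=['A', 'B'])

# 'g' is a new column, so it is filled with 0.0
filled = df_nan.reindex(['a', 'b', 'g'], axis='columns', fill_value=0.0)

print(filled)

# The NaN that was already in column 'a' is still there
print(filled['a'].isna())
```

To replace NaN values that were already present in the data, a separate call such as fillna() would be needed.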