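Reindexing in Pandas DataFrame
Reindexing in Pandas changes the row and column labels that a DataFrame is aligned to. An index is the set of labels attached to a pandas Series or to the rows and columns of a pandas DataFrame. Let's see how to reindex the columns and rows of a Pandas DataFrame.
Reindexing Rows
One or more rows can be reindexed with the reindex() method. Labels in the new index that do not exist in the DataFrame are assigned NaN by default.
Example #1: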
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
print(df1)
# reorder the existing rows
print('\n\nDataframe after reindexing rows: \n',
      df1.reindex(['B', 'D', 'A', 'C', 'E']))
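Output: the same DataFrame with its rows in the order B, D, A, C, E. The values are random, so they differ on every run.
Example #2: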
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
# create the new index for rows ('U' and 'Z' are not in the original index)
new_index = ['U', 'A', 'B', 'C', 'Z']
print(df1.reindex(new_index))
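Output: rows A, B and C keep their values, while the new rows U and Z contain NaN in every column.
Reindexing Columns Using the axis Keyword
One or more columns can be reindexed by calling the reindex() method and specifying the axis to reindex. Labels in the new index that do not exist in the DataFrame are assigned NaN by default.
Example #1: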
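The row behaviour described above can be checked with a small reproducible sketch. It uses fixed integer values instead of random ones, and the names df and reindexed are chosen only for this illustration. It also shows that reindex() returns a new DataFrame and leaves the original untouched.

```python
import numpy as np
import pandas as pd

# Fixed values (instead of random ones) so the result is reproducible
df = pd.DataFrame(
    np.arange(6).reshape(3, 2),
    index=['A', 'B', 'C'],
    columns=['a', 'b'],
)

# 'Z' is not in the original index, so its row is filled with NaN
reindexed = df.reindex(['C', 'A', 'Z'])

print(reindexed)
# The original DataFrame keeps its own index
print(df.index.tolist())
```

Because NaN is a floating-point value, the integer columns in the result are converted to floats.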
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
# create the new index for columns (same labels, new order)
column = ['e', 'a', 'b', 'c', 'd']
print(df1.reindex(column, axis='columns'))
Output: the same data with the columns in the order e, a, b, c, d.
Example #2:
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
# create the new index for columns ('g' and 'h' are not in the original)
column = ['a', 'b', 'c', 'g', 'h']
print(df1.reindex(column, axis='columns'))
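Output: columns a, b and c keep their values, while the new columns g and h contain NaN in every row.
Replacing Missing Values
Code #1: The missing values that reindexing introduces can be filled by passing a value to the fill_value keyword. That value is used in place of NaN.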
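As a complement to the axis keyword, reindex() also accepts the index and columns keywords. The sketch below uses fixed values and illustrative names (by_axis, by_keyword, both). It shows that columns= gives the same result as axis='columns', and that rows and columns can be reindexed in one call.

```python
import numpy as np
import pandas as pd

# Fixed values so the result is reproducible
df = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=['A', 'B', 'C'],
    columns=['a', 'b', 'c'],
)

new_columns = ['c', 'a', 'g']

# Two equivalent ways of reindexing the columns
by_axis = df.reindex(new_columns, axis='columns')
by_keyword = df.reindex(columns=new_columns)

# Rows and columns reindexed in a single call
both = df.reindex(index=['B', 'A'], columns=new_columns)

print(by_axis.equals(by_keyword))
print(both)
```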
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
# create the new index for columns and fill the new columns with 1.5
column = ['a', 'b', 'c', 'g', 'h']
print(df1.reindex(column, axis='columns', fill_value=1.5))
Output: columns g and h contain 1.5 in every row.
Code #2: Replacing the missing data with a string.
Python3
# import the pandas and numpy modules
import pandas as pd
import numpy as np
column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)
# create the new index for columns and fill the new columns with a string
column = ['a', 'b', 'c', 'g', 'h']
print(df1.reindex(column, axis='columns', fill_value='data missing'))
Output: columns g and h contain the string 'data missing' in every row.
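One detail worth checking is which NaN values fill_value actually touches. The sketch below uses a small hand-made DataFrame that already contains one NaN (the data and variable names are made up for this illustration). It suggests that fill_value only fills the entries created by reindexing, while fillna() is the method to use when NaN values that were already in the data should be replaced as well.

```python
import numpy as np
import pandas as pd

# A DataFrame that already contains one NaN before reindexing
df = pd.DataFrame(
    {'a': [1.0, np.nan], 'b': [3.0, 4.0]},
    index=['A', 'B'],
)

# fill_value is used only for the new column 'g'
filled = df.reindex(['a', 'b', 'g'], axis='columns', fill_value=0.0)
print(filled)

# fillna() replaces every remaining NaN, including the pre-existing one
print(filled.fillna(0.0))
```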