Renaming Specific Columns in Pandas
Given a pandas data frame, let's look at several ways to rename specific columns.
First, let's create a data frame:
Python3
# import pandas package
import pandas as pd
# defining a dictionary
d = {"Name": ["John", "Mary", "Helen"],
     "Marks": [95, 75, 99],
     "Roll No": [12, 21, 9]}
# creating the pandas data frame
df = pd.DataFrame(d)
# displaying the data frame (in a notebook; use print(df) in a script)
df
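Method 1: Using DataFrame.rename().
This is one way to rename the required columns in Pandas. It lets us specify the columns to change as a dictionary whose keys are the current column names and whose values are the new names.
Example 1: Renaming a single column.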
Python3
# import pandas package
import pandas as pd
# defining a dictionary
d = {"Name": ["John", "Mary", "Helen"],
     "Marks": [95, 75, 99],
     "Roll No": [12, 21, 9]}
# creating the pandas data frame
df = pd.DataFrame(d)
# displaying the columns
# before renaming
print(df.columns)
# renaming the column "Name" to "Names"
df.rename(columns={"Name": "Names"},
          inplace=True)
# displaying the columns after renaming
print(df.columns)
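The example above changes df in place. As a minimal sketch of the alternative, assuming the same sample data, calling rename() without inplace=True returns a new data frame and leaves the original untouched. The variable name renamed is only illustrative.

```python
import pandas as pd

# same sample data as in the example above
df = pd.DataFrame({"Name": ["John", "Mary", "Helen"],
                   "Marks": [95, 75, 99],
                   "Roll No": [12, 21, 9]})

# without inplace=True, rename() returns a new data frame
renamed = df.rename(columns={"Name": "Names"})

# the original keeps its column names; the copy has the new one
print(list(df.columns))
print(list(renamed.columns))
```

Assigning the result back (df = df.rename(...)) has the same effect as inplace=True for most purposes and keeps the call chainable.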
Example 2: Renaming multiple columns.
Python3
# import pandas package
import pandas as pd
# defining a dictionary
d = {"Name": ["John", "Mary", "Helen"],
     "Marks": [95, 75, 99],
     "Roll No": [12, 21, 9]}
# creating the pandas dataframe
df = pd.DataFrame(d)
# displaying the columns before renaming
print(df.columns)
# renaming the columns
df.rename({"Name": "Student Name",
           "Marks": "Marks Obtained",
           "Roll No": "Roll Number"},
          axis="columns", inplace=True)
# displaying the columns after renaming
print(df.columns)
Example 3: Passing a lambda function to rename columns.
Python3
# using the same modified dataframe
# df from Renaming Multiple Columns
# this adds ':' at the end
# of each column name
df = df.rename(columns = lambda x: x+':')
# printing the columns
print(df.columns)
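A lambda function is a small anonymous function that can take any number of arguments but can contain only one expression. We can use one when we have to modify all the columns at once. This is useful when the number of columns is large and renaming them with a list or a dictionary would not be easy (a lot of code, phew!). In the example above, we used a lambda function to append a colon (':') to the end of each column name.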
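Any callable that maps an old name to a new one can be passed here, not only a lambda. The sketch below assumes the sample data frame from this article and passes a small named function. to_snake_case is a hypothetical helper introduced for illustration; it is not part of pandas.

```python
import pandas as pd


def to_snake_case(name):
    """Hypothetical helper: lower-case a name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


df = pd.DataFrame({"Name": ["John", "Mary", "Helen"],
                   "Marks": [95, 75, 99],
                   "Roll No": [12, 21, 9]})

# rename() calls the function once for every column name
df = df.rename(columns=to_snake_case)
print(list(df.columns))
```

A named function is easier to test and reuse than a lambda once the renaming rule grows beyond a single expression.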
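Method 2: Using the values attribute.
We can use the values attribute of df.columns to reach the column we want to rename and change it directly. Note that this writes into the array underlying the column index rather than going through pandas, so pandas' internal lookups may not pick up the new name, and it may not work in every pandas version. rename() is the safer choice.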
Python3
# using the same modified dataframe
# df from Renaming Multiple Columns
# Renaming the third column
df.columns.values[2] = "Roll Number"
# printing the columns
print(df.columns)
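Because writing into df.columns.values bypasses the Index object, a more conservative way to rename a column by position is to look up its current name and pass that to rename(). This is a sketch under the assumption that column names are unique; the sample data matches the data frame used in this article.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Mary", "Helen"],
                   "Marks": [95, 75, 99],
                   "Roll No": [12, 21, 9]})

# look up the current name of the third column (position 2)
old_name = df.columns[2]

# rename through the public API instead of mutating the index array
df = df.rename(columns={old_name: "Roll Number"})
print(list(df.columns))

# the column can be selected by its new name
print(df["Roll Number"].tolist())
```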
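Method 3: Using a new list of column names.
We pass the updated column names as a list to rename the columns. The list must have the same length as the number of columns in the data frame; otherwise, a ValueError is raised.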
Python3
# Creating a list of new columns
df_cols = ["Student Name",
           "Marks Obtained",
           "Roll Number"]
# printing the columns
# before renaming
print(df.columns)
# Renaming the columns
df.columns = df_cols
# printing the columns
# after renaming
print(df.columns)
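To illustrate the length requirement, here is a small sketch assuming the three-column sample data frame from this article: assigning a list with only two names raises a ValueError, and the existing column names stay as they were. The names too_short and error_message are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Mary", "Helen"],
                   "Marks": [95, 75, 99],
                   "Roll No": [12, 21, 9]})

# only two names for three columns
too_short = ["Student Name", "Marks Obtained"]

error_message = None
try:
    df.columns = too_short
except ValueError as err:
    # pandas rejects the assignment because the lengths differ
    error_message = str(err)
    print("Renaming failed:", error_message)

# the original column names are still in place
print(list(df.columns))
```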
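Method 4: Using DataFrame.columns.str.replace().
Suppose a Pandas data frame has a large number of columns, say close to 100, and we want to replace the spaces in all column names (where present) with underscores. Supplying a list or a dictionary to rename every column would not be easy, so we use the following approach: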
Python3
# printing the column
# names before renaming
print(df.columns)
# Replacing the space in column
# names by an underscore
df.columns = df.columns.str.replace(' ', '_')
# printing the column names
# after renaming
print(df.columns)
In addition, other string methods such as str.lower can be used, for example to make all column names lowercase.
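The string methods available through df.columns.str can be chained. A minimal sketch, assuming column names with mixed case and spaces like those used in this article:

```python
import pandas as pd

df = pd.DataFrame({"Student Name": ["John", "Mary", "Helen"],
                   "Marks Obtained": [95, 75, 99],
                   "Roll Number": [12, 21, 9]})

# lower-case every name, then replace spaces with underscores;
# regex=False treats the pattern as a literal string
df.columns = df.columns.str.lower().str.replace(" ", "_", regex=False)
print(list(df.columns))
```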
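Note: Suppose a column name is present in the dictionary supplied for renaming but does not exist in the original data frame. By default, the errors parameter of rename() is 'ignore', so no error is shown and the columns that do exist are renamed as instructed. If we instead set errors to 'raise', a KeyError is raised stating that the column is not found in the original data frame.
Here is an example of each case:
Example 1: No error is raised, because errors defaults to 'ignore'.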
Python3
# NO ERROR IS RAISED
# import pandas package
import pandas as pd
# defining a dictionary
d = {"A": [1, 2, 3],
     "B": [4, 5, 6]}
# creating the pandas dataframe
df = pd.DataFrame(d)
# displaying the columns before renaming
print(df.columns)
# renaming the columns
# column "C" is not in
# the original dataframe
# errors parameter is
# set to 'ignore' by default
df.rename(columns={"A": "a", "B": "b",
                   "C": "c"},
          inplace=True)
# displaying the columns
# after renaming
print(df.columns)
Example 2: Setting the errors parameter to 'raise'. An error is raised (column C does not exist in the original data frame).
Python3
# ERROR IS RAISED
# import pandas package
import pandas as pd
# defining a dictionary
d = {"A": [1, 2, 3],
     "B": [4, 5, 6]}
# creating the pandas dataframe
df = pd.DataFrame(d)
# displaying the columns
# before renaming
print(df.columns)
# renaming the columns
# column "C" is not in the
# original dataframe; setting
# the errors parameter to 'raise'
df.rename(columns={"A": "a", "B": "b",
                   "C": "c"},
          inplace=True, errors='raise')
# not reached: the rename above
# raises a KeyError
print(df.columns)
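In a script, the error can be caught instead of stopping the program. The sketch below assumes the same two-column data frame and catches the KeyError that rename() raises with errors='raise'. It assigns the result back rather than using inplace=True, so df keeps its original columns when the call fails. The names mapping and caught are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3],
                   "B": [4, 5, 6]})

# "C" is not a column of df
mapping = {"A": "a", "B": "b", "C": "c"}

caught = None
try:
    df = df.rename(columns=mapping, errors="raise")
except KeyError as err:
    # the assignment never happens, so df is unchanged
    caught = err
    print("Rename failed:", err)

print(list(df.columns))
```

Catching the error this way makes a typo in the mapping visible early, which the default errors='ignore' would silently skip.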