Python | Pandas DataFrame.astype()

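Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.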

The DataFrame.astype() method casts a pandas object to a specified dtype. It can also convert any suitable existing column to the categorical type.
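As a minimal sketch of these two uses, the snippet below casts a whole DataFrame to one dtype and converts a string column to the categorical type. The data and the variable names (`df_float`, `teams_cat`) are hypothetical and do not come from the CSV file used later in the article.

```python
import pandas as pd

# Hypothetical sample data, not taken from the article's CSV file
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Cast every column of the DataFrame to float64
df_float = df.astype("float64")
print(df_float.dtypes)

# Cast a column of strings to the categorical dtype
teams = pd.Series(["Celtics", "Nets", "Celtics"])
teams_cat = teams.astype("category")
print(teams_cat.dtype)
print(list(teams_cat.cat.categories))
```

astype() returns a new object here, so `df` and `teams` keep their original dtypes.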

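DataFrame.astype() comes in handy when we want to cast a particular column from one data type to another. We can also pass a Python dictionary to change the types of several columns at once: each key is a column name, and each value is the new data type for that column.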
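The dictionary form can be sketched on a small in-memory frame, which runs without any CSV file. The rows are made up for illustration; only the column names mirror the ones used in the examples further down.

```python
import pandas as pd

# Hypothetical rows; only the column names follow the article's examples
df = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Age": [25.0, 31.0],
    "Weight": [180.0, 235.0],
})

# Keys are column names, values are the target dtypes.
# Columns that are not listed (Weight) keep their current dtype.
converted = df.astype({"Name": "category", "Age": "int64"})
print(converted.dtypes)
```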

Syntax: DataFrame.astype(dtype, copy=True, errors='raise', **kwargs)

Parameters:
dtype : Use a numpy.dtype or Python type to cast entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type to cast one or more of the DataFrame’s columns to column-specific types.
copy : Return a copy when copy=True (be very careful setting copy=False as changes to values then may propagate to other pandas objects).

errors : Control raising of exceptions on invalid data for the provided dtype.
    raise : allow exceptions to be raised
    ignore : suppress exceptions. On error, return the original object
kwargs : keyword arguments to pass on to the constructor

Returns: casted : same type as the caller
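To illustrate the default errors='raise' behaviour described above, here is a small sketch with made-up data. It assumes the failed cast surfaces as a ValueError (or a subclass of it), and the `error_message` variable exists only for this illustration.

```python
import pandas as pd

# Hypothetical data: the last value cannot be parsed as an integer
s = pd.Series(["1", "2", "abc"])

error_message = None
try:
    # errors='raise' is the default, so the invalid value raises
    s.astype("int64")
except ValueError as exc:
    error_message = str(exc)
    print("Cast failed:", error_message)

# One possible workaround: keep only the digit strings, then cast
clean = s[s.str.isdigit()].astype("int64")
print(clean.tolist())
```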


The examples below read a CSV file named nba.csv; the original article links to the file.

Example #1: Convert the data type of the Weight column.

# importing pandas as pd
import pandas as pd
  
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
  
# Printing the first 10 rows of 
# the data frame for visualization
  
df[:10]

Because the data contains some NaN values, and a NaN cannot be cast to int64, we drop every row that contains a NaN value to avoid errors.
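The reason for dropping those rows can be seen in a short sketch on made-up data: casting a float column that contains NaN to int64 raises, while the same cast succeeds once the missing values are removed. The last step uses the pandas nullable "Int64" dtype as an alternative that keeps the missing entry; the article itself does not use it.

```python
import numpy as np
import pandas as pd

# Hypothetical weights with one missing value
weights = pd.Series([180.0, np.nan, 235.0], name="Weight")

try:
    weights.astype("int64")
except ValueError as exc:
    print("Cast failed:", exc)

# Dropping the missing value first makes the cast succeed
as_int = weights.dropna().astype("int64")
print(as_int.tolist())

# Alternative (not used in the article): the nullable integer
# dtype "Int64" keeps the row and marks the value as missing
nullable = weights.astype("Int64")
print(nullable)
```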

# drop all those rows which 
# have any 'nan' value in it.
df.dropna(inplace = True)

# Find the type of the values in the Weight column.
# .iloc[0] picks the first remaining row by position, so it still
# works if the row labelled 0 was dropped above.
before = type(df.Weight.iloc[0])
  
# Convert the column to 'int64'
df.Weight = df.Weight.astype('int64')
  
# Find the type of the values after casting
after = type(df.Weight.iloc[0])
  
# Print the type before and after the cast
print(before)
print(after)

Output:

# print the data frame and see
# what it looks like after the change
df

Example #2: Change the data type of more than one column at once

Change the Name column to the categorical type and the Age column to int64.

# importing pandas as pd
import pandas as pd
  
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
  
# Drop the rows with 'nan' values
df = df.dropna()
  
# print the existing data type of each column
df.info()

Output:

Now let's change the data type of both columns at once.

# Passed a dictionary to astype() function 
df = df.astype({"Name":'category', "Age":'int64'})
  
# Now print the data type 
# of all columns after change
df.info()

Output:

# print the data frame
# too after the change
df

Output: