Selecting columns with specific data types in a Pandas DataFrame

In this article, we will see how to select columns with specific data types from a DataFrame. This can be done with the DataFrame.select_dtypes() method of the pandas module.

Syntax: DataFrame.select_dtypes(include=None, exclude=None)
Parameters:
include, exclude: A selection of dtypes or strings to be included or excluded. At least one of these parameters must be supplied.
Returns: The subset of the frame that includes the dtypes in include and excludes the dtypes in exclude.
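
To make the include and exclude parameters concrete, here is a minimal sketch on a small hand-made DataFrame. The column names and values are hypothetical and are not part of the dataset used in this article; the dtypes are set explicitly so that the selections are predictable.

```python
import pandas as pd

# Hypothetical toy data with one column per dtype
df = pd.DataFrame({
    "id": pd.Series([1, 2, 3], dtype="int64"),
    "price": pd.Series([9.5, 12.0, 7.25], dtype="float64"),
    "name": pd.Series(["a", "b", "c"], dtype="object"),
    "in_stock": pd.Series([True, False, True], dtype="bool"),
})

# include keeps only the columns whose dtype is listed
only_ints = df.select_dtypes(include=["int64"])

# exclude drops the listed dtypes and keeps every other column
no_objects = df.select_dtypes(exclude=["object"])

# calling the method with neither parameter raises a ValueError
raised = False
try:
    df.select_dtypes()
except ValueError:
    raised = True

print(list(only_ints.columns))
print(list(no_objects.columns))
print("ValueError raised:", raised)
```

The result is always a DataFrame, and the selected columns keep their original order.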

Step-by-step approach:

First, import the module and load the dataset.

Python3

# import required module
import pandas as pd
  
# assign dataset
df = pd.read_csv("train.csv")

Next, we use the DataFrame.info() method to find out which data types are present in the dataset.

Python3

# display description
# of the dataset
df.info()

Output:
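
DataFrame.info() prints its summary rather than returning it. When the dtypes are needed as data, the DataFrame.dtypes attribute can be read instead. The sketch below uses a hypothetical toy DataFrame, not train.csv, and counts how many columns each dtype has.

```python
from collections import Counter

import pandas as pd

# Hypothetical toy data; dtypes are set explicitly
df = pd.DataFrame({
    "age": pd.Series([22, 38, 26], dtype="int64"),
    "fare": pd.Series([7.25, 71.25, 8.0], dtype="float64"),
    "height": pd.Series([1.70, 1.65, 1.80], dtype="float64"),
    "city": pd.Series(["A", "B", "C"], dtype="object"),
})

# map each column name to the name of its dtype
dtype_names = {column: str(dtype) for column, dtype in df.dtypes.items()}

# count how many columns share each dtype
dtype_counts = Counter(dtype_names.values())

print(dtype_names)
print(dtype_counts)
```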

Now, we use DataFrame.select_dtypes() to select the columns of each specific data type.

Python3

# store columns with specific data type
integer_columns = df.select_dtypes(include=['int64']).columns
float_columns = df.select_dtypes(include=['float64']).columns
object_columns = df.select_dtypes(include=['object']).columns
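
One thing to keep in mind with exact dtype names such as 'int64' is that they match only that exact width. As a hedged illustration on hypothetical data, the sketch below shows an int32 column being missed by include=['int64'] but picked up by the generic 'number' selector, which covers numeric columns of any width.

```python
import pandas as pd

# Hypothetical toy data with integers of two different widths
df = pd.DataFrame({
    "small_int": pd.Series([1, 2, 3], dtype="int32"),
    "big_int": pd.Series([10, 20, 30], dtype="int64"),
    "ratio": pd.Series([0.1, 0.2, 0.3], dtype="float64"),
    "flag": pd.Series([True, False, True], dtype="bool"),
})

# exact match: only the int64 column is selected
exact_int64 = df.select_dtypes(include=["int64"]).columns.tolist()

# generic selector: every numeric column, regardless of width
all_numeric = df.select_dtypes(include=["number"]).columns.tolist()

print(exact_int64)
print(all_numeric)
```

Boolean columns are not treated as numeric by the 'number' selector, so "flag" is left out of both results.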

Finally, display the columns that have each specific data type.

Python3

# display columns
print('\nint64 columns:\n', integer_columns)
print('\nfloat64 columns:\n', float_columns)
print('\nobject columns:\n', object_columns)

Output:
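
The .columns attribute used above returns a pandas Index of column names, which can be converted to a plain list or used to index the DataFrame again. The following sketch, on hypothetical data, computes the mean of only the float64 columns.

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({
    "age": pd.Series([22, 38, 26], dtype="int64"),
    "fare": pd.Series([1.5, 2.5, 3.5], dtype="float64"),
    "city": pd.Series(["A", "B", "C"], dtype="object"),
})

# Index of the float64 column names
float_columns = df.select_dtypes(include=["float64"]).columns

# the same names as a plain Python list
float_column_names = float_columns.tolist()

# use the Index to restrict an operation to those columns
float_means = df[float_columns].mean()

print(float_column_names)
print(float_means)
```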

Below is the complete program based on the above approach:

Python3

# import required module
import pandas as pd
  
# assign dataset
df = pd.read_csv("train.csv")
  
# store columns with specific data type
integer_columns = df.select_dtypes(include=['int64']).columns
float_columns = df.select_dtypes(include=['float64']).columns
object_columns = df.select_dtypes(include=['object']).columns
  
# display columns
print('\nint64 columns:\n', integer_columns)
print('\nfloat64 columns:\n', float_columns)
print('\nobject columns:\n', object_columns)

Output:
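
The program above depends on a local train.csv file. As a self-contained variation, the sketch below reads a few rows of made-up CSV text from memory and, instead of calling select_dtypes() once per dtype, groups all column names by dtype in a single pass. The CSV contents and the helper name columns_by_dtype are assumptions made for illustration. Text columns are typically reported as object, though the exact name may differ between pandas versions.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk
csv_text = """id,score,weight,label
1,0.5,61.2,x
2,0.7,72.8,y
3,0.9,58.4,z
"""
df = pd.read_csv(io.StringIO(csv_text))


def columns_by_dtype(frame):
    """Return a dict mapping each dtype name to its list of column names."""
    groups = {}
    for column, dtype in frame.dtypes.items():
        groups.setdefault(str(dtype), []).append(column)
    return groups


grouped = columns_by_dtype(df)
for dtype_name, columns in grouped.items():
    print(dtype_name, columns)
```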

Example:

Here, we will extract columns from the following dataset:

Python3

# import required module
import pandas as pd
from vega_datasets import data
  
# assign dataset
df = data.seattle_weather()
  
# display dataset
df.sample(10)

Output:

Now, we will display all columns whose data type is float64.

Python3

# import required module
import pandas as pd
from vega_datasets import data
  
# assign dataset
df = data.seattle_weather()
  
# display description
# of dataset
df.info()
  
# store columns with specific data type
columns = df.select_dtypes(include=['float64']).columns
  
# display columns
print('\nColumns:\n', columns)

Output:
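
The same method can also select date columns. The sketch below does not load the vega_datasets package; it builds a small hypothetical weather-style DataFrame and uses the 'datetime' selector, then uses exclude to list everything that is not float64.

```python
import pandas as pd

# Hypothetical weather-style data; values are made up
df = pd.DataFrame({
    "date": pd.to_datetime(["2012-01-01", "2012-01-02", "2012-01-03"]),
    "precipitation": pd.Series([0.0, 10.9, 0.8], dtype="float64"),
    "temp_max": pd.Series([12.8, 10.6, 11.7], dtype="float64"),
    "weather": pd.Series(["drizzle", "rain", "rain"], dtype="object"),
})

# 'datetime' matches datetime64 columns
datetime_columns = df.select_dtypes(include=["datetime"]).columns.tolist()

# everything that is not float64
non_float_columns = df.select_dtypes(exclude=["float64"]).columns.tolist()

print(datetime_columns)
print(non_float_columns)
```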