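Converting a Column from String to Datetime Format in a Pandas DataFrame
Time series data comes up frequently when working with Pandas, which is one of the most useful Python tools for handling it.
Let's see how to convert a DataFrame column of strings (in mm/dd/yyyy format) to datetime format. While dates are stored as strings, we cannot perform time-series operations on them, so the column must first be converted to a datetime type.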
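To illustrate why the conversion matters, here is a minimal sketch showing that date strings sort lexicographically rather than chronologically, while parsed dates sort correctly. The variable names are illustrative, and the explicit `format='%m/%d/%Y'` is an assumption about how these sample strings are laid out.

```python
import pandas as pd

# Sample dates stored as strings in mm/dd/yyyy layout
dates = pd.Series(['11/8/2011', '04/23/2008', '10/2/2019'])

# Sorting strings compares characters, not calendar dates
sorted_as_text = dates.sort_values().tolist()
print(sorted_as_text)

# Parsing first gives true chronological order
parsed = pd.to_datetime(dates, format='%m/%d/%Y')
sorted_as_dates = parsed.sort_values().dt.year.tolist()
print(sorted_as_dates)
```

In the text sort, the 2019 date lands before the 2011 date because '10' compares lower than '11'; after parsing, the years come out in calendar order.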
Code #1: Convert a Pandas DataFrame column from string to datetime format using the pd.to_datetime() function.
Python3
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
                   'Event': ['Music', 'Poetry', 'Theatre'],
                   'Cost': [10000, 5000, 15000]})
# Print the dataframe
print(df)
# Now we will check the data type
# of the 'Date' column
df.info()
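The output of df.info() shows that the 'Date' column has dtype object, which is how pandas stores strings. Next, we use the pd.to_datetime() function to convert it to datetime format.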
Python3
# convert the 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date'])
# Check the dtype of the 'Date' column
df.info()
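The output of df.info() shows that the 'Date' column has been converted to a datetime dtype.
Code #2: Convert a Pandas DataFrame column from string to datetime format using the DataFrame.astype() function.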
Python3
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
                   'Event': ['Music', 'Poetry', 'Theatre'],
                   'Cost': [10000, 5000, 15000]})
# Print the dataframe
print(df)
# Now we will check the data type
# of the 'Date' column
df.info()
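The output of df.info() shows that the 'Date' column has dtype object, which is how pandas stores strings. Next, we use the DataFrame.astype() function to convert it to datetime format.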
Python3
# convert the 'Date' column to datetime format
df['Date'] = df['Date'].astype('datetime64[ns]')
# Check the dtype of the 'Date' column
df.info()
The output of df.info() shows that the 'Date' column has been converted to a datetime dtype.
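Both approaches above leave pandas to work out the date layout on its own. As a hedged sketch, the layout can instead be stated explicitly with the `format` parameter, and `errors='coerce'` can be used so that unparseable values become NaT instead of raising. The sample series here, including the deliberately invalid last entry, is made up for illustration.

```python
import pandas as pd

# The last value is deliberately invalid to show how bad input is handled
raw = pd.Series(['11/8/2011', '04/23/2008', 'not a date'])

# State the layout explicitly; errors='coerce' turns unparseable values into NaT
parsed = pd.to_datetime(raw, format='%m/%d/%Y', errors='coerce')

print(parsed)
print(parsed.isna().tolist())
```

Stating the format removes the month/day ambiguity in strings such as '11/8/2011', and checking `isna()` afterwards makes it easy to find rows that failed to parse.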
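Code #3: If the DataFrame column holds strings in 'yymmdd' format, pass format='%y%m%d' to pd.to_datetime() to parse them; the two-digit years are expanded to four-digit years.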
Python3
# importing pandas library
import pandas as pd
# Initializing the nested list with Data set
player_list = [['200712', 50000], ['200714', 51000], ['200716', 51500],
               ['200719', 53000], ['200721', 54000],
               ['200724', 55000], ['200729', 57000]]
# creating a pandas dataframe
df = pd.DataFrame(player_list,columns=['Dates','Patients'])
# printing dataframe
print(df)
print()
# checking the type
print(df.dtypes)
![](https://mangodoc.oss-cn-beijing.aliyuncs.com/geek8geeks/Convert_the_column_type_from_string_to_datetime_format_in_Pandas_dataframe_6.png)
Python3
# converting the string to datetime format
df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')
# printing dataframe
print(df)
print()
print(df.dtypes)
![](https://mangodoc.oss-cn-beijing.aliyuncs.com/geek8geeks/Convert_the_column_type_from_string_to_datetime_format_in_Pandas_dataframe_7.png)
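In the example above, the 'Dates' column changes from dtype object to datetime64[ns]. The 'yymmdd' strings are parsed with format='%y%m%d', and the resulting dates are displayed with four-digit years as yyyy-mm-dd.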
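If the goal is text in 'yyyymmdd' layout rather than a datetime column, one possible follow-up step is to format the parsed values back into strings with `Series.dt.strftime`. This is a minimal sketch on a shortened, illustrative series; the variable names are not from the original.

```python
import pandas as pd

# Two-digit-year strings in yymmdd layout
short_dates = pd.Series(['200712', '200714'])

# Parse the strings into real datetime values
parsed = pd.to_datetime(short_dates, format='%y%m%d')

# Render the datetimes back to text in yyyymmdd layout
as_text = parsed.dt.strftime('%Y%m%d')
print(as_text.tolist())
```

Note that the result of `strftime` is a column of strings again, so it suits display or export; keep the datetime column for any date arithmetic.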
Code #4: Convert multiple columns of 'yyyymmdd' strings to datetime format using pandas.to_datetime().
Python3
# importing pandas library
import pandas as pd
# Initializing the nested list with Data set
player_list = [['20200712', 50000, '20200812'],
               ['20200714', 51000, '20200814'],
               ['20200716', 51500, '20200816'],
               ['20200719', 53000, '20200819'],
               ['20200721', 54000, '20200821'],
               ['20200724', 55000, '20200824'],
               ['20200729', 57000, '20200824']]
# creating a pandas dataframe
df = pd.DataFrame(
    player_list, columns=['Treatment_start',
                          'No.of Patients',
                          'Treatment_end'])
# printing dataframe
print(df)
print()
# checking the type
print(df.dtypes)
![](https://mangodoc.oss-cn-beijing.aliyuncs.com/geek8geeks/Convert_the_column_type_from_string_to_datetime_format_in_Pandas_dataframe_8.png)
Python3
# converting the string to datetime
# format in multiple columns
df['Treatment_start'] = pd.to_datetime(
    df['Treatment_start'],
    format='%Y%m%d'
)
df['Treatment_end'] = pd.to_datetime(
    df['Treatment_end'],
    format='%Y%m%d'
)
# printing dataframe
print(df)
print()
print(df.dtypes)
![](https://mangodoc.oss-cn-beijing.aliyuncs.com/geek8geeks/Convert_the_column_type_from_string_to_datetime_format_in_Pandas_dataframe_9.png)
In the example above, the 'Treatment_start' and 'Treatment_end' columns change from dtype object to datetime64[ns].
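When several columns share the same layout, the per-column calls can be condensed. The sketch below, on a shortened version of the data, applies pd.to_datetime to a list of columns with DataFrame.apply and then subtracts the two datetime columns, which is the kind of operation that only works after conversion. The `date_cols` list and the `Duration_days` column are hypothetical names introduced for illustration.

```python
import pandas as pd

# Shortened version of the treatment data, dates as yyyymmdd strings
df = pd.DataFrame({
    'Treatment_start': ['20200712', '20200714'],
    'Treatment_end': ['20200812', '20200814'],
})

# Convert every listed column in one step; format is forwarded to pd.to_datetime
date_cols = ['Treatment_start', 'Treatment_end']
df[date_cols] = df[date_cols].apply(pd.to_datetime, format='%Y%m%d')

# Datetime columns support arithmetic; 'Duration_days' is a hypothetical column
df['Duration_days'] = (df['Treatment_end'] - df['Treatment_start']).dt.days
print(df)
```

Subtracting two datetime columns yields timedeltas, and `.dt.days` extracts the whole-day count from them.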