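Different Ways to Create a Pandas DataFrame
A Pandas DataFrame is a two-dimensional labeled data structure whose columns can hold different types. It is generally the most commonly used pandas object.
A Pandas DataFrame can be created in several ways. Let's go through them one by one.
Method #1: Creating a Pandas DataFrame from a list of lists.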
Python3
# Import the pandas library
import pandas as pd
# Initialize a list of lists; each inner list is one row
data = [['tom', 10], ['nick', 15], ['juli', 14]]
# Create the pandas DataFrame and name the columns
df = pd.DataFrame(data, columns=['Name', 'Age'])
# Print the DataFrame
print(df)
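As a small illustration of the point that columns may hold different types, the sketch below rebuilds the same list-of-lists DataFrame and inspects the type of each column. It also shows what happens when the `columns` argument is left out. The name `df_unnamed` is only illustrative.

```python
import pandas as pd

# Same sample rows as in Method #1
data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Each column carries its own dtype: text for Name, integers for Age
print(df.dtypes)

# Without `columns`, pandas labels the columns with integers 0, 1, ...
df_unnamed = pd.DataFrame(data)
print(list(df_unnamed.columns))
```

Passing `columns` up front is usually easier than renaming the integer labels afterwards.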
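Method #2: Creating a DataFrame from a dict of ndarrays/lists
To create a DataFrame from a dict of ndarrays/lists, all the arrays must have the same length. If an index is passed, its length must equal the length of the arrays. If no index is passed, the index defaults to range(n), where n is the array length.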
Python3
# Create a DataFrame from a dict of ndarrays/lists
# using the default integer index.
import pandas as pd
# Initialize the data as a dict of lists; the keys become column names
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
# Create the DataFrame
df = pd.DataFrame(data)
# Print the output
print(df)
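The length rules for Method #2 can be checked directly. The following is a minimal sketch, assuming NumPy is installed alongside pandas. The names `bad_data`, `length_error`, and `index_error` are illustrative, and the exact error messages may differ between pandas versions, so they are captured rather than quoted.

```python
import numpy as np
import pandas as pd

# Two equal-length NumPy arrays (sample data)
data = {
    'Name': np.array(['Tom', 'nick', 'krish', 'jack']),
    'Age': np.array([20, 21, 19, 18]),
}

# No index passed: the index defaults to 0..n-1
df_default = pd.DataFrame(data)
default_index = list(df_default.index)
print(default_index)

# Arrays of different lengths are rejected with a ValueError
bad_data = {'Name': ['Tom', 'nick', 'krish'], 'Age': [20, 21]}
try:
    pd.DataFrame(bad_data)
    length_error = None
except ValueError as exc:
    length_error = str(exc)
print(length_error)

# An index whose length differs from the arrays is rejected as well
try:
    pd.DataFrame(data, index=['r1', 'r2'])
    index_error = None
except ValueError as exc:
    index_error = str(exc)
print(index_error)
```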
Method #3: Creating an indexed DataFrame using arrays.
Python3
# Create a pandas DataFrame with an explicit
# row index from a dict of lists.
import pandas as pd
# Initialize the data as a dict of lists
data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}
# Create the DataFrame with custom row labels
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
# Print the data
print(df)
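Method #4: Creating a DataFrame from a list of dicts
A Pandas DataFrame can be created by passing a list of dictionaries as the input data. By default, the dictionary keys are used as the column names.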
Python3
# Create a pandas DataFrame from a list of dicts.
import pandas as pd
# Initialize the data as a list of dicts; each dict is one row
data = [{'a': 1, 'b': 2, 'c': 3},
        {'a': 10, 'b': 20, 'c': 30}]
# Create the DataFrame
df = pd.DataFrame(data)
# Print the data
print(df)
Another example creates a pandas DataFrame by passing a list of dictionaries together with row indices.
Python3
# Create a pandas DataFrame by passing a list of
# dictionaries together with row indices.
import pandas as pd
# Initialize the data as a list of dicts
# (the first dict has no 'a' key)
data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]
# Create the DataFrame with custom row labels
df = pd.DataFrame(data, index=['first', 'second'])
# Print the data
print(df)
Another example creates a pandas DataFrame from a list of dictionaries with both row and column indices.
Python3
# Create pandas DataFrames from a list of
# dictionaries with both row and column indices.
import pandas as pd
# Initialize the data as a list of dicts
data = [{'a': 1, 'b': 2},
        {'a': 5, 'b': 10, 'c': 20}]
# Two column labels that both match dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'],
                   columns=['a', 'b'])
# Two column labels, one of which ('b1') matches no key,
# so that column is filled with NaN
df2 = pd.DataFrame(data, index=['first', 'second'],
                   columns=['a', 'b1'])
# Print the first DataFrame
print(df1, "\n")
# Print the second DataFrame
print(df2)
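The list-of-dicts examples above depend on how pandas treats keys that are missing from some dictionaries. The sketch below makes that behavior explicit by testing for missing values instead of reading them off the printed table. The names `missing_a` and `all_missing` are illustrative.

```python
import pandas as pd

# The first dict has no 'a' key
data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]
df = pd.DataFrame(data, index=['first', 'second'])

# The columns are the union of all keys seen in the dicts
print(sorted(df.columns))

# The cell for the absent key is filled with NaN
missing_a = bool(pd.isna(df.loc['first', 'a']))
print(missing_a)

# A requested column that matches no key is NaN in every row
df2 = pd.DataFrame(data, columns=['a', 'b1'])
all_missing = bool(df2['b1'].isna().all())
print(all_missing)
```

The `columns` argument therefore acts as a selection over the dictionary keys, not as a way to rename them.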
Method #5: Creating a DataFrame using the zip() function.
Two lists can be merged with list(zip()). The resulting list of tuples is then passed to pd.DataFrame() to create the pandas DataFrame.
Python3
# Create a pandas DataFrame from two lists using zip().
import pandas as pd
# First list
Name = ['tom', 'krish', 'nick', 'juli']
# Second list
Age = [25, 30, 26, 22]
# Pair up the two lists element by element with zip()
# to get a list of tuples.
list_of_tuples = list(zip(Name, Age))
# Show the list of tuples
print(list_of_tuples)
# Convert the list of tuples into a pandas DataFrame
df = pd.DataFrame(list_of_tuples, columns=['Name', 'Age'])
# Print the data
print(df)
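One caveat of the zip() approach is worth a short sketch: zip() stops at the shortest input, so lists of unequal length lose rows silently instead of raising an error. The lists below are illustrative, with `ages` deliberately one element short.

```python
import pandas as pd

names = ['tom', 'krish', 'nick', 'juli']
ages = [25, 30, 26]  # deliberately one element short

# zip() stops at the shortest input, so 'juli' is dropped without warning
rows = list(zip(names, ages))
df = pd.DataFrame(rows, columns=['Name', 'Age'])

print(len(df))
print(df['Name'].tolist())
```

On Python 3.10 or later, `zip(names, ages, strict=True)` raises a `ValueError` for unequal lengths instead, which may be the safer choice when the lists are expected to line up.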
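Method #6: Creating a DataFrame from a dict of Series.
To create a DataFrame from a dict of Series, pass the dictionary to the DataFrame constructor. The resulting index is the union of the indexes of all the Series passed.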
Python3
# Create a pandas DataFrame from a dict of Series.
import pandas as pd
# Initialize the data as a dict of Series
d = {'one': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd']),
     'two': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd'])}
# Create the DataFrame
df = pd.DataFrame(d)
# Print the data
print(df)
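Because both Series in the example above share the same index, the union rule is not visible there. The following sketch, using illustrative values, gives the two Series different indexes so that the union and the resulting missing value can be seen.

```python
import pandas as pd

# 'one' has no entry for label 'd'
d = {'one': pd.Series([10, 20, 30], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# The row index is the union of both Series indexes
print(list(df.index))

# The label missing from 'one' is filled with NaN
d_is_missing = bool(pd.isna(df.loc['d', 'one']))
print(d_is_missing)
print(df)
```

Values are aligned by label, not by position, so the order in which the Series list their labels does not affect which values end up in the same row.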