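How to Create a Crosstab from a Dictionary in Python?
In this article, we will see how to create a crosstab from a dictionary in Python. The pandas crosstab function builds a cross-tabulation table that shows how often particular groups of data occur.
It computes a simple cross-tabulation of two (or more) factors. By default it produces a frequency table of the factors, unless an array of values and an aggregation function are passed.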
Syntax: pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)
Arguments :
- index : array-like, Series, or list of arrays/Series, Values to group by in the rows.
- columns : array-like, Series, or list of arrays/Series, Values to group by in the columns.
- values : array-like, optional, array of values to aggregate according to the factors. Requires `aggfunc` be specified.
- rownames : sequence, default None, If passed, must match number of row arrays passed.
- colnames : sequence, default None, If passed, must match number of column arrays passed.
- aggfunc : function, optional, If specified, requires `values` be specified as well.
- margins : bool, default False, Add row/column margins (subtotals).
- margins_name : str, default 'All', Name of the row/column that will contain the totals when margins is True.
- dropna : bool, default True, Do not include columns whose entries are all NaN.
- normalize : bool, {'all', 'index', 'columns'}, or {0, 1}, default False, Normalize by dividing all values by the sum of values.
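How values, aggfunc, and normalize change the result is easier to see on a small case. The following is a minimal sketch: the toy DataFrame and its column names (region, quarter, sales) are made up for illustration and are not part of this article's data.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the arguments above.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2"],
    "sales": [10, 20, 30, 40, 50],
})

# Default behaviour: count how often each (region, quarter) pair occurs.
counts = pd.crosstab(df["region"], df["quarter"])

# values + aggfunc: sum the sales within each (region, quarter) cell.
totals = pd.crosstab(df["region"], df["quarter"],
                     values=df["sales"], aggfunc="sum")

# normalize="index": express each row of counts as row-wise proportions.
shares = pd.crosstab(df["region"], df["quarter"], normalize="index")

print(counts)
print(totals)
print(shares)
```

Passing values without aggfunc (or the reverse) raises an error, which is why the two arguments appear together in the second call.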
Step-by-step implementation:
Step 1: Create the dictionary.
Python3
raw_data = {
    'Digimon': ['Kuramon', 'Pabumon', 'Punimon',
                'Botamon', 'Poyomon', 'Koromon',
                'Tanemon', 'Tsunomon', 'Tsumemon',
                'Tokomon'],
    'Stage': ['Baby', 'Baby', 'Baby', 'Baby', 'Baby',
              'In-Training', 'In-Training', 'In-Training',
              'In-Training', 'In-Training'],
    'Type': ['Free', 'Free', 'Free', 'Free', 'Free', 'Free',
             'Free', 'Free', 'Free', 'Free'],
    'Attribute': ['Neutral', 'Neutral', 'Neutral',
                  'Neutral', 'Neutral', 'Fire', 'Plant',
                  'Earth', 'Dark', 'Neutral'],
    'Memory': [2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    'Equip Slots': [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    'Lv 50 HP': [324, 424, 5343, 52, 63, 42,
                 643, 526, 42, 75],
    'Lv50 SP': [86, 75, 64, 43, 86, 64, 344,
                24, 24, 12],
    'Lv50 Atk': [86, 74, 6335, 421, 23, 36436,
                 65, 75, 86, 52]}
print(raw_data)
Output:
{'Digimon': ['Kuramon', 'Pabumon', 'Punimon', 'Botamon', 'Poyomon', 'Koromon', 'Tanemon', 'Tsunomon', 'Tsumemon', 'Tokomon'], 'Stage': ['Baby', 'Baby', 'Baby', 'Baby', 'Baby', 'In-Training', 'In-Training', 'In-Training', 'In-Training', 'In-Training'], 'Type': ['Free', 'Free', 'Free', 'Free', 'Free', 'Free', 'Free', 'Free', 'Free', 'Free'], 'Attribute': ['Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Fire', 'Plant', 'Earth', 'Dark', 'Neutral'], 'Memory': [2, 2, 2, 2, 2, 3, 3, 3, 3, 3], 'Equip Slots': [0, 0, 1, 1, 1, 1, 1, 1, 1, 1], 'Lv 50 HP': [324, 424, 5343, 52, 63, 42, 643, 526, 42, 75], 'Lv50 SP': [86, 75, 64, 43, 86, 64, 344, 24, 24, 12], 'Lv50 Atk': [86, 74, 6335, 421, 23, 36436, 65, 75, 86, 52]}
Step 2: Create a DataFrame with the pandas DataFrame constructor.
Python3
import pandas as pd

# Build the DataFrame; the columns list fixes the column order.
raw_data_df = pd.DataFrame(raw_data, columns=['Digimon', 'Stage', 'Type',
                                              'Attribute', 'Memory',
                                              'Equip Slots', 'Lv 50 HP',
                                              'Lv50 SP', 'Lv50 Atk'])
print(raw_data_df)
Step 3: Use crosstab.
The index argument accepts either a single variable or a list of variables. Passing a list adds multiple index (row) levels to the crosstab: for example, to break items down by region and quarter, pass both to index. The call below groups the rows by Attribute and Memory and uses the Digimon names as columns.
Python3
# Rows: Attribute and Memory (two index levels); columns: Digimon.
# margins=True adds the 'All' row and column holding the totals.
raw_data_fd = pd.crosstab(
    [raw_data_df['Attribute'], raw_data_df['Memory']],
    raw_data_df['Digimon'], margins=True)
print(raw_data_fd)
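For comparison with the multi-index call, a crosstab with a single row variable might look like the sketch below. It is an illustration, not the article's own code: it rebuilds only the Stage and Attribute columns from the Step 1 dictionary so that it runs on its own, and the name stage_by_attribute is chosen here.

```python
import pandas as pd

# Only the two columns needed here, copied from the Step 1 dictionary.
raw_data = {
    "Stage": ["Baby"] * 5 + ["In-Training"] * 5,
    "Attribute": ["Neutral", "Neutral", "Neutral", "Neutral", "Neutral",
                  "Fire", "Plant", "Earth", "Dark", "Neutral"],
}
raw_data_df = pd.DataFrame(raw_data)

# One row variable (Stage) against one column variable (Attribute);
# margins=True appends an "All" row and column holding the totals.
stage_by_attribute = pd.crosstab(raw_data_df["Stage"],
                                 raw_data_df["Attribute"],
                                 margins=True)
print(stage_by_attribute)
```

Each cell counts the Digimon with that Stage and Attribute, so the "All" column gives the number of Digimon per stage and the bottom-right cell gives the total number of rows.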