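The pandas.crosstab() function in Python
This function computes a simple cross-tabulation of two (or more) factors. By default it builds a frequency table of the factors; if an array of values and an aggregation function are passed, it aggregates those values instead.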
Syntax: pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)
Arguments :
- index : array-like, Series, or list of arrays/Series, Values to group by in the rows.
- columns : array-like, Series, or list of arrays/Series, Values to group by in the columns.
- values : array-like, optional, array of values to aggregate according to the factors. Requires `aggfunc` be specified.
- rownames : sequence, default None, If passed, must match number of row arrays passed.
- colnames : sequence, default None, If passed, must match number of column arrays passed.
- aggfunc : function, optional, If specified, requires `values` be specified as well.
- margins : bool, default False, Add row/column margins (subtotals).
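- margins_name : str, default 'All', Name of the row/column that will contain the totals when margins is True.
- dropna : bool, default True, Do not include columns whose entries are all NaN.
- normalize : bool, {'all', 'index', 'columns'}, or {0, 1}, default False, Normalize by dividing all values by the sum of values.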
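The relationship between `values` and `aggfunc` described above can be illustrated with a minimal sketch. The `sales` DataFrame and its column names are hypothetical data made up for this illustration; only `pandas.crosstab` itself comes from the text.

```python
import pandas as pd

# Hypothetical sample data (not from the article): one row per sale.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "product": ["pen", "ink", "pen", "pen", "ink"],
    "amount": [10, 20, 30, 40, 50],
})

# Default behaviour: count how often each (region, product) pair occurs.
counts = pd.crosstab(sales["region"], sales["product"])

# With values and aggfunc: sum `amount` inside each cell instead of counting.
totals = pd.crosstab(sales["region"], sales["product"],
                     values=sales["amount"], aggfunc="sum")

print(counts)
print(totals)
```

Passing `values` without `aggfunc` (or the reverse) raises an error, which is why the two parameters are documented as requiring each other.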
The examples below show the function in use:
Example 1:
Python3
# import the required packages
import pandas
import numpy

# create some sample data
a = numpy.array(["foo", "foo", "foo", "foo",
                 "bar", "bar", "bar", "bar",
                 "foo", "foo", "foo"],
                dtype=object)
b = numpy.array(["one", "one", "one", "two",
                 "one", "one", "one", "two",
                 "two", "two", "one"],
                dtype=object)
c = numpy.array(["dull", "dull", "shiny",
                 "dull", "dull", "shiny",
                 "shiny", "dull", "shiny",
                 "shiny", "shiny"],
                dtype=object)

# build the cross-tabulation: rows from a, columns from the (b, c) pairs
print(pandas.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c']))
Output:
b   one        two
c   dull shiny dull shiny
a
bar    1     2    1     0
foo    2     2    1     2
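The `margins` and `normalize` parameters are not exercised by the example above. The following is a minimal sketch of both, reusing the same `a` and `b` arrays; the variable names `with_totals` and `col_share` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same a and b factors as in Example 1.
a = np.array(["foo", "foo", "foo", "foo", "bar", "bar",
              "bar", "bar", "foo", "foo", "foo"], dtype=object)
b = np.array(["one", "one", "one", "two", "one", "one",
              "one", "two", "two", "two", "one"], dtype=object)

# margins=True appends an "All" row and an "All" column holding the totals.
with_totals = pd.crosstab(a, b, rownames=["a"], colnames=["b"], margins=True)

# normalize="columns" divides each cell by its column sum,
# so every column of the result adds up to 1.
col_share = pd.crosstab(a, b, rownames=["a"], colnames=["b"],
                        normalize="columns")

print(with_totals)
print(col_share)
```

The label of the totals row and column can be changed with `margins_name`; `normalize="index"` normalizes over rows and `normalize="all"` over the whole table.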
Example 2:
Python3
# import the required package
import pandas

# create two categorical factors; 'c' and 'f' are declared but never observed
foo = pandas.Categorical(['a', 'b'],
                         categories=['a', 'b', 'c'])
bar = pandas.Categorical(['d', 'e'],
                         categories=['d', 'e', 'f'])

# crosstab with dropna=True (default): unobserved categories are dropped
print(pandas.crosstab(foo, bar))

# crosstab with dropna=False: unobserved categories are kept
print(pandas.crosstab(foo, bar, dropna=False))
Output:
col_0  d  e
row_0
a      1  0
b      0  1
col_0  d  e  f
row_0
a      1  0  0
b      0  1  0
c      0  0  0
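One situation where `dropna=False` may be useful is a report that should always have the same columns, even when some category never occurs. The sketch below assumes hypothetical survey data (`group`, `answer`) invented for this illustration and inspects the resulting column labels rather than the printed table.

```python
import pandas as pd

# Hypothetical survey data: "unsure" is a valid answer that nobody gave.
group = pd.Categorical(["A", "A", "B"], categories=["A", "B"])
answer = pd.Categorical(["yes", "no", "yes"],
                        categories=["yes", "no", "unsure"])

# Default (dropna=True): only the answers that were actually observed appear.
observed_only = pd.crosstab(group, answer)

# dropna=False: every declared category gets a column, filled with zeros.
fixed_shape = pd.crosstab(group, answer, dropna=False)

print(list(observed_only.columns))
print(list(fixed_shape.columns))
```

This relies on the inputs being `Categorical`; with plain string arrays pandas has no way of knowing about categories that never appear.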