Python | pandas.Categorical()
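pandas.Categorical(values, categories=None, ordered=None, dtype=None) : represents a categorical variable. Categorical is a pandas data type that corresponds to a categorical variable in statistics. Such a variable takes one of a fixed, limited number of possible values, for example grades, gender, or blood type.
A categorical variable may also have a logical order that differs from the lexical order of its values, for example "one", "two", "three". When the categorical is ordered, sorting follows the logical order of the categories rather than the lexical order of the values.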
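The difference between lexical and logical order can be sketched as follows. This is a minimal illustration; the sample values and the variable names `lexical` and `logical` are assumptions made for the example, not part of the original article.

```python
import pandas as pd

# Hypothetical sample data: number words whose logical order
# differs from their alphabetical order.
values = ["two", "one", "three", "one"]

# Without explicit categories, the categories are inferred from the
# values and sorted lexically.
lexical = pd.Categorical(values)
print(list(lexical.categories))  # ['one', 'three', 'two']

# With explicit categories and ordered=True, the order in which the
# categories are listed becomes the logical order.
logical = pd.Categorical(values,
                         categories=["one", "two", "three"],
                         ordered=True)
sorted_values = list(logical.sort_values())
print(sorted_values)  # ['one', 'one', 'two', 'three']
```

Sorting the second categorical follows the listed category order, so "two" comes before "three" even though it sorts later alphabetically.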
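Parameters:
values : [list-like] The values of the categorical. Values not found in categories are replaced with NaN.
categories : [Index-like, unique] The unique categories for this categorical. If not given, they are inferred from the unique values and sorted if possible.
ordered : [boolean, default False] Whether the categorical is treated as ordered. If False, it is treated as unordered.
dtype : [CategoricalDtype] An instance of CategoricalDtype to use for this categorical. It cannot be combined with categories or ordered.
Raises:
ValueError : If the categories do not validate.
TypeError : If ordered=True is given explicitly without categories and the values cannot be sorted.
Returns: a Categorical object.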
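As a small sketch of the dtype parameter and the ValueError case described above: the grade labels and the names `grade_type` and `duplicate_error` are hypothetical and chosen only for illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical ordered dtype: grades from lowest ("C") to highest ("A").
grade_type = CategoricalDtype(categories=["C", "B", "A"], ordered=True)

# Passing dtype supplies both the categories and the ordered flag.
grades = pd.Categorical(["B", "A", "C", "B"], dtype=grade_type)
print(grades.ordered)           # True
print(list(grades.categories))  # ['C', 'B', 'A']

# Categories must be unique; duplicates do not validate and
# raise ValueError.
duplicate_error = None
try:
    pd.Categorical(["B", "A"], categories=["A", "A", "B"])
except ValueError as exc:
    duplicate_error = str(exc)
    print("ValueError:", duplicate_error)
```

Building a CategoricalDtype once and reusing it is one way to keep the same categories and order across several categoricals.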
Code:
Python3
# Python code explaining
# pandas.Categorical()
# importing libraries
import pandas as pd
# Categorical Series created with dtype="category"
c = pd.Series(["a", "b", "d", "a", "d"], dtype="category")
print("\nCategorical without pandas.Categorical() : \n", c)
# Categorical created from integers
c1 = pd.Categorical([1, 2, 3, 1, 2, 3])
print("\n\nc1 : ", c1)
# Categorical created from strings
c2 = pd.Categorical(['e', 'm', 'f', 'i',
                     'f', 'e', 'h', 'm'])
print("\nc2 : ", c2)
Python3
# ordered=True: the inferred categories are sorted and treated as ordered
c3 = pd.Categorical(['e', 'm', 'f', 'i',
                     'f', 'e', 'h', 'm'], ordered=True)
print("\nc3 : ", c3)
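An ordered categorical also supports min, max, and order comparisons, which an unordered one does not. The following sketch reuses the values of c3; the name `above_f` is an assumption introduced for the example.

```python
import pandas as pd

c3 = pd.Categorical(['e', 'm', 'f', 'i',
                     'f', 'e', 'h', 'm'], ordered=True)

# Categories are inferred from the values and sorted lexically.
print(list(c3.categories))  # ['e', 'f', 'h', 'i', 'm']

# min() and max() follow the category order.
print(c3.min(), c3.max())   # e m

# Comparison against a scalar that is one of the categories
# yields a boolean array.
above_f = c3 > 'f'
print(above_f.tolist())
```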
Python3
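# Mixed categories
# Mixed string and integer values cannot be sorted; an unordered categorical is still created
c4 = pd.Categorical(['a', 2, 3, 1, 2, 3])
print("\nc4 : ", c4)
# With ordered=True and no explicit categories, unsortable values raise TypeError
try:
    c5 = pd.Categorical(['a', 2, 3, 1, 2, 3], ordered=True)
    print("\nc5 : ", c5)
except TypeError as err:
    print("\nc5 raised TypeError : ", err)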
Python3
# using the categories parameter
# values that are not in categories (here 2) become NaN
c6 = pd.Categorical([1, 2, 3, 1, 2, 3], categories=[4, 1, 3, 5])
print("\nc6 : ", c6)
print("\n\nSeries : \n", pd.Series(c6))
df = pd.DataFrame({"A": [1, 2, 3, 1, 2, 3]})
df["B"] = c6
print("\n\nDataframe : \n", df)
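To see how explicit categories affect the stored data, the integer codes and missing-value mask of c6 can be inspected. This is a minimal sketch; the name `counts` is an assumption introduced for the example.

```python
import pandas as pd

c6 = pd.Categorical([1, 2, 3, 1, 2, 3], categories=[4, 1, 3, 5])

# Each value is stored as the position of its category;
# values outside the categories get the code -1.
print(c6.codes.tolist())   # [1, -1, 2, 1, -1, 2]

# Values outside the categories are treated as missing (NaN).
print(c6.isna().tolist())  # [False, True, False, False, True, False]

# value_counts() reports every category, including unused ones.
counts = c6.value_counts()
print(counts)
```

The categories 4 and 5 never occur in the data but remain part of the categorical, which is why they still appear in the counts.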