📜  The Pandas.cut() Method in Python

📅  Last modified: 2022-05-13 01:54:50.941000             🧑  Author: Mango


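The Pandas cut() function sorts the elements of an array into separate bins. It is mainly used in statistical analysis to turn continuous numeric values into discrete intervals (categories).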

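Example 1: Suppose we have an array of 10 random numbers from 1 to 100 and want to split the data into 5 bins: (1, 20], (20, 40], (40, 60], (60, 80], (80, 100]. By default each interval is open on the left and closed on the right, so a value of exactly 1 falls outside the first bin unless include_lowest=True is passed.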

Python3
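import pandas as pd
import numpy as np
 
 
# Draw 10 random integers from 1 to 100 (randint's upper bound is exclusive)
df = pd.DataFrame({'number': np.random.randint(1, 101, 10)})

# include_lowest=True keeps a value of exactly 1 inside the first bin
df['bins'] = pd.cut(x=df['number'], bins=[1, 20, 40, 60, 80, 100],
                    include_lowest=True)
print(df)
 
# Check the frequency of each bin
print(df['bins'].value_counts())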


Output:
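
Because the numbers above are drawn at random, the printed result changes on every run. As a reproducible illustration, the sketch below uses a small fixed list of values (hypothetical, chosen only for this illustration) to show how the left-open bin edge behaves and what include_lowest changes.

```python
import pandas as pd

# Hypothetical fixed values chosen for illustration (not from the example above)
values = pd.Series([1, 20, 21, 60, 99])
edges = [1, 20, 40, 60, 80, 100]

# Default behaviour: each interval is open on the left, closed on the right,
# so the value 1 does not belong to (1, 20] and becomes NaN
default_bins = pd.cut(values, bins=edges)

# include_lowest=True makes the first interval include its left edge
inclusive_bins = pd.cut(values, bins=edges, include_lowest=True)

print(default_bins.isna().tolist())    # True only for the value 1
print(inclusive_bins.isna().tolist())  # no missing values
```

With include_lowest=True, pandas shows the first interval with a slightly lowered left edge, which is how it signals that 1 itself is included.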

Example 2: We can also attach labels to the bins. Here the previous example is repeated with a label for each bin.

Python3

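import pandas as pd
import numpy as np
 
# Draw 10 random integers from 1 to 100 (randint's upper bound is exclusive)
df = pd.DataFrame({'number': np.random.randint(1, 101, 10)})

# One label per bin; include_lowest=True keeps a value of exactly 1 in '1 to 20'
df['bins'] = pd.cut(x=df['number'], bins=[1, 20, 40, 60, 80, 100],
                    labels=['1 to 20', '21 to 40', '41 to 60',
                            '61 to 80', '81 to 100'],
                    include_lowest=True)
 
print(df)
 
# Check the frequency of each bin
print(df['bins'].value_counts())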

Output:
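
Since the random input gives a different result on each run, here is a minimal reproducible sketch of the labelled case. The input values are hypothetical and chosen only for illustration; the bin edges and labels are the ones used above.

```python
import pandas as pd

# Hypothetical fixed values chosen for illustration
df = pd.DataFrame({'number': [5, 18, 33, 47, 52, 95]})

labels = ['1 to 20', '21 to 40', '41 to 60', '61 to 80', '81 to 100']
df['bins'] = pd.cut(x=df['number'], bins=[1, 20, 40, 60, 80, 100],
                    labels=labels, include_lowest=True)

# value_counts() on a categorical column also reports empty bins;
# sort_index() orders the result by bin instead of by count
counts = df['bins'].value_counts().sort_index()
print(counts)
```

Here the '61 to 80' bin is listed with a count of 0, which unique() would not show because it returns only the bins that actually occur.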