Python | Math operations for Data analysis
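Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.
Several important math operations can be performed on a pandas series; they simplify data analysis in Python and save a lot of time.
The examples below read their data from a CSV file named stock.csv.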
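import pandas as pd
# read the one-column csv file and squeeze it into a series
# (the squeeze=True argument of read_csv was removed in pandas 2.0)
s = pd.read_csv("stock.csv").squeeze("columns")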
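As a minimal sketch of what this loading step produces, the snippet below reads a one-column CSV from an in-memory string instead of stock.csv; the three prices are made up for illustration, and the column name "Stock Price" is taken from the output shown later in this article. It uses `DataFrame.squeeze("columns")` rather than the `squeeze=True` argument, which recent pandas releases no longer accept.

```python
import io

import pandas as pd

# Hypothetical stand-in for stock.csv: a single column named "Stock Price".
csv_text = "Stock Price\n50.12\n54.10\n54.65\n"

# read_csv always returns a DataFrame; squeeze("columns") converts a
# single-column DataFrame into a Series.
df = pd.read_csv(io.StringIO(csv_text))
s = df.squeeze("columns")

print(type(df).__name__)  # DataFrame
print(type(s).__name__)   # Series
print(s.name)             # Stock Price
```

If the file had more than one column, `squeeze("columns")` would leave the DataFrame unchanged, so the series methods below assume a single-column file.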
Function | Use |
---|---|
s.sum() | Returns the sum of all values in the series |
s.mean() | Returns the mean of all values in the series. Equal to s.sum()/s.count() |
s.std() | Returns the standard deviation of all values |
s.min() or s.max() | Returns the minimum or maximum value of the series |
s.idxmin() or s.idxmax() | Returns the index of the minimum or maximum value in the series |
s.median() | Returns the median of all values |
s.mode() | Returns the mode(s) of the series |
s.value_counts() | Returns a series with the frequency of each value |
s.describe() | Returns a series of summary statistics (count, mean, std, min, quartiles and max for numeric data), depending on the dtype of the data passed |
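To make the table concrete, here is a small sketch on a hand-made series (the values are invented for illustration and are not from stock.csv). It also shows why the mean equals s.sum()/s.count(): both sum() and count() skip missing values by default.

```python
import pandas as pd

# Made-up values for illustration; None becomes NaN in a float series.
s = pd.Series([10.0, 20.0, 20.0, None, 50.0])

total = s.sum()      # NaN is skipped by default: 100.0
n = s.count()        # counts non-missing values only: 4
average = s.mean()   # 25.0, the same as total / n

print(total, n, average)
print(s.idxmin(), s.idxmax())  # index labels of the min and max: 0 4
print(s.mode().tolist())       # most frequent value(s): [20.0]
print(s.median())              # 20.0
```

Note that len(s) would return 5 here, because it counts the missing entry as well; dividing the sum by len(s) would therefore not reproduce s.mean().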
Code #1:
Python3
# import pandas for reading the csv file
import pandas as pd
# read the one-column csv file and squeeze it into a series
# (the squeeze=True argument of read_csv was removed in pandas 2.0)
s = pd.read_csv("stock.csv").squeeze("columns")
# using count function
print(s.count())
# using sum function
print(s.sum())
# using mean function
print(s.mean())
# calculating the average by hand
print(s.sum() / s.count())
# using std function
print(s.std())
# using min function
print(s.min())
# using max function
print(s.max())
# using median function
print(s.median())
# using mode function
print(s.mode())
Output:
3012
1006942.0
334.3100929614874
334.3100929614874
173.18720477113115
49.95
782.22
283.315
0 291.21
dtype: float64
Code #2:
Python3
# import pandas for reading the csv file
import pandas as pd
# read the one-column csv file and squeeze it into a series
# (the squeeze=True argument of read_csv was removed in pandas 2.0)
s = pd.read_csv("stock.csv").squeeze("columns")
# using describe function
print(s.describe())
# using idxmax function
print(s.idxmax())
# using idxmin function
print(s.idxmin())
# the three most frequent values and their counts
print(s.value_counts().head(3))
Output:
count 3012.000000
mean 334.310093
std 173.187205
min 49.950000
25% 218.045000
50% 283.315000
75% 443.000000
max 782.220000
Name: Stock Price, dtype: float64
3011
11
291.21 5
288.47 3
194.80 3
Name: Stock Price, dtype: int64
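The summary above is what describe() returns for numeric data. As a small sketch of how the result depends on dtype, the snippet below calls describe() and value_counts() on a series of strings; the ticker labels are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical labels, not taken from stock.csv.
tickers = pd.Series(["AAA", "BBB", "AAA", "CCC", "AAA"])

# For non-numeric data, describe() reports count, unique, top and freq
# instead of mean, std and quartiles.
summary = tickers.describe()
print(summary)

# value_counts() sorts by frequency, so head(2) gives the two most
# frequent labels together with their counts.
counts = tickers.value_counts()
print(counts.head(2))
```

Here "top" is the most frequent value and "freq" is how often it occurs, which is the closest describe() comes to reporting a mode.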
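Unexpected outputs and restrictions:
- .sum(), .mean(), .median(), .std() and similar arithmetic operations are meant for numeric data; on strings or other non-numeric data they generally fail or give meaningless results. Frequency-based methods such as .mode() and .value_counts() still work on strings.
- .sum() on a series of strings gives an unexpected output: instead of a number, it returns a single string made by concatenating every string in the series.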
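The concatenation behaviour can be seen in a short sketch. The strings are made up, and the series is created with dtype=object explicitly, since that is the dtype for which this behaviour is long established; the exact exception raised by mean() may differ between pandas versions, so the code catches it loosely.

```python
import pandas as pd

# Made-up strings; dtype=object is set explicitly (an assumption of this sketch).
words = pd.Series(["ab", "cd", "ef"], dtype=object)

# sum() falls back to Python's + operator, which concatenates strings.
joined = words.sum()
print(joined)  # abcdef

# mean() has no sensible meaning for strings and is expected to fail.
try:
    words.mean()
    mean_failed = False
except (TypeError, ValueError):
    mean_failed = True
print(mean_failed)
```

When a column holds numbers stored as text, converting it first (for example with pd.to_numeric) avoids both problems.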