ML | Kolmogorov-Smirnov Test

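The Kolmogorov-Smirnov test is an efficient way to determine whether two samples differ significantly from each other. It is commonly used to check the uniformity of random numbers. Uniformity is one of the most important properties of any random number generator, and the Kolmogorov-Smirnov test can be used to test it.
The test can also be used to check whether the one-dimensional probability distributions underlying two samples differ.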

The Kolmogorov–Smirnov statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples.
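
The two-sample case mentioned above can be sketched with `scipy.stats.ks_2samp`. This is only an illustration: the two samples, their sizes, and the seed are made up for the example, and NumPy is assumed to be available.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)               # fixed seed so the run is repeatable
sample_a = rng.uniform(0.0, 1.0, size=500)   # hypothetical sample from U[0, 1]
sample_b = rng.normal(0.0, 1.0, size=500)    # hypothetical sample from N(0, 1)

# The statistic is the largest gap between the two empirical distribution functions.
result = ks_2samp(sample_a, sample_b)
print(result.statistic, result.pvalue)
```

Because the two samples come from clearly different distributions, the p-value should be far below 0.05 here.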

To use the test for checking the uniformity of random numbers, we use the CDF (cumulative distribution function) of U[0, 1].

F(x)=x  for 0<=x<=1

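Empirical CDF: Sn(x) = (number of R1, R2, …, Rn that are <= x) / N, where R1, R2, …, Rn is the array of random numbers and N is its length. The random numbers must lie in the range [0, 1].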
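
The two definitions above can be turned into a small hand computation. The sketch below assumes NumPy and uses the usual "less than or equal" convention for the empirical CDF; the helper names `empirical_cdf` and `ks_statistic_uniform` and the five sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def empirical_cdf(sample, x):
    """Fraction of the observations in `sample` that are <= x."""
    sample = np.asarray(sample)
    return np.count_nonzero(sample <= x) / sample.size

def ks_statistic_uniform(sample):
    """Largest gap between the empirical CDF and F(x) = x on [0, 1]."""
    r = np.sort(np.asarray(sample))
    n = r.size
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - r)          # empirical CDF above F, measured at each point
    d_minus = np.max(r - (i - 1) / n)   # F above the empirical CDF, just before each point
    return max(d_plus, d_minus)

numbers = [0.05, 0.14, 0.44, 0.81, 0.93]   # illustrative values in [0, 1]
print(empirical_cdf(numbers, 0.5))         # 3 of the 5 values are <= 0.5
print(ks_statistic_uniform(numbers))       # D for this sample
```

For these five values the largest gap is 0.26, reached at the second-smallest number (2/5 - 0.14).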

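Hypotheses used:

H₀ (null hypothesis): the numbers are uniformly distributed between 0 and 1.
If we can reject the null hypothesis, the numbers are not uniformly distributed between 0 and 1. Failing to reject the null hypothesis, however, does not necessarily mean that the numbers follow the uniform distribution.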

The kstest function in SciPy (Python):

Return values:

statistic: the calculated value of D, where D = max |F(x) - Sn(x)| taken over all x.
-> D is compared with D_alpha, where alpha is the level of significance. Alpha is the probability of rejecting the null hypothesis (H₀) when it is in fact true. Most practical applications use alpha = 0.05.
pvalue: calculated from D and the sample size.
-> If pvalue > alpha, we fail to reject the null hypothesis. Otherwise, we conclude that the numbers are not uniform. A large p-value means the sample gives no evidence against uniformity; the closer the empirical CDF is to the uniform CDF, the closer the statistic is to 0 and the p-value to 1.
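
The decision rule described above might look like the following sketch. The sample size, the seed, and the use of `scipy.stats.kstwobign` for a large-sample approximation of D_alpha are choices made for this example, not something prescribed by the text.

```python
import numpy as np
from scipy.stats import kstest, kstwobign

alpha = 0.05                       # level of significance
rng = np.random.default_rng(42)    # arbitrary seed, chosen for repeatability
sample = rng.random(1000)          # values in [0, 1)

result = kstest(sample, "uniform")  # H0: the sample comes from U[0, 1]

# Large-sample approximation of the critical value D_alpha:
# sqrt(n) * D asymptotically follows the Kolmogorov distribution.
d_alpha = kstwobign.ppf(1 - alpha) / np.sqrt(sample.size)

print(f"D = {result.statistic:.4f}, D_alpha ~ {d_alpha:.4f}, p-value = {result.pvalue:.4f}")
if result.pvalue > alpha:
    print("Fail to reject H0: no evidence against uniformity")
else:
    print("Reject H0: the numbers do not look uniform")
```

Comparing D with D_alpha and comparing the p-value with alpha are two views of the same decision; they can disagree slightly near the boundary because D_alpha here is only an approximation.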

Python3

from scipy.stats import kstest
import random

# Number of random numbers to test
N = 5

# Draw N values from Python's generator; each lies in [0, 1)
actual = [random.random() for _ in range(N)]

print(actual)
# Compare the sample with the U[0, 1] CDF; the result holds D and the p-value
result = kstest(actual, "uniform")
print(result)

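Output: the list of generated numbers, followed by a KstestResult holding the statistic and the p-value. The values change on every run because the sample is random.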

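The KS test is a powerful way to automatically distinguish samples that come from different distributions. The kstest function can also be used to check whether the given data follow a normal distribution. It compares the observed cumulative relative frequencies with those expected under the normal distribution. The Kolmogorov-Smirnov test uses the maximum absolute difference between the observed and the expected cumulative distributions.

The null hypothesis here is that the numbers follow the normal distribution. When "norm" is passed without further arguments, this means the standard normal distribution (mean 0, standard deviation 1).
The function works exactly as before: it again returns the statistic and the p-value. If the p-value is < alpha, we reject the null hypothesis.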

Python3

from scipy.stats import kstest
import random

# Number of random numbers to test
N = 10

# Draw N values from Python's generator; each lies in [0, 1).
# These are uniform, not normal, so the test should reject H0.
actual = [random.random() for _ in range(N)]

print(actual)
# Compare the sample with the standard normal CDF
result = kstest(actual, "norm")
print(result)

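Output: the generated numbers and the KstestResult. The sample here is uniform on [0, 1), not normal, so a small p-value is expected.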
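
For comparison, here is a sketch of the same check on data that really are normal. The variable `heights`, its mean of 170 and standard deviation of 10, and the seed are all hypothetical; `args` passes the location and scale to the normal CDF.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(7)                          # arbitrary seed for repeatability
heights = rng.normal(loc=170.0, scale=10.0, size=200)   # hypothetical normal data

# Against the standard normal N(0, 1): the location and scale do not match,
# so the null hypothesis is rejected even though the data are normal.
against_standard = kstest(heights, "norm")

# Against N(170, 10): args supplies (loc, scale) to the normal CDF.
against_matching = kstest(heights, "norm", args=(170.0, 10.0))

print(against_standard)
print(against_matching)
```

The parameters passed through `args` are fixed in advance here. If they were estimated from the same sample, the reported p-value would tend to be too large, so it should be read with caution.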