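How to Perform a Chi-Square Goodness of Fit Test in Python
In this article, we will look at how to perform a chi-square goodness of fit test in Python.
The chi-square goodness of fit test is a nonparametric statistical hypothesis test used to determine how much the observed values of an event differ from the expected values. It helps us check whether a variable comes from a particular distribution, or whether a sample is representative of a population. The observed probability distribution is compared with the expected probability distribution.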
Null hypothesis: The variable follows the predetermined distribution.
Alternative hypothesis: The variable deviates from the expected distribution.
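Example 1: Using the stats.chisquare() function
In this approach we use the stats.chisquare() function from the scipy.stats module, which returns the chi-square goodness of fit test statistic and the p-value.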
Syntax: stats.chisquare(f_obs, f_exp)
Parameters:
- f_obs: an array of observed values.
- f_exp: an array of expected values.
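As a quick illustration of the two parameters, here is a minimal sketch using made-up counts (the rolls data below is hypothetical and not part of this article's example). It relies on the fact that f_exp is optional: when it is omitted, stats.chisquare() treats all categories as equally likely.

```python
import scipy.stats as stats

# Hypothetical counts of each face in 88 rolls of a die (illustrative data only)
rolls = [16, 18, 16, 14, 12, 12]

# f_exp omitted: every category gets the same expected frequency (88 / 6)
statistic_default, p_default = stats.chisquare(rolls)

# Passing the equal expected frequencies explicitly gives the same result
expected = [sum(rolls) / len(rolls)] * len(rolls)
statistic_explicit, p_explicit = stats.chisquare(rolls, f_exp=expected)

print(statistic_default, p_default)
print(statistic_explicit, p_explicit)
```

Note that the observed and expected values should add up to the same total; recent SciPy versions raise an error when they do not.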
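In the example below we also use stats.chi2.ppf(), which takes the cumulative probability 1 - alpha (where alpha is the significance level) and the degrees of freedom, and returns the chi-square critical value. With 7 categories there are 7 - 1 = 6 degrees of freedom. If the chi-square statistic is greater than the critical value, we reject the null hypothesis; if it is less than or equal to the critical value, we fail to reject it. In the example below, the chi-square statistic is 5.0127344877344875 and the critical value is 12.591587243743977. Since the statistic does not exceed the critical value, we fail to reject the null hypothesis.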
Python3
# importing packages
import scipy.stats as stats

# no of hours a student studies
# in a week vs expected no of hours
observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]

# Chi-Square Goodness of Fit Test
chi_square_test_statistic, p_value = stats.chisquare(
    observed_data, expected_data)

# chi square test statistic and p value
print('chi_square_test_statistic is : ' +
      str(chi_square_test_statistic))
print('p_value : ' + str(p_value))

# find Chi-Square critical value
# (significance level 0.05, df = 7 categories - 1 = 6)
print(stats.chi2.ppf(1-0.05, df=6))
Output:
chi_square_test_statistic is : 5.0127344877344875
p_value : 0.542180861413329
12.591587243743977
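The decision rule described above can also be written out in code. The following is a minimal sketch, not part of the original example: the variable names alpha, df, reject_by_critical_value and reject_by_p_value are our own, and the degrees of freedom are derived from the number of categories instead of being hard-coded.

```python
import scipy.stats as stats

observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]
alpha = 0.05  # significance level, same value as used for the critical value above

statistic, p_value = stats.chisquare(observed_data, expected_data)

# Degrees of freedom: number of categories minus one
df = len(observed_data) - 1
critical_value = stats.chi2.ppf(1 - alpha, df=df)

# Two equivalent ways to reach the decision
reject_by_critical_value = statistic > critical_value
reject_by_p_value = p_value < alpha

if reject_by_critical_value:
    print('Reject the null hypothesis')
else:
    print('Fail to reject the null hypothesis')
```

Comparing the statistic with the critical value and comparing the p-value with alpha lead to the same conclusion, so in practice checking the p-value alone is usually enough.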
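Example 2: Computing the chi-square test statistic by implementing the formula
In this approach we implement the formula directly. As the output shows, we obtain the same chi-square value as in Example 1.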
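For reference, the formula in question is the standard Pearson chi-square statistic. The symbols are our own notation, not the article's: O_i is the i-th observed value, E_i is the i-th expected value, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

With the k = 7 categories used in this article, the test has k - 1 = 6 degrees of freedom.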
Python3
# importing packages
import scipy.stats as stats
import numpy as np

# no of hours a student studies
# in a week vs expected no of hours
observed_data = [8, 6, 10, 7, 8, 11, 9]
expected_data = [9, 8, 11, 8, 10, 7, 6]

# determining chi square goodness of fit using formula:
# sum of (observed - expected)^2 / expected over all categories
chi_square_test_statistic1 = 0
for i in range(len(observed_data)):
    chi_square_test_statistic1 = chi_square_test_statistic1 + \
        (np.square(observed_data[i]-expected_data[i]))/expected_data[i]

print('chi square value determined by formula : ' +
      str(chi_square_test_statistic1))

# find Chi-Square critical value
print(stats.chi2.ppf(1-0.05, df=6))
Output:
chi square value determined by formula : 5.0127344877344875
12.591587243743977
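The loop above can also be written without an explicit loop, and the p-value from Example 1 can be recovered from the hand-computed statistic. The following is a sketch under our own naming (chi_square_vectorized, p_value_manual); it uses stats.chi2.sf(), the upper-tail probability of the chi-square distribution, which is not used elsewhere in this article.

```python
import numpy as np
import scipy.stats as stats

observed = np.array([8, 6, 10, 7, 8, 11, 9], dtype=float)
expected = np.array([9, 8, 11, 8, 10, 7, 6], dtype=float)

# Same formula as the loop, applied element-wise to the whole arrays
chi_square_vectorized = np.sum((observed - expected) ** 2 / expected)

# p-value: probability of a chi-square value at least this large
# under the null hypothesis, with (number of categories - 1) degrees of freedom
df = len(observed) - 1
p_value_manual = stats.chi2.sf(chi_square_vectorized, df)

print(chi_square_vectorized)
print(p_value_manual)
```

Both numbers should agree with the statistic and p-value reported by stats.chisquare() in Example 1, up to floating-point rounding.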