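Pearson Correlation Coefficient
A correlation coefficient measures how strong the relationship between two variables is. Several formulas exist for computing one; the most widely used is the Pearson correlation coefficient (also called Pearson's R), which is commonly used alongside linear regression. It is denoted by the symbol "R". The formula returns a value between -1 and 1, where:
- -1 indicates a perfect negative linear relationship
- 1 indicates a perfect positive linear relationship
- 0 indicates no linear relationship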
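Pearson Correlation Coefficient Formula
The Pearson formula is the most commonly used way to compute a correlation coefficient. The result is denoted by a capital "R". The formula is:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
Here n is the number of (x, y) pairs.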
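The formula maps directly onto a few lines of code. The following is a minimal Python sketch rather than a reference implementation; the function name `pearson_r` and the sample data are illustrative assumptions, not part of the original text.

```python
import math


def pearson_r(xs, ys):
    """Pearson's R computed from the raw sums that appear in the formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # One of the variables is constant, so R is undefined.
        raise ValueError("R is undefined when a variable has zero variance")
    return numerator / denominator


# Illustrative data (an assumption): y is exactly 2 * x.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

Note that the whole numerator is divided by the whole square root; grouping the terms explicitly avoids the precedence ambiguity of a one-line formula.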
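The full name of the Pearson correlation coefficient is the Pearson product-moment correlation (PPMC). It describes the linear relationship between two sets of data.
Pearson correlation measures both the strength of the linear relationship between two variables (given by the coefficient r, which lies between -1 and +1) and the evidence that a relationship exists (given by the p-value). If the result is statistically significant, we conclude that a correlation exists.
Cohen (1988) classified an absolute value of r of 0.5 as large, 0.3 as medium, and 0.1 as small.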
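Cohen's cut-offs can be written as a small lookup. This is a sketch under two assumptions: each cut-off is treated as the lower edge of its band, and the label for values below 0.1 (which the cut-offs above do not name) is invented here for illustration. The function name `cohen_label` is also hypothetical.

```python
def cohen_label(r):
    """Label the size of a correlation using the 0.1 / 0.3 / 0.5 cut-offs."""
    size = abs(r)  # only the magnitude matters for effect size
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    # Assumption: the quoted cut-offs stop at 0.1, so this label is made up.
    return "below small"


# Made-up coefficients for demonstration.
for r in (0.62, -0.35, 0.12, 0.04):
    print(r, cohen_label(r))
```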
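The Pearson correlation coefficient is interpreted as follows:
- A correlation coefficient of 1 means that for every increase in one variable, the other increases by a fixed proportion. For example, shoe size goes up in step with foot length.
- A correlation coefficient of 0 means that there is no linear relationship between the variables.
- A correlation coefficient of -1 means that for every increase in one variable, the other decreases by a fixed proportion. For example, the amount of water left in a tank falls in proportion to how much has flowed out through the tap.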
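The three cases can be reproduced with constructed data. The sketch below uses the deviation-from-the-mean form of the coefficient, which is algebraically equivalent to the raw-sum formula; the data and the name `pearson_from_deviations` are assumptions made for illustration. The third data set shows that a coefficient of 0 rules out only a linear relationship: y is fully determined by x there, but along a curve.

```python
import math


def pearson_from_deviations(xs, ys):
    """Pearson's R computed from deviations about the two means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariation = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return covariation / spread


xs = [-2, -1, 0, 1, 2]                # made-up values
rising = [2 * x + 3 for x in xs]      # increases by a fixed proportion
falling = [-2 * x + 3 for x in xs]    # decreases by a fixed proportion
curved = [x * x for x in xs]          # depends on x, but not linearly

print(pearson_from_deviations(xs, rising))
print(pearson_from_deviations(xs, falling))
print(pearson_from_deviations(xs, curved))
```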
Steps to find the correlation coefficient using the Pearson formula:
Step 1: Make a table of the given data with columns for the subject, x, and y, and add three more columns: xy, x², and y².
Step 2: Multiply each x by its y to fill in the xy column. For example, if x is 24 and y is 65, then xy is 24 × 65 = 1560.
Step 3: Square each number in the x column to fill in the x² column.
Step 4: Square each number in the y column to fill in the y² column.
Step 5: Add up the values in each column and write the totals at the bottom. The Greek letter sigma (Σ) is shorthand for summation.
Step 6: Substitute the totals into the formula for Pearson's correlation coefficient:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
The sign of the result shows whether the relationship between the variables is positive or negative.
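Steps 1 to 5 amount to building a table and totalling its columns, which can be sketched as follows. The helper name `build_sum_table` and the second data pair are assumptions; the first pair reuses the 24 and 65 from Step 2.

```python
def build_sum_table(xs, ys):
    """Return the per-row table (Steps 1-4) and the column totals (Step 5)."""
    rows = [
        {"x": x, "y": y, "xy": x * y, "x2": x * x, "y2": y * y}
        for x, y in zip(xs, ys)
    ]
    totals = {
        key: sum(row[key] for row in rows)
        for key in ("x", "y", "xy", "x2", "y2")
    }
    return rows, totals


# Made-up data; only the pair (24, 65) comes from the text.
rows, totals = build_sum_table([24, 30], [65, 70])
for row in rows:
    print(row)
print("totals:", totals)
```

The totals dictionary holds exactly the five sums that Step 6 substitutes into the formula, alongside n, the number of rows.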
Sample Problems
Problem 1: Given the following correlation coefficients, state whether each relationship is positive or negative.
0.69, 0.42, -0.23, -0.99
Solution:
The given correlation coefficients are:
0.69, 0.42, -0.23, -0.99
The sign gives the direction of each relationship, and the magnitude gives its strength:
0.69: a strong positive relationship
0.42: a moderate positive relationship
-0.23: a weak negative relationship
-0.99: a very strong negative relationship
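Reading the direction off the sign is simple enough to automate. The sketch below also rejects values outside the range -1 to 1, since no valid correlation coefficient can fall there; the function name `describe_direction` is an illustrative assumption.

```python
def describe_direction(r):
    """Report whether a correlation coefficient is positive, negative, or zero."""
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"{r} is not a valid correlation coefficient")
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear relationship"


for r in (0.69, 0.42, -0.23, -0.99):
    print(r, describe_direction(r))
```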
Problem 2: Use the Pearson correlation coefficient formula to calculate the correlation coefficient for the following data:
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25.
Solution:
The given variables are
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25.
To find the correlation coefficient, first construct the table below to get the values the formula needs, then add up each column.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 10 | 5 | 50 | 100 | 25 |
| 13 | 10 | 130 | 169 | 100 |
| 15 | 15 | 225 | 225 | 225 |
| 17 | 20 | 340 | 289 | 400 |
| 19 | 25 | 475 | 361 | 625 |
| ∑ 74 | ∑ 75 | ∑ 1220 | ∑ 1144 | ∑ 1375 |
∑xy = 1220
∑x = 74
∑y = 75
∑x² = 1144
∑y² = 1375
n = 5
Substitute these values into the Pearson correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [5(1220) - (74)(75)] / √([5(1144) - (74)²][5(1375) - (75)²])
R = 550 / √([244][1250])
R = 550 / 552.27
R = 0.9959
The correlation coefficient is approximately 0.996.
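As an independent check, the sums and the coefficient for the Problem 2 data can be recomputed directly from the X and Y values. This is only a verification sketch; the variable names are arbitrary.

```python
import math

xs = [10, 13, 15, 17, 19]
ys = [5, 10, 15, 20, 25]
n = len(xs)

# Column totals of the x, y, xy, x², and y² columns.
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Numerator and the two bracketed factors under the square root.
numerator = n * sum_xy - sum_x * sum_y
x_factor = n * sum_x2 - sum_x ** 2
y_factor = n * sum_y2 - sum_y ** 2
r = numerator / math.sqrt(x_factor * y_factor)

print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("numerator:", numerator, "factors:", x_factor, y_factor)
print("R:", round(r, 4))
```

Because Y rises by exactly 5 at every step while X rises by 2 or 3, the points lie close to a straight line, so a coefficient near 1 is what one should expect here.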
Problem 3: Use the Pearson correlation coefficient formula to calculate the correlation coefficient for the following table:
| Subject | Age (X) | Weight (Y) |
|---|---|---|
| 1 | 40 | 99 |
| 2 | 25 | 79 |
| 3 | 22 | 69 |
| 4 | 54 | 89 |
Solution:
Make a table from the given data and add three more columns for XY, X², and Y². Then add up each column to get ∑xy, ∑x, ∑y, ∑x², and ∑y², with n = 4.
| Subject | Age (X) | Weight (Y) | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 40 | 99 | 3960 | 1600 | 9801 |
| 2 | 25 | 79 | 1975 | 625 | 6241 |
| 3 | 22 | 69 | 1518 | 484 | 4761 |
| 4 | 54 | 89 | 4806 | 2916 | 7921 |
| ∑ | 141 | 336 | 12259 | 5625 | 28724 |
∑xy = 12259
∑x = 141
∑y = 336
∑x² = 5625
∑y² = 28724
n = 4
Substitute these values into the Pearson correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [4(12259) - (141)(336)] / √([4(5625) - (141)²][4(28724) - (336)²])
R = 1660 / √([2619][2000])
R = 1660 / 2288.67
R = 0.7253
The correlation coefficient is approximately 0.725.
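Hand-tabulated totals are easy to get wrong, and two properties help catch slips. Each bracketed factor under the square root equals n times a sum of squared deviations, so it can never be negative, and the resulting R must lie between -1 and 1. The sketch below rejects totals that violate the first property; `r_from_totals` is a hypothetical helper, shown here with totals recomputed from the age and weight data.

```python
import math


def r_from_totals(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Compute R from column totals, rejecting totals that cannot be right."""
    x_factor = n * sum_x2 - sum_x ** 2
    y_factor = n * sum_y2 - sum_y ** 2
    # Each factor is n times a sum of squared deviations: a negative value
    # means an addition slip, and zero means the column is constant.
    if x_factor <= 0 or y_factor <= 0:
        raise ValueError("column totals are inconsistent; recheck the table")
    return (n * sum_xy - sum_x * sum_y) / math.sqrt(x_factor * y_factor)


ages = [40, 25, 22, 54]
weights = [99, 79, 69, 89]
totals = (
    len(ages),
    sum(ages),
    sum(weights),
    sum(a * w for a, w in zip(ages, weights)),
    sum(a * a for a in ages),
    sum(w * w for w in weights),
)
print(totals)
print(round(r_from_totals(*totals), 4))
```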
Problem 4: Use the Pearson correlation coefficient formula to calculate the correlation coefficient for the following data:
X = 5, 9, 14, 16
and
Y = 6, 10, 16, 20.
Solution:
The given variables are
X = 5, 9, 14, 16
and
Y = 6, 10, 16, 20.
To find the correlation coefficient, first construct the table below to get the values the formula needs, then add up each column.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 5 | 6 | 30 | 25 | 36 |
| 9 | 10 | 90 | 81 | 100 |
| 14 | 16 | 224 | 196 | 256 |
| 16 | 20 | 320 | 256 | 400 |
| ∑ 44 | ∑ 52 | ∑ 664 | ∑ 558 | ∑ 792 |
∑xy = 664
∑x = 44
∑y = 52
∑x² = 558
∑y² = 792
n = 4
Substitute these values into the Pearson correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [4(664) - (44)(52)] / √([4(558) - (44)²][4(792) - (52)²])
R = 368 / √([296][464])
R = 368 / 370.599
R = 0.993
The correlation coefficient is approximately 0.993.
Problem 5: Use the Pearson correlation coefficient formula to calculate the correlation coefficient for the following data:
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
Solution:
The given variables are
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
To find the correlation coefficient, first construct the table below to get the values the formula needs, then add up each column.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 21 | 70 | 1470 | 441 | 4900 |
| 31 | 55 | 1705 | 961 | 3025 |
| 25 | 60 | 1500 | 625 | 3600 |
| 40 | 78 | 3120 | 1600 | 6084 |
| 47 | 66 | 3102 | 2209 | 4356 |
| 38 | 80 | 3040 | 1444 | 6400 |
| ∑ 202 | ∑ 409 | ∑ 13937 | ∑ 7280 | ∑ 28365 |
∑xy = 13937
∑x = 202
∑y = 409
∑x² = 7280
∑y² = 28365
n = 6
Substitute these values into the Pearson correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [6(13937) - (202)(409)] / √([6(7280) - (202)²][6(28365) - (409)²])
R = 1004 / √([2876][2909])
R = 1004 / 2892.453
R = 0.3471
The correlation coefficient is approximately 0.3471.
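Whether a coefficient of this size is statistically significant depends on the sample size. As a rough sketch, the standard test statistic t = r√(n - 2) / √(1 - r²), with n - 2 degrees of freedom, can be compared with a critical value from a t-table. The value 2.776 used below is the two-sided 5% critical value for 4 degrees of freedom; the function name `t_statistic` is an illustrative assumption.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


xs = [21, 31, 25, 40, 47, 38]
ys = [70, 55, 60, 78, 66, 80]
n = len(xs)

# Pearson's R from the raw sums.
numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
denominator = math.sqrt(
    (n * sum(x * x for x in xs) - sum(xs) ** 2)
    * (n * sum(y * y for y in ys) - sum(ys) ** 2)
)
r = numerator / denominator
t = t_statistic(r, n)

T_CRITICAL_DF4 = 2.776  # two-sided 5% critical value, 4 degrees of freedom
print(round(r, 4), round(t, 3), abs(t) > T_CRITICAL_DF4)
```

With only six pairs, a moderate coefficient does not clear the critical value, so this sample alone would not support the conclusion that a correlation exists.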
Problem 6: Use the Pearson correlation coefficient formula to calculate the correlation coefficient for the following table:
| Subject | Height (X) | Weight (Y) |
|---|---|---|
| 1 | 43 | 78 |
| 2 | 24 | 68 |
| 3 | 26 | 85 |
| 4 | 35 | 67 |
Solution:
Make a table from the given data and add three more columns for XY, X², and Y². Then add up each column to get ∑xy, ∑x, ∑y, ∑x², and ∑y², with n = 4.
| Subject | Height (X) | Weight (Y) | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 43 | 78 | 3354 | 1849 | 6084 |
| 2 | 24 | 68 | 1632 | 576 | 4624 |
| 3 | 26 | 85 | 2210 | 676 | 7225 |
| 4 | 35 | 67 | 2345 | 1225 | 4489 |
| ∑ | 128 | 298 | 9541 | 4326 | 22422 |
∑xy = 9541
∑x = 128
∑y = 298
∑x² = 4326
∑y² = 22422
n = 4
Substitute these values into the Pearson correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [4(9541) - (128)(298)] / √([4(4326) - (128)²][4(22422) - (298)²])
R = 20 / √([920][884])
R = 20 / 901.82
R = 0.0222
The correlation coefficient is approximately 0.0222.