Linear Correlation Coefficient Formula
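The correlation coefficient measures how strong the relationship between two variables is. There are different formulas for obtaining a correlation coefficient; the most popular is the Pearson correlation coefficient (also called Pearson's R), which is commonly used in linear regression. The Pearson correlation coefficient is denoted by the symbol "R". The correlation coefficient formula returns a value between -1 and 1. Here,
- 1 indicates a strong positive relationship
- -1 indicates a strong negative relationship
- A result of zero indicates no linear relationship at all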
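Linear Correlation Coefficient Formula
The linear correlation coefficient is known as Pearson's r or the Pearson correlation coefficient. It reflects the direction and strength of the linear relationship between two variables x and y. It returns a value between -1 and +1, where -1 indicates a strong negative correlation and +1 indicates a strong positive correlation. A value of 0 means there is no correlation; this is also called zero correlation.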
A "crude estimate" for interpreting the strength of a correlation from Pearson's r:

| r value | Crude estimate |
|---|---|
| +.70 or higher | Very strong positive relationship |
| +.40 to +.69 | Strong positive relationship |
| +.30 to +.39 | Moderate positive relationship |
| +.20 to +.29 | Weak positive relationship |
| +.01 to +.19 | No or negligible relationship |
| 0 | No relationship (zero correlation) |
| -.01 to -.19 | No or negligible relationship |
| -.20 to -.29 | Weak negative relationship |
| -.30 to -.39 | Moderate negative relationship |
| -.40 to -.69 | Strong negative relationship |
| -.70 or lower | Very strong negative relationship |
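The crude-estimate bands can be expressed as a small lookup. The following is a minimal sketch, not a standard library routine: the function name `describe_strength` is hypothetical, and it assumes each band's lower bound acts as a threshold, so a value such as 0.695 that falls between two printed bands is assigned to the lower one.

```python
def describe_strength(r: float) -> str:
    """Map a Pearson r to a crude-estimate label (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and +1")
    if r == 0:
        return "no relationship (zero correlation)"

    size = abs(r)
    direction = "positive" if r > 0 else "negative"

    # Assumption: each band's lower bound is treated as a threshold.
    if size >= 0.70:
        return f"very strong {direction} relationship"
    if size >= 0.40:
        return f"strong {direction} relationship"
    if size >= 0.30:
        return f"moderate {direction} relationship"
    if size >= 0.20:
        return f"weak {direction} relationship"
    return "no or negligible relationship"


for value in (0.64, 0.46, -0.29, -0.95):
    print(value, "->", describe_strength(value))
```

Keeping the direction and the magnitude separate mirrors the table, which is symmetric around zero.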
The formula for the linear correlation coefficient of a data set is:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
Here n is the number of (x, y) pairs.
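The formula translates almost term by term into code. The sketch below is one way to write it in plain Python; the function name `pearson_r` and the sample lists are illustrative assumptions, not part of the original text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the raw sums used in the formula (illustrative sketch)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    x_term = n * sum_x2 - sum_x ** 2   # [n∑x² - (∑x)²]
    y_term = n * sum_y2 - sum_y ** 2   # [n∑y² - (∑y)²]
    if x_term == 0 or y_term == 0:
        raise ValueError("r is undefined when one variable is constant")

    return (n * sum_xy - sum_x * sum_y) / sqrt(x_term * y_term)


# Hypothetical data: y is exactly 2x, then the same y values reversed.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The explicit check for a zero term is a design choice: when every x (or every y) is the same, the denominator vanishes and the coefficient is undefined.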
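What are the types of linear correlation?
The linear correlation coefficient is given by Pearson's r, so its value lies between -1 and +1.
There are three types of linear correlation: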
- Positive values indicate a positive correlation (0 < r ≤ 1)
- Negative values indicate a negative correlation (-1 ≤ r < 0)
- A value of 0 indicates no correlation (r = 0)
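Positive correlation: the two variables move in the same direction. If one increases, the other also increases; if one decreases, the other also decreases. Whenever r is positive, it indicates a positive correlation.
Negative correlation: the two variables move in opposite directions. If one increases, the other decreases; if one decreases, the other increases. Whenever r is negative, it indicates a negative correlation.
No correlation: when there is no linear statistical association between the variables, they are said to be uncorrelated. In this case their correlation coefficient r is 0.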
Sample Problems
Problem 1: Calculate the correlation coefficient for the following data:
X = 12, 16, 4, 8
and
Y = 15, 20, 5, 10
Solution:
The given variables are
X = 12, 16, 4, 8
and
Y = 15, 20, 5, 10
To find the correlation coefficient, first construct a table with the columns XY, X², and Y², then total each column to get the sums used in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 12 | 15 | 180 | 144 | 225 |
| 16 | 20 | 320 | 256 | 400 |
| 4 | 5 | 20 | 16 | 25 |
| 8 | 10 | 80 | 64 | 100 |
| ∑ = 40 | ∑ = 50 | ∑ = 600 | ∑ = 480 | ∑ = 750 |

∑xy = 600
∑x = 40
∑y = 50
∑x² = 480
∑y² = 750
n = 4
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [4(600) - (40)(50)] / √([4(480) - (40)²][4(750) - (50)²])
R = 400 / √([320][500])
R = 400 / 400
R = 1
This shows a very strong positive relationship between the variables. With R = 1 the points lie exactly on a straight line: every Y value is 1.25 times its X value.
Problem 2: Find the value of the correlation coefficient from the following table:

| Subject | Age X | Glucose level Y |
|---|---|---|
| 1 | 42 | 98 |
| 2 | 23 | 68 |
| 3 | 22 | 73 |
| 4 | 47 | 79 |
| 5 | 50 | 88 |
| 6 | 60 | 82 |

Solution:
Make a table from the given data and add three more columns, XY, X², and Y². Total each column to get ∑xy, ∑x, ∑y, ∑x², and ∑y², with n = 6.

| Subject | Age X | Glucose level Y | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 42 | 98 | 4116 | 1764 | 9604 |
| 2 | 23 | 68 | 1564 | 529 | 4624 |
| 3 | 22 | 73 | 1606 | 484 | 5329 |
| 4 | 47 | 79 | 3713 | 2209 | 6241 |
| 5 | 50 | 88 | 4400 | 2500 | 7744 |
| 6 | 60 | 82 | 4920 | 3600 | 6724 |
| ∑ | 244 | 488 | 20319 | 11086 | 40266 |

∑xy = 20319
∑x = 244
∑y = 488
∑x² = 11086
∑y² = 40266
n = 6
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [6(20319) - (244)(488)] / √([6(11086) - (244)²][6(40266) - (488)²])
R = 2842 / √([6980][3452])
R = 2842 / 4908.662
R = 0.5790
This shows a strong positive relationship between the variables.
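Building the XY, X², and Y² columns by hand is where most slips occur, so it can help to regenerate the table by machine. The sketch below does that for the age and glucose data of this problem; the helper name `sums_table` is an assumption made for illustration.

```python
def sums_table(xs, ys):
    """Return per-pair rows (x, y, xy, x², y²) and the column totals."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]
    # zip(*rows) transposes the rows into columns so each can be summed.
    totals = tuple(sum(column) for column in zip(*rows))
    return rows, totals


# Age (X) and glucose level (Y) for the six subjects.
ages = [42, 23, 22, 47, 50, 60]
glucose = [98, 68, 73, 79, 88, 82]

rows, totals = sums_table(ages, glucose)

header = ("X", "Y", "XY", "X²", "Y²")
print("".join(f"{name:>8}" for name in header))
for row in rows:
    print("".join(f"{value:>8}" for value in row))
print("".join(f"{value:>8}" for value in totals), " <- column totals")
```

Comparing each printed product and total against the hand-built table is a quick way to find a mis-multiplied cell before it propagates into R.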
Problem 3: Calculate the correlation coefficient for the following data:
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
Solution:
The given variables are
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
To find the correlation coefficient, first construct a table with the columns XY, X², and Y², then total each column to get the sums used in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 21 | 70 | 1470 | 441 | 4900 |
| 31 | 55 | 1705 | 961 | 3025 |
| 25 | 60 | 1500 | 625 | 3600 |
| 40 | 78 | 3120 | 1600 | 6084 |
| 47 | 66 | 3102 | 2209 | 4356 |
| 38 | 80 | 3040 | 1444 | 6400 |
| ∑ = 202 | ∑ = 409 | ∑ = 13937 | ∑ = 7280 | ∑ = 28365 |

∑xy = 13937
∑x = 202
∑y = 409
∑x² = 7280
∑y² = 28365
n = 6
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [6(13937) - (202)(409)] / √([6(7280) - (202)²][6(28365) - (409)²])
R = 1004 / √([2876][2909])
R = 1004 / 2892.453
R = 0.3471
This shows a moderate positive relationship between the variables.
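One way to cross-check a hand computation is to use the algebraically equivalent deviation form, r = ∑(x - x̄)(y - ȳ) / √(∑(x - x̄)² ∑(y - ȳ)²), which works from the means instead of the raw sums. This form is not used elsewhere in this article; the sketch below, with the hypothetical name `pearson_r_centered`, applies it to the data of this problem.

```python
from math import sqrt


def pearson_r_centered(xs, ys):
    """Pearson's r from deviations about the means (equivalent to the sums form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    cross = sum(a * b for a, b in zip(dev_x, dev_y))   # ∑(x - x̄)(y - ȳ)
    ss_x = sum(a * a for a in dev_x)                   # ∑(x - x̄)²
    ss_y = sum(b * b for b in dev_y)                   # ∑(y - ȳ)²
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")

    return cross / sqrt(ss_x * ss_y)


x = [21, 31, 25, 40, 47, 38]
y = [70, 55, 60, 78, 66, 80]
r = pearson_r_centered(x, y)
print(round(r, 4))
```

Because the two forms share no intermediate totals, agreement between them is reasonable evidence that the column sums and the sign of the result are right.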
Problem 4: Calculate the correlation coefficient for the following data:
X = 12, 10, 42, 27, 35, 56
and
Y = 13, 15, 56, 34, 65, 26
Solution:
The given variables are
X = 12, 10, 42, 27, 35, 56
and
Y = 13, 15, 56, 34, 65, 26
To find the correlation coefficient, first construct a table with the columns XY, X², and Y², then total each column to get the sums used in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 12 | 13 | 156 | 144 | 169 |
| 10 | 15 | 150 | 100 | 225 |
| 42 | 56 | 2352 | 1764 | 3136 |
| 27 | 34 | 918 | 729 | 1156 |
| 35 | 65 | 2275 | 1225 | 4225 |
| 56 | 26 | 1456 | 3136 | 676 |
| ∑ = 182 | ∑ = 209 | ∑ = 7307 | ∑ = 7098 | ∑ = 9587 |

∑xy = 7307
∑x = 182
∑y = 209
∑x² = 7098
∑y² = 9587
n = 6
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [6(7307) - (182)(209)] / √([6(7098) - (182)²][6(9587) - (209)²])
R = 5804 / √([9464][13841])
R = 5804 / 11445.139
R = 0.5071
This shows a strong positive relationship between the variables.
Problem 5: Given the following correlation coefficients, state whether the relationship between the variables is positive or negative:
0.64
0.46
-0.29
-0.95
Solution:
The given correlation coefficients are interpreted as follows:
0.64
The relationship between the variables is a strong positive relationship.
0.46
The relationship between the variables is a strong positive relationship.
-0.29
The relationship between the variables is a weak negative relationship.
-0.95
The relationship between the variables is a very strong negative relationship.
Problem 6: Calculate the correlation coefficient for the following data:
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25
Solution:
The given variables are
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25
To find the correlation coefficient, first construct a table with the columns XY, X², and Y², then total each column to get the sums used in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 10 | 5 | 50 | 100 | 25 |
| 13 | 10 | 130 | 169 | 100 |
| 15 | 15 | 225 | 225 | 225 |
| 17 | 20 | 340 | 289 | 400 |
| 19 | 25 | 475 | 361 | 625 |
| ∑ = 74 | ∑ = 75 | ∑ = 1220 | ∑ = 1144 | ∑ = 1375 |

∑xy = 1220
∑x = 74
∑y = 75
∑x² = 1144
∑y² = 1375
n = 5
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [5(1220) - (74)(75)] / √([5(1144) - (74)²][5(1375) - (75)²])
R = 550 / √([244][1250])
R = 550 / 552.268
R = 0.9959
This shows a very strong positive relationship between the variables: Y rises steadily as X rises.
Problem 7: Find the value of the correlation coefficient from the following table:

| Subject | Age X | Weight Y |
|---|---|---|
| 1 | 40 | 99 |
| 2 | 25 | 79 |
| 3 | 22 | 69 |
| 4 | 54 | 89 |

Solution:
Make a table from the given data and add three more columns, XY, X², and Y², then total each column.

| Subject | Age X | Weight Y | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 40 | 99 | 3960 | 1600 | 9801 |
| 2 | 25 | 79 | 1975 | 625 | 6241 |
| 3 | 22 | 69 | 1518 | 484 | 4761 |
| 4 | 54 | 89 | 4806 | 2916 | 7921 |
| ∑ | 141 | 336 | 12259 | 5625 | 28724 |

∑xy = 12259
∑x = 141
∑y = 336
∑x² = 5625
∑y² = 28724
n = 4
Put all the values into Pearson's correlation coefficient formula:
R = [n(∑xy) - (∑x)(∑y)] / √([n∑x² - (∑x)²][n∑y² - (∑y)²])
R = [4(12259) - (141)(336)] / √([4(5625) - (141)²][4(28724) - (336)²])
R = 1660 / √([2619][2000])
R = 1660 / 2288.668
R = 0.7253
This shows a very strong positive relationship between the variables.
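Two facts make hand computations easy to sanity-check: the bracketed terms n∑x² - (∑x)² and n∑y² - (∑y)² can never be negative for real data (each equals n times a sum of squared deviations), and R must lie between -1 and +1. The sketch below builds both checks into a helper that works directly from the column totals; the name `pearson_from_sums` and the sample totals are hypothetical.

```python
from math import sqrt


def pearson_from_sums(n, sum_xy, sum_x, sum_y, sum_x2, sum_y2):
    """Compute r from column totals, rejecting totals that cannot be right."""
    x_term = n * sum_x2 - sum_x ** 2
    y_term = n * sum_y2 - sum_y ** 2

    # A negative term is impossible for real data, so a total was mis-added.
    if x_term < 0 or y_term < 0:
        raise ValueError("a bracketed term is negative; re-check the column totals")
    if x_term == 0 or y_term == 0:
        raise ValueError("r is undefined when one variable is constant")

    r = (n * sum_xy - sum_x * sum_y) / sqrt(x_term * y_term)
    # Small tolerance for floating-point rounding at the ends of the range.
    if abs(r) > 1 + 1e-9:
        raise ValueError("|r| exceeds 1; the totals are inconsistent")
    return r


# Hypothetical totals for x = 1, 2, 3, 4 and y = 2, 4, 6, 8.
print(pearson_from_sums(4, 60, 10, 20, 30, 120))

# The same totals with ∑x mis-added as 12 instead of 10.
try:
    pearson_from_sums(4, 60, 12, 20, 30, 120)
except ValueError as error:
    print("rejected:", error)
```

Raising an error instead of returning an out-of-range number keeps an arithmetic slip from being read as a genuine correlation.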