Visualizing a Correlation Matrix with the symnum() Function in R
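Correlation describes the relationship between two variables: it measures the degree to which two random variables are linearly related. The correlation coefficient takes values in the interval [-1, 1]. A value of -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. A value of 0 does not, however, imply that the variables are independent. A correlation matrix records the degree of linear relationship among a set of random variables, taking one pair at a time.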
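The caveat that a zero correlation does not imply independence can be checked with a small sketch. The vectors x and y are illustrative and not part of the original text: y is fully determined by x, yet the linear correlation vanishes because the relationship is symmetric about x = 0.

```r
# y depends on x exactly, but not linearly
x <- -5:5
y <- x^2

# Pearson correlation captures only the linear part of the relationship
r <- cor(x, y)
print(r)

# perfectly linear relationships, for comparison
r_pos <- cor(x, 2 * x + 1)   # increasing line
r_neg <- cor(x, -3 * x)      # decreasing line
print(c(r_pos, r_neg))
```

The first value is zero even though y can be computed from x without error, while the two straight-line cases sit at the ends of the [-1, 1] interval.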
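Properties of a correlation matrix
- All diagonal elements of a correlation matrix must be 1, because a variable is always perfectly correlated with itself, i.e. C_ii = 1
- It must be symmetric, i.e. C_ij = C_ji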
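Both properties can be verified on any correlation matrix. The sketch below assumes the built-in mtcars data set and an arbitrary choice of four numeric columns; neither appears in the original text.

```r
# correlation matrix of four numeric columns of the built-in mtcars data
m <- cor(mtcars[, c("mpg", "disp", "hp", "wt")])

# property 1: every diagonal element is 1
diag_ok <- isTRUE(all.equal(unname(diag(m)), rep(1, 4)))
print(diag_ok)

# property 2: the matrix equals its transpose
sym_ok <- isSymmetric(m)
print(sym_ok)
```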
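Implementation in R
R has a built-in function, symnum(), that makes it easy to visualize the degree of correlation between variables and to pick out highly correlated variables from the rest. Each correlation coefficient is replaced by a symbol according to its magnitude. In R, symnum() has the following syntax: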
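Syntax:
symnum(x, cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95), symbols = c(" ", ".", ",", "+", "*", "B"))
Parameter:
x = logical or numerical array
cutpoints = cutpoints for the coefficients; e.g., with the default symbols, coefficients between 0.3 and 0.6 are replaced by ".". Values equal to 1, such as the diagonal elements, are shown as 1.
symbols = a vector of symbols denoting the intervals. When cutpoints is left at its default (correlation mode), the endpoints 0 and 1 are added automatically, so there is one more symbol than there are cutpoints. When cutpoints is given explicitly (and corr = TRUE is not passed), it must include both endpoints, so there is one fewer symbol than there are cutpoints.
Note: In correlation mode the values in x must lie between -1 and 1, and symbols are assigned by absolute value. With explicit cutpoints, the values must lie between the smallest and the largest cutpoint.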
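A minimal sketch of the default mapping, using made-up coefficients rather than a real correlation matrix. With the default cutpoints, 0.1 falls below 0.3, 0.4 falls between 0.3 and 0.6, -0.85 is binned by its absolute value between 0.8 and 0.9, and 0.97 lies above 0.95.

```r
# illustrative coefficients, including a negative one
r <- c(0.1, 0.4, -0.85, 0.97)

# default cutpoints and symbols (correlation mode)
s <- symnum(r)
print(s)

# the same symbols as a plain character vector, without the legend
sym <- as.vector(unclass(s))
print(sym)
```

The printed legend lists the cutpoints and the symbol used for each interval, which makes the mapping easy to read back.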
Visualizing a one-dimensional numeric array:
R
# defining a single dimension array
arr <- c(6, 4, 3, 2, 5, 1, 8, 7)
# cut values are determined at an interval of 2
# symbols are specified by sym
symnum(arr, cut = c(0, 2, 4, 6, 8),
sym = c(".", "-", "+", "$"))
Output:
[1] + - - . + . $ $
attr(,"legend")
[1] 0 '.' 2 '-' 4 '+' 6 '$' 8
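Explanation: Each interval is closed on the right. All values in the range 0-2, including 2, are represented by ".", values above 2 up to and including 4 by "-", and so on up to the range 6-8, which is represented by "$". The output therefore represents the values of arr according to the cutpoints and symbols.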
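The binning described above can be reproduced with base R's cut() and right-closed intervals. The sketch below is only an analogy for checking which interval each value falls into; it makes no claim about how symnum() is implemented internally.

```r
arr <- c(6, 4, 3, 2, 5, 1, 8, 7)
breaks <- c(0, 2, 4, 6, 8)
syms <- c(".", "-", "+", "$")

# right-closed intervals (0,2], (2,4], (4,6], (6,8];
# include.lowest = TRUE additionally admits the value 0
bins <- cut(arr, breaks = breaks, labels = syms, include.lowest = TRUE)
binned <- as.character(bins)
print(binned)

# compare with the symbols chosen by symnum()
from_symnum <- as.vector(unclass(symnum(arr, cutpoints = breaks, symbols = syms)))
print(identical(binned, from_symnum))
```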
Visualizing a one-dimensional logical array:
The following code snippet shows symnum() applied to a logical array.
R
# the logical condition is a parameter
# to the symnum function
# the default values assigned are | symbol
# for true values and . for false values
symnum(1:7 %% 2 == 0)
Output:
[1] . | . | . | .
Explanation: Each element of 1:7 is tested with the condition %% 2 == 0, and the result is shown as a symbol: "|" for TRUE and "." for FALSE.
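Custom symbols can also be supplied for logical input. In this sketch the coefficients and the two symbols "-" and "X" are arbitrary choices; the first symbol is used for FALSE and the second for TRUE, mirroring the order of the defaults "." and "|".

```r
# flag which illustrative coefficients exceed 0.8 in absolute value
r <- c(0.15, -0.90, 0.86, -0.05)
strong <- abs(r) > 0.8

# exactly two symbols are needed for logical input: FALSE first, then TRUE
flags <- symnum(strong, symbols = c("-", "X"))
print(flags)

flag_chars <- as.vector(unclass(flags))
```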
Visualizing a correlation matrix in R
The cor() function makes it easy to create a correlation matrix.
Syntax: cor(x, use = )
Parameter:
x: numeric matrix or a data frame
use: specifies how missing values are handled
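The effect of the use argument shows up as soon as the data contain a missing value. The data frame below is invented for illustration; "complete.obs" is one of the documented values of use and drops every row that contains an NA before computing the correlations.

```r
# small illustrative data frame with one missing value in column b
df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c(2, 4, NA, 8, 11),
                 c = c(5, 3, 4, 1, 2))

# default (use = "everything"): correlations between b and the other columns are NA
r_default <- cor(df)
print(r_default)

# use = "complete.obs": rows containing an NA are dropped first
r_complete <- cor(df, use = "complete.obs")
print(r_complete)
```

A matrix that contains NA is awkward to summarize with symbols, so choosing a use setting before passing the result on is usually the simpler route.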
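cor() returns a matrix of correlation coefficients, which can then be passed to symnum() to highlight highly correlated pairs, using the symbols given in the function's symbols argument.
R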
# R program to illustrate
# a correlation matrix
# cor() computes the correlation matrix of a 10 x 3 matrix of random normal values
mat <- cor(matrix(rnorm(30), 10, 3))
# printing the correlation matrix mat
print("Correlation matrix")
print(mat)
# visualising the relation between various elements
print("Symbolic symnum representation")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))
Output:
[1] "Correlation matrix"
> print (mat)
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.1295918 0.1137502
[2,] 0.1295918 1.0000000 0.2967970
[3,] 0.1137502 0.2967970 1.0000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation")
[1] "Symbolic symnum representation"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))
[1,] 1
[2,] | 1
[3,] | | 1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
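Explanation: By default, only the lower triangle of the matrix is printed, and each value is replaced by the symbol for its degree of correlation. Because the input is random, the exact values differ from run to run.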
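Random data rarely shows strong correlations, so the same call can be tried on real data. This sketch uses four columns of the built-in mtcars data set, which is not part of the original example; abbr.colnames = FALSE keeps the full column names in the printout.

```r
# correlations between fuel economy, displacement, horsepower and weight
r <- cor(mtcars[, c("mpg", "disp", "hp", "wt")])
print(round(r, 2))

# default symbols; lower triangle only, diagonal shown as 1
s <- symnum(r, abbr.colnames = FALSE)
print(s)
```

Here several pairs land in the higher intervals, so the symbols make the strongly related variables stand out at a glance.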
Various arguments of symnum() can be changed. The following code snippet shows how these arguments are used:
R
# R program to illustrate
# a correlation matrix
# cor() computes the correlation matrix of a 5 x 5 matrix of random
# exponential values (25 values, so the data exactly fill the matrix)
mat <- cor(matrix(rexp(25, 1), 5, 5))
# printing the correlation matrix mat
print("Correlation matrix")
print(mat)
# visualising the relation between various
# elements without diagonal elements
print("Symbolic symnum representation with false diagonal")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),
diag = FALSE)
# setting lower = false
print("COmplete symnum matrix ")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),
lower = FALSE)
Output:
[1] "Correlation matrix"
> print (mat)
           [,1]        [,2]       [,3]       [,4]        [,5]
[1,]  1.0000000 -0.39983276 -0.5533282 -0.2420029  0.15030025
[2,] -0.3998328  1.00000000  0.2561824 -0.2090551 -0.05073241
[3,] -0.5533282  0.25618240  1.0000000 -0.6360808 -0.90394274
[4,] -0.2420029 -0.20905508 -0.6360808  1.0000000  0.86086867
[5,]  0.1503003 -0.05073241 -0.9039427  0.8608687  1.00000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation with false diagonal")
[1] "Symbolic symnum representation with false diagonal"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),diag=FALSE)
[1,]
[2,] .
[3,] . |
[4,] | | ,
[5,] | | * +
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
> print ("COmplete symnum matrix ")
[1] "COmplete symnum matrix "
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),lower=FALSE)
[1,] 1 . . | |
[2,] . 1 | | |
[3,] . | 1 , *
[4,] | | , 1 +
[5,] | | * + 1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
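Explanation: diag = FALSE suppresses the diagonal elements, i.e. the 1s that denote perfect correlation. lower = FALSE displays the complete matrix rather than only the lower triangle. Both are abbreviations, resolved by R's partial argument matching, of the full argument names diag.lower.tri and lower.triangular. Symbols are assigned by absolute value, so the negative coefficients in this matrix get the same symbols as positive ones of the same magnitude.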
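The two points in the explanation can be checked directly. The 3 x 3 matrix below is a hand-written illustration, not output from the example above; the sketch compares the abbreviated and full argument names and then confirms that the sign of a coefficient does not affect its symbol.

```r
# small illustrative correlation matrix with mixed signs
m <- matrix(c( 1.0, -0.4,  0.9,
              -0.4,  1.0,  0.2,
               0.9,  0.2,  1.0), nrow = 3, byrow = TRUE)

# abbreviated and full argument names give the same result
same_diag  <- identical(symnum(m, diag = FALSE),
                        symnum(m, diag.lower.tri = FALSE))
same_lower <- identical(symnum(m, lower = FALSE),
                        symnum(m, lower.triangular = FALSE))
print(c(same_diag, same_lower))

# the sign is discarded: m and abs(m) receive the same symbols
same_sign <- identical(symnum(m, lower.triangular = FALSE),
                       symnum(abs(m), lower.triangular = FALSE))
print(same_sign)
```

Spelling out the full argument names is the safer habit in scripts, since abbreviations rely on partial matching staying unambiguous.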