📜  Visualizing a Correlation Matrix with the symnum() Function in R Programming

📅  Last modified: 2022-05-13 01:55:00.431000             🧑  Author: Mango

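Correlation describes the relationship between two variables: it measures the degree to which two random variables are linearly related. The correlation coefficient takes values in the interval [-1, 1]. A value of -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. A value of 0 does not, however, imply that the variables are independent. A correlation matrix holds the degree of linear relationship among a set of random variables, taken one pair at a time.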
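The point that zero correlation does not imply independence can be checked with a small made-up example. In the sketch below, the vectors x and y are illustrative values that are not part of the original article: y is completely determined by x, yet the linear correlation is essentially zero.

```r
# y is a deterministic function of x, so the two are clearly dependent
x <- -3:3
y <- x^2

# Pearson correlation only measures the linear part of the relationship;
# the symmetric parabola has no linear trend, so r is essentially zero
r <- cor(x, y)
print(r)
```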

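Properties of a correlation matrix

  1. All diagonal elements of a correlation matrix must be 1, because a variable is always perfectly correlated with itself, i.e. C_ii = 1
  2. It must be symmetric, i.e. C_ij = C_ji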
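Both properties can be verified directly on a matrix returned by cor(). This is a minimal sketch: the seed and the matrix dimensions are arbitrary choices made only so that the example is reproducible.

```r
# arbitrary seed and dimensions, chosen only for reproducibility
set.seed(42)
x <- matrix(rnorm(30), nrow = 10, ncol = 3)
C <- cor(x)

# property 1: every diagonal element equals 1 (up to floating-point tolerance)
diag_ok <- isTRUE(all.equal(unname(diag(C)), rep(1, ncol(C))))

# property 2: the matrix equals its own transpose
sym_ok <- isSymmetric(C)

print(c(diagonal_is_one = diag_ok, symmetric = sym_ok))
```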

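Implementation in R

R has a built-in function, symnum(), that makes it easy to visualize the degree of correlation between variables and to pick out highly correlated variables from the rest. Each correlation coefficient is replaced by a symbol according to its magnitude. The examples below call symnum() on a numeric vector, a logical vector, and a correlation matrix.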
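Rather than reproducing the signature from memory, the arguments that symnum() accepts can be queried from R itself. The sketch below uses formals() to list them and to read the default cutpoints; the exact argument list printed may vary with the installed R version.

```r
# list the argument names of symnum() in the installed R version
arg_names <- names(formals(symnum))
print(arg_names)

# the default cutpoints used when the input is treated as correlations
default_cuts <- eval(formals(symnum)$cutpoints)
print(default_cuts)
```

The default cutpoints correspond to the boundaries that appear in the legend printed under the correlation-matrix examples in this article.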

Visualizing a one-dimensional numeric array:

R
# defining a one-dimensional numeric array
arr <- c(6, 4, 3, 2, 5, 1, 8, 7)
  
# cutpoints are placed at intervals of 2;
# the symbol for each interval is given by symbols
symnum(arr, cutpoints = c(0, 2, 4, 6, 8), 
       symbols = c(".", "-", "+", "$"))


Output:

[1] + - - . + . $ $
attr(,"legend")
[1] 0 '.' 2 '-' 4 '+' 6 '$' 8

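Explanation: Each interval is closed on the right, so all values from 0 up to and including 2 are shown as ".", and values above 6 up to and including 8 are shown as "$". The output is therefore the representation of the values of arr determined by the cutpoints and the symbols.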
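The interval rule described above matches what base R's cut() does with right-closed intervals. The sketch below is an illustration of that correspondence for this particular input, not a statement about how symnum() is implemented; the variable names are chosen for the example.

```r
arr <- c(6, 4, 3, 2, 5, 1, 8, 7)
breaks <- c(0, 2, 4, 6, 8)
syms <- c(".", "-", "+", "$")

# symnum() result, stripped of its class and legend attribute
via_symnum <- as.vector(unclass(symnum(arr, cutpoints = breaks, symbols = syms)))

# the same binning with cut(): intervals (0,2], (2,4], (4,6], (6,8];
# include.lowest = TRUE means a value of exactly 0 would land in the first bin
via_cut <- as.character(cut(arr, breaks = breaks, labels = syms,
                            include.lowest = TRUE))

print(identical(via_symnum, via_cut))
```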



Visualizing a one-dimensional logical array
The following code snippet applies symnum() to a logical array.

R

# the logical condition is a parameter 
# to the symnum function 
# the default values assigned are | symbol 
# for true values and . for false values 
symnum(1:7 %% 2 == 0)

Output:

[1] . | . | . | .

Explanation: Each element of the array is tested against the condition %% 2 == 0, and the result is shown as a symbol: "|" for TRUE and "." for FALSE.
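The two symbols used for logical input can also be replaced. The sketch below assumes the symbols are supplied in the order (FALSE, TRUE), which matches the default pair; the replacement symbols "o" and "E" are arbitrary choices for illustration.

```r
is_even <- 1:7 %% 2 == 0

# default symbols for logical input: "." for FALSE, "|" for TRUE
default_sym <- as.vector(unclass(symnum(is_even)))

# two custom symbols, given in the order (FALSE, TRUE)
custom_sym <- as.vector(unclass(symnum(is_even, symbols = c("o", "E"))))

print(default_sym)
print(custom_sym)
```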

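Visualizing a correlation matrix in R

The cor() function makes it easy to create a correlation matrix.

It returns a matrix of correlation coefficients, which can then be passed to symnum() to highlight the highly correlated pairs; each coefficient is replaced by one of the symbols given in the function's symbols argument.

R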

# R program to illustrate
# Correlation Matrix
  
# a correlation matrix is computed by the cor() function
# from 10 observations of 3 random variables
mat <- cor(matrix(rnorm(30), 10, 3))
  
# printing the correlation matrix mat
print("Correlation matrix")
print(mat)
  
# visualising the relation between various elements 
print("Symbolic symnum representation")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))

Output:

[1] "Correlation matrix"
> print (mat)
         [,1]      [,2]      [,3]
[1,] 1.0000000 0.1295918 0.1137502
[2,] 0.1295918 1.0000000 0.2967970
[3,] 0.1137502 0.2967970 1.0000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation")
[1] "Symbolic symnum representation"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))
           
[1,] 1      
[2,] |  1  
[3,] |  |  1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1

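Explanation: By default only the lower triangle of the matrix is printed, with each coefficient replaced by the symbol that describes the strength of that relationship.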
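The symbols describe the magnitude of a correlation, not its direction: a strong negative coefficient gets the same symbol as an equally strong positive one. The sketch below uses a small hand-written matrix (made-up values, not taken from the article) with the default symbols so the result does not depend on random data.

```r
# hand-written symmetric matrix with unit diagonal (illustrative values only)
m <- matrix(c( 1.00, -0.92, 0.45,
              -0.92,  1.00, 0.10,
               0.45,  0.10, 1.00), nrow = 3, byrow = TRUE)

# print the full matrix rather than only the lower triangle
s <- symnum(m, lower.triangular = FALSE)
print(s)

cells <- unclass(s)
strong_negative <- cells[2, 1]    # |-0.92| lies between 0.9 and 0.95
moderate_positive <- cells[3, 1]  # 0.45 lies between 0.3 and 0.6
```

To recover the sign, the original matrix returned by cor() still has to be consulted alongside the symbolic view.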

Several parameters of symnum() can be changed. The following code snippet demonstrates their use:

R

# R program to illustrate
# Correlation Matrix
  
# a correlation matrix is computed by the cor() function;
# 25 values are drawn to fill the 5 x 5 data matrix exactly
mat <- cor(matrix(rexp(25, 1), 5, 5))
  
# printing the correlation matrix mat
print("Correlation matrix")
print(mat)
  
# visualising the relation between various 
# elements without diagonal elements
print("Symbolic symnum representation with false diagonal")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"), 
       diag = FALSE)
  
# setting lower = FALSE 
print("Complete symnum matrix ")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),
       lower = FALSE)

Output:

[1] "Correlation matrix"
> print (mat)
          [,1]        [,2]       [,3]       [,4]        [,5]
[1,]  1.0000000 -0.39983276 -0.5533282 -0.2420029  0.15030025
[2,] -0.3998328  1.00000000  0.2561824 -0.2090551 -0.05073241
[3,] -0.5533282  0.25618240  1.0000000 -0.6360808 -0.90394274
[4,] -0.2420029 -0.20905508 -0.6360808  1.0000000  0.86086867
[5,]  0.1503003 -0.05073241 -0.9039427  0.8608687  1.00000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation with false diagonal")
[1] "Symbolic symnum representation with false diagonal"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),diag=FALSE)
             
[1,]          
[2,] .        
[3,] .  |      
[4,] |  |  ,  
[5,] |  |  * +
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
> print ("Complete symnum matrix ")
[1] "Complete symnum matrix "
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),lower=FALSE)
                 
[1,] 1  .  .  |  |
[2,] .  1  |  |  |
[3,] .  |  1  ,  *
[4,] |  |  ,  1  +
[5,] |  |  *  +  1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1

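Explanation: diag = FALSE suppresses the diagonal elements, i.e. the 1s that indicate perfect correlation. lower = FALSE shows the complete matrix rather than only the lower triangle. Both names are abbreviations that R matches to the full argument names diag.lower.tri and lower.triangular.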
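The abbreviated names work because R partially matches argument names. A minimal sketch, using a hand-written matrix with made-up values, shows that the abbreviated and the full spellings give identical results.

```r
# hand-written symmetric matrix with unit diagonal (illustrative values only)
m <- matrix(c( 1.00, -0.92, 0.45,
              -0.92,  1.00, 0.10,
               0.45,  0.10, 1.00), nrow = 3, byrow = TRUE)

# abbreviated argument names, as used in the snippet above
no_diag_short <- symnum(m, diag = FALSE)
whole_short <- symnum(m, lower = FALSE)

# the same calls with the full argument names spelled out
no_diag_full <- symnum(m, diag.lower.tri = FALSE)
whole_full <- symnum(m, lower.triangular = FALSE)

print(identical(no_diag_short, no_diag_full))
print(identical(whole_short, whole_full))
```

Spelling the names out in full is the more robust choice in scripts, since an abbreviation stops working if it ever becomes ambiguous.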