How to Draw a Scree Plot in R Using ggplot2

Last modified: 2022-05-13 01:55:34.330000    Author: Mango


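In this article, we will see how to draw a scree plot in the R programming language using ggplot2. A scree plot shows the proportion of variance explained by each principal component of a PCA, ordered from the first component to the last.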

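Loading the dataset:

Here we load the dataset and remove its non-numeric columns. The iris dataset contains a Species column, which is a factor, so we need to drop it because PCA works only on numeric data.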

R
# drop the Species column because it is a factor, not numeric
num_iris <- subset(iris, select = -Species)
head(num_iris)



Output: the first six rows of num_iris, which now holds only the four numeric columns.
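When the non-numeric columns are not known in advance, they can be filtered by type instead of by name. The following is a minimal sketch of that idea in base R. It assumes that every numeric column should enter the PCA, which may not hold for other datasets (for example, ones with numeric ID columns).

```r
# Keep only the numeric columns, whatever they are called
num_iris <- iris[sapply(iris, is.numeric)]

# Species is a factor, so it is no longer present
names(num_iris)

# Check for missing values before running the PCA
anyNA(num_iris)
```

For iris this selects the same four columns as dropping Species by name.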



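Computing principal component analysis with the prcomp() function

We use R's built-in prcomp() function, which takes the dataset as an argument and computes the PCA. Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables, the principal components. Passing scale = TRUE (matched to prcomp()'s scale. argument) standardizes each variable to unit variance before the analysis.

Code:

R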

# drop the Species column because it is a factor, not numeric
num_iris <- subset(iris, select = -Species)

# compute PCA on the standardized variables
pca <- prcomp(num_iris, scale = TRUE)
pca

Output: the standard deviations of the four principal components and their rotation (loadings) matrix.
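The properties described above (an orthogonal transformation, uncorrelated components, and standardized input) can be checked numerically. This is a small verification sketch rather than part of the original walkthrough; the variable name scores is chosen here for illustration.

```r
num_iris <- subset(iris, select = -Species)
pca <- prcomp(num_iris, scale = TRUE)

# Coordinates of the observations on the principal components
scores <- pca$x

# Correlations between different components are numerically zero
round(cor(scores), 10)

# The rotation matrix is orthogonal: t(R) %*% R is the identity matrix
round(crossprod(pca$rotation), 10)

# With standardized variables, the component variances equal the
# eigenvalues of the correlation matrix of the original data
all.equal(pca$sdev^2, eigen(cor(num_iris))$values)
```

The last check shows why scaling matters: without it, the PCA would be based on the covariance matrix, and variables measured on larger scales would dominate the first components.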

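Computing the variance explained by each principal component:

We use the formula below to compute the proportion of the total variance explained by each PC: the variance of a component (its squared standard deviation, pca$sdev^2) divided by the sum of the variances of all components.

Code:

R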

# drop the Species column because it is a factor, not numeric
num_iris <- subset(iris, select = -Species)

# compute PCA on the standardized variables
pca <- prcomp(num_iris, scale = TRUE)

# proportion of the total variance explained by each component
variance <- pca$sdev^2 / sum(pca$sdev^2)
variance

Output:

[1] 0.729624454 0.228507618 0.036689219 0.005178709
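Adding these proportions up shows how much variance the first few components capture together; from the values above, the first two components account for roughly 95.8% of the total. The sketch below computes that running total and cross-checks the proportions against summary(), which reports the same quantities in rounded form.

```r
num_iris <- subset(iris, select = -Species)
pca <- prcomp(num_iris, scale = TRUE)
variance <- pca$sdev^2 / sum(pca$sdev^2)

# The proportions sum to 1 by construction
sum(variance)

# Running total: variance explained by the first k components together
cumsum(variance)

# summary() reports the same proportions (rounded) in its importance matrix
summary(pca)$importance["Proportion of Variance", ]
```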

Example 1: Scree plot using a line plot

R

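library(ggplot2)

# drop the Species column because it is a factor, not numeric
num_iris <- subset(iris, select = -Species)

# compute PCA on the standardized variables
pca <- prcomp(num_iris, scale = TRUE)

# proportion of the total variance explained by each component
variance <- pca$sdev^2 / sum(pca$sdev^2)

# scree plot: one point per component, connected by a line
scree <- data.frame(component = seq_along(variance), variance = variance)
ggplot(scree, aes(x = component, y = variance)) +
  geom_line() +
  geom_point(size = 4) +
  xlab("Principal Component") +
  ylab("Variance Explained") +
  ggtitle("Scree Plot") +
  ylim(0, 1)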

Output: a line chart of the proportion of variance explained by each of the four principal components.
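Exact values can be hard to read off a scree plot, so one possible refinement is to print the percentage next to each point. The sketch below builds on the line plot of Example 1; the data frame name scree, the label format, and the vjust offset are illustrative choices, not part of the original example.

```r
library(ggplot2)

num_iris <- subset(iris, select = -Species)
pca <- prcomp(num_iris, scale = TRUE)

# One row per principal component
scree <- data.frame(
  component = seq_along(pca$sdev),
  variance  = pca$sdev^2 / sum(pca$sdev^2)
)

p <- ggplot(scree, aes(x = component, y = variance)) +
  geom_line() +
  geom_point(size = 4) +
  # Print the percentage of variance above each point
  geom_text(aes(label = sprintf("%.1f%%", 100 * variance)), vjust = -1.2) +
  xlab("Principal Component") +
  ylab("Variance Explained") +
  ggtitle("Scree Plot") +
  ylim(0, 1)

p
```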

Example 2: Scree plot using a bar plot

R

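library(ggplot2)

# drop the Species column because it is a factor, not numeric
num_iris <- subset(iris, select = -Species)

# compute PCA on the standardized variables
pca <- prcomp(num_iris, scale = TRUE)

# proportion of the total variance explained by each component
variance <- pca$sdev^2 / sum(pca$sdev^2)

# scree plot: one bar per component
scree <- data.frame(component = seq_along(variance), variance = variance)
ggplot(scree, aes(x = component, y = variance)) +
  geom_col() +
  xlab("Principal Component") +
  ylab("Variance Explained") +
  ggtitle("Scree Plot") +
  ylim(0, 1)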

Output: a bar chart of the proportion of variance explained by each of the four principal components.
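Both examples repeat the same steps and assume exactly four components. As a rough sketch, the steps could be wrapped in a small function that works for any prcomp() result. Note that scree_plot() is a hypothetical helper written for this article, not a function provided by ggplot2 or base R.

```r
library(ggplot2)

# Hypothetical helper: scree plot for any object returned by prcomp()
scree_plot <- function(pca, type = c("line", "bar")) {
  type <- match.arg(type)

  # One row per component; the count is taken from the PCA itself
  scree <- data.frame(
    component = seq_along(pca$sdev),
    variance  = pca$sdev^2 / sum(pca$sdev^2)
  )

  p <- ggplot(scree, aes(x = component, y = variance)) +
    scale_x_continuous(breaks = scree$component) +
    labs(x = "Principal Component", y = "Variance Explained",
         title = "Scree Plot") +
    ylim(0, 1)

  if (type == "line") {
    p + geom_line() + geom_point(size = 4)
  } else {
    p + geom_col()
  }
}

# Bar version for iris, line version for the all-numeric mtcars dataset
scree_plot(prcomp(subset(iris, select = -Species), scale = TRUE), type = "bar")
scree_plot(prcomp(mtcars, scale = TRUE))
```

Using seq_along(pca$sdev) instead of a fixed 1:4 lets the same code handle mtcars, which yields eleven components.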