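How to Create a Pivot Table in R?
In this article, we will discuss how to create a pivot table in the R programming language.
The pivot table is one of the most powerful features of Microsoft Excel; it lets us extract meaningful summaries from large, detailed data sets. A pivot table typically shows summary statistics for a data set by grouping together the values in one or more columns. In R, we can do this with the group_by() and summarise() functions from the dplyr package. dplyr is a data manipulation package that provides a consistent set of verbs for preprocessing data. The group_by() function groups the data by one or more variables, and summarise() (also spelled summarize()) then builds a summary for each group using the aggregate functions passed to it.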
Syntax:
df %>% group_by(grouping_variables) %>% summarize(label = aggregate_fun())
Parameters:
- df: the data frame to summarize.
- grouping_variables: the variable or variables used to group the data.
- label: the name given to the summary column in the result.
- aggregate_fun(): the function used to compute the summary, for example sum(), mean(), etc.
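The syntax above also accepts more than one grouping variable. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the `.groups` argument); the `sales` data frame and its column names are hypothetical and are used only for illustration.

```r
library(dplyr)

# hypothetical data, invented for this illustration
sales <- data.frame(
  region  = c("East", "East", "West", "West", "East", "West"),
  product = c("A", "B", "A", "B", "A", "A"),
  amount  = c(10, 20, 30, 40, 50, 60)
)

# one row per (region, product) combination, with the total amount
pivot <- sales %>%
  group_by(region, product) %>%
  summarise(total_amount = sum(amount), .groups = "drop")

print(pivot)
```

Setting `.groups = "drop"` returns an ungrouped tibble, so later operations on the result are not silently applied per group.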
Example 1: Create a pivot table with the sum of values
R
# create sample data frame
sample_data <- data.frame(label = c('Geek1', 'Geek2', 'Geek3', 'Geek1',
                                    'Geek2', 'Geek3', 'Geek1', 'Geek2',
                                    'Geek3'),
                          value = c(222, 18, 51, 52, 44, 19, 100, 98, 34))
# load library dplyr
library(dplyr)
# create pivot table with sum of value as summary
sample_data %>%
  group_by(label) %>%
  summarize(sum_values = sum(value))
Output:
# A tibble: 3 x 2
label sum_values
1 Geek1 374
2 Geek2 160
3 Geek3 104
Example 2: Create a pivot table with the mean of values
R
# create sample data frame
sample_data <- data.frame(label = c('Geek1', 'Geek2', 'Geek3', 'Geek1',
                                    'Geek2', 'Geek3', 'Geek1', 'Geek2',
                                    'Geek3'),
                          value = c(222, 18, 51, 52, 44, 19, 100, 98, 34))
# load library dplyr
library(dplyr)
# create pivot table with mean of value as summary
sample_data %>%
  group_by(label) %>%
  summarize(average_values = mean(value))
Output:
# A tibble: 3 x 2
label average_values
1 Geek1 125.
2 Geek2 53.3
3 Geek3 34.7
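The two examples can also be combined, since summarise() accepts several named summaries in a single call. This is a small sketch using the same sample data; the `combined` variable and the `n_rows` column (computed with dplyr's n() helper) are additions for illustration and do not appear in the examples above.

```r
library(dplyr)

# same sample data as in the examples above
sample_data <- data.frame(label = rep(c('Geek1', 'Geek2', 'Geek3'), times = 3),
                          value = c(222, 18, 51, 52, 44, 19, 100, 98, 34))

# sum, mean, and row count per label in one pivot table
combined <- sample_data %>%
  group_by(label) %>%
  summarise(sum_values = sum(value),
            average_values = mean(value),
            n_rows = n())

print(combined)
```

Each named argument becomes one column of the result, so the table has one row per label and one column per summary.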