📜  Summarize Multiple Columns of data.table by Group in R

📅  Last modified: 2022-05-13 01:54:18.522000             🧑  Author: Mango

In this article, we will discuss how to summarize multiple columns of a data.table by group in the R programming language.

Create the table used for demonstration:

R
# load data.table package
library("data.table")
  
# create a data.table with 3 columns:
# items, weight and cost
data <- data.table( items= c("chocos","milk","drinks","drinks",
                             "milk","milk","chocos","milk",
                             "honey","honey"),    
                     
                   weight= c(10,20,34,23,12,45,23,
                             12,34,34),
                     
                   cost=  c(120,345,567,324,112,345,
                            678,100,45,67))
  
# display
data


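Output: printing data shows the demonstration table with 10 rows and the three columns items, weight and cost.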

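We can summarize multiple columns in four ways:

  • by finding the mean
  • by finding the sum
  • by finding the minimum
  • by finding the maximum

We can do this by applying the lapply() function to .SD, the subset of columns for each group, and naming the grouping column in by: data[, lapply(.SD, fun), by = group].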
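As a minimal sketch of this pattern: .SD holds, for each group, every column other than the grouping column, and lapply() applies the chosen function to each of them. The optional .SDcols argument restricts which columns are summarized, which helps when the table also contains non-numeric columns. The table dt and its values below are made up for illustration and are not the article's data.

```r
library(data.table)

# hypothetical toy table used only for this sketch
dt <- data.table(g = c("a", "a", "b"),
                 x = c(1, 2, 3),
                 y = c(10, 20, 30),
                 z = c("p", "q", "r"))

# summarize only the numeric columns x and y within each group of g;
# z is left out because sum() is not defined for character values
res <- dt[, lapply(.SD, sum), by = g, .SDcols = c("x", "y")]
print(res)
```

Without .SDcols, the call would try to sum the character column z as well and stop with an error.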



Example 1: R program to summarize a data.table by sum and mean

R

# load data.table package
library("data.table")
  
# create a data.table with 3 columns:
# items, weight and cost
data <- data.table( items= c("chocos","milk","drinks","drinks",
                             "milk","milk","chocos","milk",
                             "honey","honey"),    
                     
                   weight= c(10,20,34,23,12,45,23,
                             12,34,34),
                     
                   cost=  c(120,345,567,324,112,345,
                            678,100,45,67))
  
  
# sum of weight and cost for each group in the items column
print(data[, lapply(.SD, sum), by = items])
  
# mean of weight and cost for each group in the items column
print(data[, lapply(.SD, mean), by = items])

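Output: each call prints one row per item. The sums (weight, cost) are chocos (33, 798), milk (89, 902), drinks (57, 891) and honey (68, 112); the means are chocos (16.5, 399), milk (22.25, 225.5), drinks (28.5, 445.5) and honey (34, 56).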
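lapply(.SD, fun) applies the same function to every column. When different columns need different summaries, one possible approach is to list the aggregates explicitly inside .(), which is data.table's alias for list(). The sketch below reuses the demonstration table; the output column names total_weight and mean_cost are illustrative choices, not part of the original example.

```r
library(data.table)

# same demonstration table as in the article
data <- data.table(items = c("chocos", "milk", "drinks", "drinks",
                             "milk", "milk", "chocos", "milk",
                             "honey", "honey"),
                   weight = c(10, 20, 34, 23, 12, 45, 23, 12, 34, 34),
                   cost = c(120, 345, 567, 324, 112, 345, 678, 100, 45, 67))

# a different summary per column, with explicit (illustrative) output names
summary_dt <- data[, .(total_weight = sum(weight),
                       mean_cost = mean(cost)),
                   by = items]
print(summary_dt)
```

The groups appear in the order in which each item first occurs in the table: chocos, milk, drinks, honey.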

Example 2: R program to summarize a data.table by minimum and maximum

R

# load data.table package
library("data.table")
  
# create a data.table with 3 columns:
# items, weight and cost
data <- data.table( items= c("chocos","milk","drinks","drinks",
                             "milk","milk","chocos","milk",
                             "honey","honey"),    
                     
                   weight= c(10,20,34,23,12,45,23,
                             12,34,34),
                     
                   cost=  c(120,345,567,324,112,345,
                            678,100,45,67))
  
# minimum of weight and cost for each group in the items column
print(data[, lapply(.SD, min), by = items])
  
# maximum of weight and cost for each group in the items column
print(data[, lapply(.SD, max), by = items])

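Output: each call prints one row per item. The minimums (weight, cost) are chocos (10, 120), milk (12, 100), drinks (23, 324) and honey (34, 45); the maximums are chocos (23, 678), milk (45, 345), drinks (34, 567) and honey (34, 67).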
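Because lapply() accepts any function, the same pattern also works with a user-defined summary. As a rough sketch, the helper spread() below (a hypothetical name introduced only for this illustration) combines the two statistics from this example into the difference between the maximum and the minimum of each column within each group.

```r
library(data.table)

# same demonstration table as in the article
data <- data.table(items = c("chocos", "milk", "drinks", "drinks",
                             "milk", "milk", "chocos", "milk",
                             "honey", "honey"),
                   weight = c(10, 20, 34, 23, 12, 45, 23, 12, 34, 34),
                   cost = c(120, 345, 567, 324, 112, 345, 678, 100, 45, 67))

# hypothetical helper: difference between the largest and smallest value
spread <- function(x) max(x) - min(x)

# apply the helper to weight and cost within each group of items
spread_dt <- data[, lapply(.SD, spread), by = items]
print(spread_dt)
```

For honey, both rows have a weight of 34, so the weight spread for that group is 0.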