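Converting a Matrix or Data Frame to a Sparse Matrix in R
Sparse matrices consist mostly of empty (zero) entries and are stored in a compressed, column-oriented format. Within each column, the non-zero elements are kept sorted in increasing row order. In this article, we will convert matrices and data frames to sparse matrices using the R programming language.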
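To make the column-oriented layout concrete, the following sketch builds a small sparse matrix and inspects how it is stored. The matrix values and the variable names m and s are made up for illustration; the slots i, p and x are those of the Matrix package's column-compressed class.

```r
library(Matrix)

# A small 3 x 4 matrix that is mostly zeros (values are illustrative only).
# matrix() fills column by column.
m <- matrix(c(0, 0, 5,
              0, 0, 0,
              7, 0, 0,
              0, 2, 0), nrow = 3)

# Store it in sparse form
s <- Matrix(m, sparse = TRUE)
s

# Only the non-zero entries are stored, column by column:
s@x   # the non-zero values
s@i   # their zero-based row indices, increasing within each column
s@p   # zero-based pointers to where each column starts in x and i
```

The empty second column shows up as two equal consecutive entries in p, which is how a column with no stored values is represented.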
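Converting a Matrix to a Sparse Matrix
A matrix in R is a collection of elements of the same type arranged in a two-dimensional layout. We can construct one with the matrix() function.
The first step is to install the Matrix package with install.packages("Matrix") and then load it with library(). Next, we build an ordinary matrix with base R's matrix() function. Once the matrix has been created, as() produces the equivalent sparse matrix.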
Syntax :
sparsematrix <- as(BaseMatrix, "sparseMatrix")
Parameters :
- sparsematrix : The resulting sparse matrix, converted from the base matrix.
- BaseMatrix : The ordinary (dense) R matrix to convert.
- "sparseMatrix" : The target class passed to as(), which converts the base R matrix to sparse format.
Example: Converting a matrix to a sparse matrix in R
R
# loading the Matrix package
library(Matrix)
# Constructing a base R matrix
set.seed(0)
nrows <- 6L
ncols <- 8L
# prob holds sampling weights; sample() rescales them, so they need not sum to 1
values <- sample(x = c(0,1,2,3), prob = c(0.6,0.2,0.4,0.8),
                 size = nrows*ncols, replace = TRUE)
BaseMatrix <- matrix(values, nrow = nrows)
BaseMatrix
# For converting base matrix to sparse matrix
sparsematrix <- as(BaseMatrix, "sparseMatrix")
sparsematrix
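As a rough check on what the conversion buys, the sketch below converts a larger, mostly-zero matrix and compares the two representations. The 100 x 100 size, the 5% density and the names dense and sparse are assumptions chosen for illustration, not part of the example above.

```r
library(Matrix)

set.seed(0)
# A 100 x 100 matrix in which roughly 5% of the entries are non-zero
# (size and density are illustrative assumptions)
dense <- matrix(sample(c(0, 1), size = 100 * 100, replace = TRUE,
                       prob = c(0.95, 0.05)),
                nrow = 100)

sparse <- as(dense, "sparseMatrix")

class(sparse)          # the concrete sparse class chosen by as()
nnzero(sparse)         # number of non-zero entries actually stored
object.size(dense)     # memory used by the dense matrix
object.size(sparse)    # memory used by the sparse matrix

# Converting back with as.matrix() recovers the original values
all(as.matrix(sparse) == dense)
```

The saving depends on how sparse the data really is; a matrix with many non-zero entries can take more memory in sparse form than in dense form.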
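Converting a Data Frame to a Sparse Matrix
A data frame is a table-like, two-dimensional structure with rows and columns, and it is the most common way to store data in R. We will use the sparseMatrix() function from the Matrix package to convert a data frame to a sparse matrix.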
Syntax: sparseMatrix(i = ep, j = ep, p, x, dims, dimnames, symmetric = FALSE, triangular = FALSE, index1 = TRUE, repr = "C", giveCsparse = (repr == "C"), check = TRUE, use.last.ij = FALSE)
Parameters :
- i, j : Integer vectors of equal length giving the row and column indices of the non-zero entries.
- p : An integer vector of pointers, one for each column (or row), to the initial zero-based index of that column's (or row's) elements. Exactly one of i, j and p must be missing.
- x : Optional values for the matrix entries. If omitted, the result is a pattern matrix that records only where the non-zero entries are.
- dims : An optional non-negative integer vector of length 2 giving the dimensions of the matrix.
- dimnames : An optional list of row and column names.
- symmetric : Logical. If TRUE, the resulting matrix is symmetric.
- triangular : Logical. If TRUE, the resulting matrix is triangular.
- index1 : Logical scalar. If TRUE (the default), row and column counting starts at 1. If FALSE, it starts at 0.
- repr : A character string specifying the sparse representation used for the result: "C" (column-compressed), "R" (row-compressed) or "T" (triplet).
- giveCsparse : Logical indicating whether the result should be a CsparseMatrix or a TsparseMatrix. Recent versions of the Matrix package deprecate it in favor of repr.
- check : Logical indicating whether a validity check is performed.
- use.last.ij : Logical. If TRUE, only the last of any duplicated (i, j) pairs is used. If FALSE (the default), the values of duplicated pairs are added together.
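A few of these parameters are easier to follow on a tiny matrix. The sketch below, with made-up indices and values and illustrative variable names, builds the same 3 x 3 matrix three ways (1-based indices, 0-based indices, and column pointers) and then shows how duplicated (i, j) pairs are treated.

```r
library(Matrix)

# 1-based (i, j, x) triplets: entries at (1,1), (3,1) and (2,3)
a <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 3),
                  x = c(10, 20, 30), dims = c(3, 3))

# The same entries given as 0-based indices
b <- sparseMatrix(i = c(0, 2, 1), j = c(0, 0, 2),
                  x = c(10, 20, 30), dims = c(3, 3), index1 = FALSE)

# The same entries given as row indices plus column pointers p:
# column 1 holds entries 0-1, column 2 is empty, column 3 holds entry 2
cp <- sparseMatrix(i = c(1, 3, 2), p = c(0, 2, 2, 3),
                   x = c(10, 20, 30), dims = c(3, 3))

a

# Duplicated (i, j) pairs: summed by default ...
d_sum <- sparseMatrix(i = c(1, 1), j = c(1, 1), x = c(2, 3))
d_sum[1, 1]

# ... or only the last value is kept when use.last.ij = TRUE
d_last <- sparseMatrix(i = c(1, 1), j = c(1, 1), x = c(2, 3),
                       use.last.ij = TRUE)
d_last[1, 1]
```

Note that p is always zero-based, while i and j follow the index1 setting.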
Example: Converting a data frame to a sparse matrix in R
R
library(Matrix)
# Creating a table of buyers
buyer <- data.frame(Buyers = c("Robert", "Stewart", "Kristen",
                               "Joe", "Kriti", "Rafel"))
buyer
# Creating a table of cars
car <- data.frame(Cars = c("Maruti", "Sedan", "SUV", "Baleno",
                           "Hyundai", "BMW", "Audi"))
car
# Creating a table of orders: one (buyer, car) pair
# per purchase, so repeated pairs are repeated purchases
order <- data.frame(Buyers = c("Robert", "Robert", "Stewart",
                               "Stewart", "Kristen", "Kristen",
                               "Joe", "Kriti", "Joe"),
                    Cars = c("Maruti", "Maruti", "BMW", "BMW",
                             "Audi", "Audi", "Maruti", "Audi",
                             "Sedan"))
# Insert the RowIndex column, identifying
# the row index to assign each buyer
order$RowIndex <- match(order$Buyers, buyer$Buyers)
# Insert the ColIndex column, identifying
# the column index to assign each car
order$ColIndex <- match(order$Cars, car$Cars)
# Now inspect
order
# Creating a basic sparse pattern matrix where element
# (i,j) is TRUE if buyer i bought car j and FALSE otherwise.
# Its dimensions are inferred from the largest indices.
msparse1 <- sparseMatrix(i = order$RowIndex, j = order$ColIndex)
msparse1
# Creating another sparse matrix to make sure
# every buyer and every car appears in our matrix
# by setting the dimensions explicitly
msparse2 <- sparseMatrix(i = order$RowIndex, j = order$ColIndex,
                         dims = c(nrow(buyer), nrow(car)),
                         dimnames = list(buyer$Buyers, car$Cars))
msparse2
# Creating another sparse matrix indicating the number
# of times buyer i bought car j: x = 1L is recycled for
# every pair, and duplicated (i,j) pairs are summed
msparse3 <- sparseMatrix(i = order$RowIndex, j = order$ColIndex, x = 1L,
                         dims = c(nrow(buyer), nrow(car)),
                         dimnames = list(buyer$Buyers, car$Cars))
msparse3
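To confirm that the counting matrix holds what we expect, the following self-contained sketch rebuilds the same orders and checks a few entries. The names buyers, cars, orders, counts and triplets are chosen here for illustration; summary() on a sparse matrix is used to read the stored entries back as a table.

```r
library(Matrix)

buyers <- c("Robert", "Stewart", "Kristen", "Joe", "Kriti", "Rafel")
cars   <- c("Maruti", "Sedan", "SUV", "Baleno", "Hyundai", "BMW", "Audi")

# One row per purchase, as in the example above
orders <- data.frame(
  Buyers = c("Robert", "Robert", "Stewart", "Stewart", "Kristen",
             "Kristen", "Joe", "Kriti", "Joe"),
  Cars   = c("Maruti", "Maruti", "BMW", "BMW", "Audi",
             "Audi", "Maruti", "Audi", "Sedan"))

# Each purchase contributes 1; repeated (buyer, car) pairs are summed
counts <- sparseMatrix(i = match(orders$Buyers, buyers),
                       j = match(orders$Cars, cars),
                       x = rep(1, nrow(orders)),
                       dims = c(length(buyers), length(cars)),
                       dimnames = list(buyers, cars))

counts["Robert", "Maruti"]   # Robert bought a Maruti twice
sum(counts["Rafel", ])       # Rafel placed no orders
sum(counts)                  # total number of purchases

# Read the stored entries back as (i, j, x) triplets
triplets <- summary(counts)
triplets
```

Setting dims and dimnames explicitly is what keeps buyers and cars without any orders, such as Rafel or the SUV, in the matrix as all-zero rows and columns.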