Data Structures in R Programming
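A data structure is a particular way of organizing data in a computer so that it can be used efficiently. The goal is to reduce the space and time complexity of different tasks. In R, data structures are the tools used to hold multiple values.
R's basic data structures are usually organized by their dimensionality (1D, 2D, or nD) and by whether they are homogeneous (all elements must be of the same type) or heterogeneous (elements may be of different types). This gives rise to the six data structures most often used in data analysis.
The most basic data structures used in R are: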
- Vectors
- Lists
- Data frames
- Matrices
- Arrays
- Factors
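As a rough illustration of the two axes mentioned above (dimensionality and homogeneity), the sketch below builds one small object of each kind and inspects it with is.atomic() and dim(). The variable names and sample values are made up for this illustration only.

```r
# Hypothetical sample objects, one per data structure
v <- c(1, 2, 3)
l <- list(1, "a", TRUE)
m <- matrix(1:4, nrow = 2)
a <- array(1:8, dim = c(2, 2, 2))
d <- data.frame(x = 1:2, y = c("a", "b"))
f <- factor(c("yes", "no", "yes"))

# is.atomic() is TRUE for structures whose elements all share one type
homogeneous <- c(
  vector = is.atomic(v), list = is.atomic(l), matrix = is.atomic(m),
  array = is.atomic(a), data_frame = is.atomic(d), factor = is.atomic(f)
)
print(homogeneous)

# Number of dimensions recorded in the dim attribute.
# Plain vectors, lists, and factors carry no dim attribute, so they
# report 0 here even though they are conceptually one-dimensional.
n_dims <- c(
  vector = length(dim(v)), list = length(dim(l)), matrix = length(dim(m)),
  array = length(dim(a)), data_frame = length(dim(d)), factor = length(dim(f))
)
print(n_dims)
```

Lists and data frames come out as non-atomic (heterogeneous), while vectors, matrices, arrays, and factors are atomic (homogeneous).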
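Vectors
A vector is an ordered collection of values of a single basic data type, with a given length. The key point is that all elements of a vector must have the same data type; in other words, a vector is a homogeneous data structure. Vectors are one-dimensional.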
Example:
# R program to illustrate a vector
# A vector is an ordered collection of values of the same data type
X <- c(1, 3, 5, 7, 8)
# Print the elements to the console
print(X)
Output:
[1] 1 3 5 7 8
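To see what the same-type rule means in practice, here is a small sketch of what c() does when it is given values of different types: it coerces all of them to one common type. The variable names are illustrative only.

```r
# Mixing a number, a string, and a logical: everything becomes character
mixed <- c(1, "a", TRUE)
print(class(mixed))   # "character"

# Mixing numbers and a logical: TRUE is coerced to 1
nums <- c(1, 2.5, TRUE)
print(class(nums))    # "numeric"
print(nums)

# A vector has a length but no dim attribute
print(length(nums))   # 3
print(is.null(dim(nums)))
```

If elements of different types need to be kept as they are, a list is the appropriate structure.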
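Lists
A list is a generic object consisting of an ordered collection of objects. Lists are heterogeneous data structures, and they are also one-dimensional. A list can hold vectors, matrices, character strings, functions, and so on.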
Example:
# R program to illustrate a list
# The first component is a numeric vector
# containing the employee IDs,
# created with the c() function
empId <- c(1, 2, 3, 4)
# The second component is a character vector
# containing the employee names
empName <- c("Debi", "Sandeep", "Subham", "Shiba")
# The third component is the number of employees,
# a single numeric value
numberOfEmp <- 4
# Combine these three objects of different
# types into one list holding the employee
# details, using the list() function
empList <- list(empId, empName, numberOfEmp)
print(empList)
Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi"    "Sandeep" "Subham"  "Shiba"

[[3]]
[1] 4
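The list above is unnamed, so its components are reached by position. As a hedged sketch, the same data can also be stored with names (the names below are chosen for this illustration), which allows access by `$` and shows the difference between `[ ]` and `[[ ]]`.

```r
# Same employee data as above, but with named components
empList <- list(
  empId = c(1, 2, 3, 4),
  empName = c("Debi", "Sandeep", "Subham", "Shiba"),
  numberOfEmp = 4
)

# Single brackets return a (shorter) list
first_two <- empList[1:2]
print(class(first_two))   # "list"

# Double brackets return the component itself
ids <- empList[["empId"]]
print(class(ids))         # "numeric"

# $ is shorthand for [[ ]] with a name
second_name <- empList$empName[2]
print(second_name)        # "Sandeep"
```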
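Data Frames
A data frame is R's general-purpose data object for storing tabular data. Data frames are among the most popular data objects in R because tabular data is easy to read and work with. They are two-dimensional, heterogeneous data structures; internally, a data frame is a list of equal-length vectors.
A data frame has the following constraints:
- It must have column names, and every row must have a unique name.
- Every column must contain the same number of items.
- All items within a single column must have the same data type.
- Different columns may have different data types.
To create a data frame, use the data.frame() function.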
Example:
# R program to illustrate a data frame
# A character vector of names
Name <- c("Amiya", "Raj", "Asish")
# A character vector of languages
Language <- c("R", "Python", "Java")
# A numeric vector of ages
Age <- c(22, 25, 45)
# To create a data frame, pass each of the
# vectors created above as an argument
# to the data.frame() function
df <- data.frame(Name, Language, Age)
print(df)
Output:
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
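The constraints listed for data frames can be checked directly. The sketch below rebuilds the same data frame (passing stringsAsFactors = FALSE explicitly, so the result does not depend on the R version) and inspects its column types and shape; the final call is a deliberately invalid example with made-up columns a and b.

```r
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE
)

# Each column has one type, but types differ between columns
col_types <- sapply(df, class)
print(col_types)

# A data frame is a list of equal-length vectors with two dimensions
print(is.list(df))   # TRUE
print(dim(df))       # 3 rows, 3 columns

# Columns of incompatible lengths (3 and 2) are rejected
result <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error: differing number of rows"
)
print(result)
```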
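Matrices
A matrix is a rectangular arrangement of numbers in rows and columns. Rows run horizontally and columns run vertically. A matrix is a two-dimensional, homogeneous data structure.
To create a matrix in R, use the matrix() function. Its first argument is a vector holding the elements, and you also pass the number of rows and columns. One important point to remember: by default, a matrix is filled in column-wise order.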
Example:
# R program to illustrate a matrix
A <- matrix(
  # The sequence of elements
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  # Number of rows and columns
  nrow = 3, ncol = 3,
  # By default a matrix is filled
  # column by column; byrow = TRUE
  # fills it row by row instead
  byrow = TRUE
)
print(A)
Output:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
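Because the default fill order is easy to forget, the following sketch builds the same six values both ways and compares the first row of each result. The names by_col and by_row are illustrative.

```r
# Default: values fill the matrix column by column
by_col <- matrix(1:6, nrow = 2, ncol = 3)
print(by_col)

# byrow = TRUE: values fill the matrix row by row
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row)

# The first rows differ accordingly
print(by_col[1, ])   # 1 3 5
print(by_row[1, ])   # 1 2 3
```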
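Arrays
Arrays are R data objects that can store data in more than two dimensions. An array is an n-dimensional data structure. For example, an array with dimensions (2, 3, 3) consists of 3 rectangular matrices, each with 2 rows and 3 columns. Arrays are homogeneous data structures.
To create an array in R, use the array() function. Its first argument is a vector holding the elements, and you also pass a vector containing the dimensions of the array.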
Example:
# R program to illustrate an array
A <- array(
  # The sequence of elements
  c(1, 2, 3, 4, 5, 6, 7, 8),
  # Create two rectangular matrices,
  # each with two rows and two columns
  dim = c(2, 2, 2)
)
print(A)
Output:
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8
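The (2, 3, 3) case mentioned in the description can be sketched as follows. The values 1 to 18 are arbitrary filler chosen for this illustration; the indexing shows how to pull out one of the three 2 x 3 matrices or a single element.

```r
# 3 matrices, each with 2 rows and 3 columns
arr <- array(1:18, dim = c(2, 3, 3))
print(dim(arr))        # 2 3 3

# Leaving the first two indices empty selects a whole matrix
slice <- arr[, , 2]
print(slice)           # values 7 to 12, filled column-wise

# Row 1, column 2 of the third matrix
elem <- arr[1, 2, 3]
print(elem)            # 15
```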
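Factors
Factors are data objects used to categorize data and store it as levels. They are useful for storing categorical data, and they can hold both strings and integers. They can be used to classify the unique values in columns such as "TRUE" or "FALSE", or "MALE" or "FEMALE". Factors are useful in data analysis for statistical modeling.
To create a factor in R, use the factor() function. Its argument is a vector.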
Example:
# R program to illustrate factors
# Create a factor using factor()
fac <- factor(c("Male", "Female", "Male",
                "Male", "Female", "Male", "Female"))
print(fac)
Output:
[1] Male   Female Male   Male   Female Male   Female
Levels: Female Male
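To make the idea of "stored as levels" more concrete, the sketch below reuses the same factor and looks at its levels, the integer codes behind each value, and the count per level. The variable names lv, codes, and counts are illustrative.

```r
fac <- factor(c("Male", "Female", "Male",
                "Male", "Female", "Male", "Female"))

# Levels are the unique categories, sorted alphabetically by default
lv <- levels(fac)
print(lv)        # "Female" "Male"

# Internally each value is an integer index into the levels
codes <- as.integer(fac)
print(codes)     # 2 1 2 2 1 2 1

# table() counts the observations in each level
counts <- table(fac)
print(counts)
```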