📜  Data Structures in R Programming

📅  Last modified: 2022-05-13 01:55:24.457000             🧑  Author: Mango

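A data structure is a particular way of organizing data in a computer so that it can be used efficiently. The goal is to reduce the space and time complexity of different tasks. In R, data structures are the tools used to hold multiple values.

R's base data structures are usually organized by their dimensionality (1D, 2D, or nD) and by whether they are homogeneous (all elements must be of the same type) or heterogeneous (elements can be of different types). This gives rise to the six data structures most often used in data analysis.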

The most essential data structures used in R are:

  • Vectors
  • Lists
  • Data frames
  • Matrices
  • Arrays
  • Factors
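
As a rough orientation before the individual sections, the following minimal sketch builds one object of each kind and inspects its dimensions and element types. All object names and values here are hypothetical and chosen only for illustration.

```r
# Hypothetical objects, one per structure in the list above
v   <- c(1, 3, 5)
l   <- list(1, "a", TRUE)
m   <- matrix(1:6, nrow = 2)
arr <- array(1:24, dim = c(2, 3, 4))
df  <- data.frame(x = 1:3, y = c("a", "b", "c"))
f   <- factor(c("low", "high", "low"))

# dim() returns NULL for one-dimensional structures
is.null(dim(v))      # TRUE
is.null(dim(l))      # TRUE
dim(m)               # 2 3
dim(arr)             # 2 3 4
dim(df)              # 3 2

# A homogeneous structure coerces mixed input to a single type
typeof(c(1, "a"))    # "character"

# A heterogeneous structure keeps each element's own type
sapply(l, typeof)    # "double" "character" "logical"
```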

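Vectors

A vector is an ordered collection of values of a basic data type, with a given length. The key point is that all elements of a vector must be of the same data type; that is, a vector is a homogeneous data structure. Vectors are one-dimensional.

Example:

R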
# R program to illustrate Vector
 
# Vector (ordered collection of values of the same data type)
X <- c(1, 3, 5, 7, 8)
 
# Printing those elements in console
print(X)


Output:

[1] 1 3 5 7 8
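
Building on the vector above, here is a short sketch of a few common vector operations: indexing, logical filtering, element-wise arithmetic, and the implicit coercion that enforces homogeneity. The name `mixed` is introduced only for this illustration.

```r
X <- c(1, 3, 5, 7, 8)

length(X)    # 5
X[2]         # 3 (indexing in R starts at 1)
X[X > 4]     # 5 7 8 (logical filtering)
X * 2        # 2 6 10 14 16 (arithmetic is applied element-wise)

# Mixing types does not fail; R coerces every element
# to the most general type, here character
mixed <- c(1, "two", TRUE)
class(mixed) # "character"
```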

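Lists

A list is a generic object consisting of an ordered collection of objects. Lists are heterogeneous data structures, and they are also one-dimensional. A list can hold vectors, matrices, characters, functions, and so on.

Example:

R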

# R program to illustrate a list

# The first element is a numeric vector of
# employee IDs, created with c()
empId <- c(1, 2, 3, 4)

# The second element is a character vector
# of employee names
empName <- c("Debi", "Sandeep", "Subham", "Shiba")

# The third element is the number of
# employees, a single numeric value
numberOfEmp <- 4

# list() combines these three objects of
# different types into a single list
empList <- list(empId, empName, numberOfEmp)

print(empList)

Output:

[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

[[3]]
[1] 4
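
List elements can also be given names, which makes them easier to access. The sketch below reuses the employee data from the example; the element names `id`, `name`, and `count` are assumptions added for illustration and do not appear in the original code.

```r
# Same employee data as above, but with named elements
empList <- list(
  id    = c(1, 2, 3, 4),
  name  = c("Debi", "Sandeep", "Subham", "Shiba"),
  count = 4
)

empList$name              # the character vector of names
empList[["id"]][2]        # 2 (second employee ID)

# Single brackets return a sub-list,
# double brackets return the element itself
class(empList["count"])   # "list"
class(empList[["count"]]) # "numeric"
```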

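Data frames

A data frame is R's generic data object for storing tabular data. Data frames are the most popular data objects in R programming because tabular form is a comfortable way to view data. They are two-dimensional, heterogeneous data structures; internally, a data frame is a list of equal-length vectors.

Data frames have the following constraints:

  • A data frame must have column names, and every row should have a unique name.
  • Every column must contain the same number of items.
  • All items in a single column must be of the same data type.
  • Different columns may have different data types.

To create a data frame, we use the data.frame() function.

Example:

R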

# R program to illustrate a data frame

# A character vector of names
Name <- c("Amiya", "Raj", "Asish")

# A character vector of languages
Language <- c("R", "Python", "Java")

# A numeric vector of ages
Age <- c(22, 25, 45)

# data.frame() takes the vectors created
# above as arguments; each vector becomes
# one column of the data frame
df <- data.frame(Name, Language, Age)

print(df)

Output:

   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
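
The following sketch shows a few ways to work with the data frame above and checks two of the listed constraints. It passes `stringsAsFactors = FALSE` explicitly so that the text columns stay character vectors regardless of the R version (this has been the default only since R 4.0); the names `older` and `result` are hypothetical.

```r
df <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 45),
  stringsAsFactors = FALSE
)

dim(df)            # 3 3 (rows, columns)
df$Age             # 22 25 45 (a column is an ordinary vector)
sapply(df, class)  # each column has its own type

# Keep only the rows where Age is above 23
older <- df[df$Age > 23, ]
older$Name         # "Raj" "Asish"

# Columns whose lengths cannot be recycled to match are rejected
result <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error"
)
result             # "error"
```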

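Matrices

A matrix is a rectangular arrangement of values, most often numbers, in rows and columns. Rows run horizontally and columns run vertically. Matrices are two-dimensional, homogeneous data structures.
To create a matrix in R, use the matrix() function. Its first argument is a vector holding the elements, and you also pass the number of rows and the number of columns. One important point to remember is that, by default, a matrix is filled in column-wise order.

Example:

R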

# R program to illustrate a matrix

A <- matrix(
    # The elements, as a vector
    c(1, 2, 3, 4, 5, 6, 7, 8, 9),

    # Number of rows and columns
    nrow = 3, ncol = 3,

    # By default a matrix is filled
    # column by column; byrow = TRUE
    # fills it row by row instead
    byrow = TRUE
)

print(A)

Output:

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
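
To make the effect of the fill order concrete, this small sketch builds the same nine values both ways and compares individual cells. The names `byRow` and `byCol` are hypothetical.

```r
# Same values, two fill orders
byRow <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
byCol <- matrix(1:9, nrow = 3, ncol = 3)   # default: column-wise

# Elements are addressed as [row, column]
byRow[2, 3]   # 6
byCol[2, 3]   # 8

dim(byRow)    # 3 3
byRow[1, ]    # 1 2 3 (the whole first row)

# Filling by column gives the transpose of filling by row
identical(byCol, t(byRow))   # TRUE
```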

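Arrays

Arrays are R data objects that store data in more than two dimensions. Arrays are n-dimensional data structures. For example, an array with dimensions (2, 3, 3) consists of 3 rectangular matrices, each with 2 rows and 3 columns. Arrays are homogeneous data structures.

To create an array in R, use the array() function. Its first argument is a vector holding the elements, and you must also pass a vector containing the dimensions of the array.

Example:

R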

# R program to illustrate an array
 
A = array(
    # Taking sequence of elements
    c(1, 2, 3, 4, 5, 6, 7, 8),
 
    # Creating two rectangular matrices
    # each with two rows and two columns
    dim = c(2, 2, 2)                       
)
 
print(A)

Output:

, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8
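
Array elements are addressed with one index per dimension. The sketch below indexes the array from the example and also builds the (2, 3, 3) array mentioned in the text; the name `B` and its values 1 to 18 are assumptions made for illustration.

```r
A <- array(1:8, dim = c(2, 2, 2))

dim(A)       # 2 2 2
A[1, 2, 2]   # 7 (row 1, column 2 of the second matrix)
A[, , 1]     # the first 2 x 2 matrix

# The (2, 3, 3) case from the text:
# three matrices, each with 2 rows and 3 columns
B <- array(1:18, dim = c(2, 3, 3))
dim(B[, , 1])   # 2 3
dim(B)[3]       # 3 (number of matrices)
```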

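Factors

Factors are data objects used to categorize data and store it as levels. They are useful for storing categorical data, and they can be built from both strings and integers. They are useful for categorizing the unique values in columns such as "TRUE" or "FALSE", "MALE" or "FEMALE", and they are used in data analysis for statistical modeling.

To create a factor in R, use the factor() function. Its argument is a vector.

Example:

R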

# R program to illustrate factors
 
# Creating factor using factor()
fac = factor(c("Male", "Female", "Male",
               "Male", "Female", "Male", "Female"))
 
print(fac)

Output:

[1] Male   Female Male   Male   Female Male   Female
Levels: Female Male
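
The sketch below inspects the factor from the example: its levels, the integer codes stored behind the labels, and the count per category. The last step, which sets the level order explicitly, and the name `fac2` are additions for illustration.

```r
fac <- factor(c("Male", "Female", "Male",
                "Male", "Female", "Male", "Female"))

levels(fac)       # "Female" "Male" (sorted alphabetically by default)
nlevels(fac)      # 2
as.integer(fac)   # 2 1 2 2 1 2 1 (the codes behind the labels)
table(fac)        # Female: 3, Male: 4

# The level order can be set explicitly with the levels argument
fac2 <- factor(c("Male", "Female", "Male"),
               levels = c("Male", "Female"))
levels(fac2)      # "Male" "Female"
```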