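Reading Files in R Programming
So far, our R operations have been carried out at a prompt/terminal, where nothing is stored. In the software industry, however, most programs are written to store the information they produce. One way to do this is to save that information in a file. The two most common file operations are therefore:
- Importing/reading files in R
- Exporting/writing files in R
Reading Files in the R Programming Language
When a program terminates, all of its data is lost. Storing data in a file preserves it even after the program ends. Typing in a large amount of data by hand takes a long time, but if a file already contains the data, a few R commands are enough to access its contents. Files also make it easy to move data from one computer to another without changes. Files can be stored in various formats: a .txt (tab-separated values) file, a tabular .csv (comma-separated values) file, or a file hosted on the Internet or in the cloud. R provides very simple methods for reading all of these.
Reading a Text File in R
One of the important formats for storing data is the text file. R provides several methods for reading data from a text file.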
- read.delim(): This method is used to read "tab-separated value" files (".txt"). By default, a point (".") is used as the decimal point.
Syntax: read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)
Parameters:
- file: the path to the file containing the data to be read into R.
- header: a logical value. If TRUE, read.delim() assumes that your file has a header row, so row 1 holds the name of each column. If that is not the case, add the argument header = FALSE.
- sep: the field separator character. "\t" is used for a tab-delimited file.
- dec: the character used in the file for decimal points.
Example:
R
# R program reading a text file
# Read a text file using read.delim()
myData = read.delim("geeksforgeeks.txt", header = FALSE)
print(myData)
Output:
1 A computer science portal for geeks.
Note: The R code above assumes that the file "geeksforgeeks.txt" is in your current working directory. To find your current working directory, type getwd() in the R console.
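The working-directory requirement in the note can be checked before reading. The following is a minimal sketch, assuming a hypothetical file named "example.txt" that is created in a temporary directory purely for illustration; it is not part of the original example.

```r
# Create a small tab-separated file in a temporary directory.
# "example.txt" and its contents are hypothetical, used only for this sketch.
file_path <- file.path(tempdir(), "example.txt")
writeLines(c("id\tscore", "1\t9.5", "2\t7.25"), file_path)

# Show the current working directory; relative paths are resolved against it
print(getwd())

# A full path, as used here, works regardless of the working directory.
# Checking that the file exists avoids an error from read.delim().
if (file.exists(file_path)) {
  my_data <- read.delim(file_path, header = TRUE)
  print(my_data)
} else {
  message("File not found: ", file_path)
}
```

If the file lives somewhere else, either pass its full path as above or change the working directory with setwd() before reading.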
- read.delim2(): This method is also used to read "tab-separated value" files (".txt"). By default, a comma (",") is used as the decimal point.
Syntax: read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)
Parameters:
- file: the path to the file containing the data to be read into R.
- header: a logical value. If TRUE, read.delim2() assumes that your file has a header row, so row 1 holds the name of each column. If that is not the case, add the argument header = FALSE.
- sep: the field separator character. "\t" is used for a tab-delimited file.
- dec: the character used in the file for decimal points.
Example:
R
# R program reading a text file
# Read a text file using read.delim2
myData = read.delim2("geeksforgeeks.txt", header = FALSE)
print(myData)
Output:
1 A computer science portal for geeks.
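The practical difference between read.delim() and read.delim2() only shows up when the file contains decimal numbers. A small sketch, assuming a made-up tab-separated file that uses a comma as the decimal mark:

```r
# Write a tab-separated file that uses "," as the decimal mark.
# The file and its contents are invented for this sketch.
file_path <- tempfile(fileext = ".txt")
writeLines(c("item\tprice", "pen\t1,50", "book\t12,75"), file_path)

# read.delim() expects "." as the decimal mark,
# so values such as "1,50" are not parsed as numbers
with_point <- read.delim(file_path)
print(is.numeric(with_point$price))

# read.delim2() expects "," as the decimal mark,
# so the same column is parsed as numeric
with_comma <- read.delim2(file_path)
print(is.numeric(with_comma$price))
print(with_comma$price)
```

Choosing the function that matches the file's decimal convention avoids having to convert the column by hand afterwards.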
- file.choose(): In R, you can also select a file interactively with the function file.choose(). This is especially handy if you are new to R programming.
Example:
R
# R program reading a text file using file.choose()
myFile = read.delim(file.choose(), header = FALSE)
# If you use the code above in RStudio
# you will be asked to choose a file
print(myFile)
Output:
1 A computer science portal for geeks.
- read_tsv(): This method also reads tab-separated ("\t") values, using the readr package.
Syntax: read_tsv(file, col_names = TRUE)
Parameters:
- file: the path to the file containing the data to be read into R.
- col_names: Either TRUE, FALSE, or a character vector specifying column names. If TRUE, the first row of the input will be used as the column names.
Example:
R
# R program to read text file
# using readr package
# Import the readr library
library(readr)
# Use read_tsv() to read text file
myData = read_tsv("geeksforgeeks.txt", col_names = FALSE)
print(myData)
Output:
# A tibble: 1 x 1
X1
1 A computer science portal for geeks.
Note: As before, you can also use file.choose() with read_tsv().
# Read a txt file
myData <- read_tsv(file.choose())
Reading One Line at a Time
read_lines(): This method reads a chosen number of lines: one, two, or ten lines at a time. To use it, we have to import the readr package.
Syntax: read_lines(file, skip = 0, n_max = -1L)
Parameters:
- file: file path
- skip: Number of lines to skip before reading data
- n_max: Number of lines to read. If n_max is -1, all lines in the file are read.
Example:
R
# R program to read one line at a time
# Import the readr library
library(readr)
# read_lines() to read one line at a time
myData = read_lines("geeksforgeeks.txt", n_max = 1)
print(myData)
# read_lines() to read the first two lines
myData = read_lines("geeksforgeeks.txt", n_max = 2)
print(myData)
Output:
[1] "A computer science portal for geeks."
[1] "A computer science portal for geeks."
[2] "Geeksforgeeks is founded by Sandeep Jain Sir."
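Each read_lines() call above starts again from the top of the file, so n_max = 2 returns the first line a second time. To step through a file progressively, one option is a base R connection with readLines(); this is an alternative that the text above does not cover, sketched here with an invented three-line file.

```r
# Create a three-line text file (contents are illustrative only)
file_path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), file_path)

# Open a read connection; each readLines() call on an open connection
# continues from where the previous call stopped
con <- file(file_path, open = "r")
line_one <- readLines(con, n = 1)
line_two <- readLines(con, n = 1)
close(con)

print(line_one)
print(line_two)
```

Closing the connection with close() releases the file handle once reading is finished.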
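Reading the Whole File
read_file(): This method reads the whole file into a single string. To use it, we have to import the readr package.
Syntax: read_file(file)
Parameters:
- file: the file path
Example: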
R
# R program to read the whole file
# Import the readr library
library(readr)
# read_file() to read the whole file
myData = read_file("geeksforgeeks.txt")
print(myData)
Output:
[1] "A computer science portal for geeks.\r\nGeeksforgeeks is founded by Sandeep Jain Sir.\r\nI am an intern at this amazing platform."
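Because the whole file comes back as one string, the line breaks ("\r\n" in the output above) remain embedded in it. If individual lines are needed afterwards, the string can be split with base R. A minimal sketch, using a shortened copy of the output string as a stand-in for the value returned by read_file():

```r
# Stand-in for the single string returned when reading a whole file
whole_text <- "A computer science portal for geeks.\r\nGeeksforgeeks is founded by Sandeep Jain Sir."

# Split on line breaks; the optional "\r" handles Windows-style line endings
lines <- strsplit(whole_text, "\r?\n")[[1]]

print(length(lines))
print(lines)
```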
Reading a File in Table Format
Another popular way to store data is in table format. R provides several methods for reading data from a tabular data file.
read.table(): read.table() is a general-purpose function for reading a file in table format. The data is imported as a data frame.
Syntax: read.table(file, header = FALSE, sep = "", dec = ".")
Parameters:
- file: the path to the file containing the data to be imported into R.
- header: logical value. If TRUE, read.table() assumes that your file has a header row, so row 1 holds the name of each column. If that is not the case, add the argument header = FALSE.
- sep: the field separator character
- dec: the character used in the file for decimal points.
Example:
R
# R program to read a file in table format
# Using read.table()
myData = read.table("basic.csv")
print(myData)
Output:
1 Name,Age,Qualification,Address
2 Amiya,18,MCA,BBS
3 Niru,23,Msc,BLS
4 Debi,23,BCA,SBP
5 Biku,56,ISC,JJP
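In the output above every record lands in a single column, because read.table() splits on whitespace by default and treats the first row as data. Passing sep and header explicitly parses the same kind of file into separate columns. A sketch, assuming a temporary file that mimics the first rows of basic.csv:

```r
# Temporary comma-separated file mimicking the start of basic.csv
file_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Age,Qualification,Address",
             "Amiya,18,MCA,BBS",
             "Niru,23,Msc,BLS"), file_path)

# Defaults: whitespace separator and no header, so each line is one field
raw_rows <- read.table(file_path)
print(dim(raw_rows))

# Explicit comma separator and header row give four named columns
parsed <- read.table(file_path, header = TRUE, sep = ",")
print(parsed)
```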
read.csv(): read.csv() is used to read "comma-separated value" files (".csv"). Here too, the data is imported as a data frame.
Syntax: read.csv(file, header = TRUE, sep = ",", dec = ".", ...)
Parameters:
- file: the path to the file containing the data to be imported into R.
- header: logical value. If TRUE, read.csv() assumes that your file has a header row, so row 1 holds the name of each column. If that is not the case, add the argument header = FALSE.
- sep: the field separator character
- dec: the character used in the file for decimal points.
Example:
R
# R program to read a file in table format
# Using read.csv()
myData = read.csv("basic.csv")
print(myData)
Output:
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
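The header parameter matters when a CSV file has no header row. The sketch below assumes a hypothetical headerless file and shows the default column names as well as supplying names through the col.names argument of read.csv():

```r
# Hypothetical CSV file without a header row
file_path <- tempfile(fileext = ".csv")
writeLines(c("Amiya,18,MCA,BBS", "Niru,23,Msc,BLS"), file_path)

# header = FALSE keeps the first record as data;
# columns receive the default names V1, V2, V3, V4
no_header <- read.csv(file_path, header = FALSE)
print(names(no_header))

# Column names can be supplied explicitly with col.names
named <- read.csv(file_path, header = FALSE,
                  col.names = c("Name", "Age", "Qualification", "Address"))
print(named)
```

Leaving header at its default of TRUE for such a file would turn the first record into column names and drop it from the data.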
read.csv2(): read.csv2() is the variant used in countries that use a comma (",") as the decimal point and a semicolon (";") as the field separator.
Syntax: read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)
Parameters:
- file: the path to the file containing the data to be imported into R.
- header: logical value. If TRUE, read.csv2() assumes that your file has a header row, so row 1 holds the name of each column. If that is not the case, add the argument header = FALSE.
- sep: the field separator character
- dec: the character used in the file for decimal points.
Example:
R
# R program to read a file in table format
# Using read.csv2()
myData = read.csv2("basic.csv")
print(myData)
Output:
Name.Age.Qualification.Address
1 Amiya,18,MCA,BBS
2 Niru,23,Msc,BLS
3 Debi,23,BCA,SBP
4 Biku,56,ISC,JJP
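The output above has only one column because basic.csv is comma-separated, while read.csv2() looks for semicolons. With a file in the format it expects, the columns and decimals are parsed as intended. A sketch, assuming an invented semicolon-separated file with comma decimals:

```r
# Invented file: ";" as the field separator and "," as the decimal mark
file_path <- tempfile(fileext = ".csv")
writeLines(c("Name;Height", "Amiya;1,62", "Niru;1,75"), file_path)

# read.csv2() splits on ";" and converts "1,62" to the number 1.62
euro <- read.csv2(file_path)
print(euro)
print(is.numeric(euro$Height))
```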
file.choose(): As before, you can also use file.choose() with read.csv().
Example:
R
# R program to read a file in table format
# Using file.choose() inside read.csv()
myData = read.csv(file.choose())
# If you use the code above in RStudio
# you will be asked to choose a file
print(myData)
Output:
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
read_csv(): This method also reads comma-separated (",") values, using the readr package.
Syntax: read_csv(file, col_names = TRUE)
Parameters:
- file: the path to the file containing the data to be read into R.
- col_names: Either TRUE, FALSE, or a character vector specifying column names. If TRUE, the first row of the input will be used as the column names.
Example:
R
# R program to read a file in table format
# using readr package
# Import the readr library
library(readr)
# Using read_csv() method
myData = read_csv("basic.csv", col_names = TRUE)
print(myData)
Output:
Parsed with column specification:
cols(
Name = col_character(),
Age = col_double(),
Qualification = col_character(),
Address = col_character()
)
# A tibble: 4 x 4
Name Age Qualification Address
1 Amiya 18 MCA BBS
2 Niru 23 Msc BLS
3 Debi 23 BCA SBP
4 Biku 56 ISC JJP
Reading a File from the Internet
Files can be imported from the web using the functions read.delim(), read.csv(), and read.table().
Example:
R
# R program to read a file from the internet
# Using read.delim()
myData = read.delim("http://www.sthda.com/upload/boxplot_format.txt")
print(head(myData))
Output:
Nom variable Group
1 IND1 10 A
2 IND2 7 A
3 IND3 20 A
4 IND4 14 A
5 IND5 14 A
6 IND6 12 A