Reading Files in R Programming

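So far, operations in R programs have been performed at a prompt/terminal, and nothing was stored anywhere. In the software industry, however, most programs are written to store the information they produce. One way to do this is to store the information in a file. The two most common operations performed on files are:

Importing/reading files in R
Exporting/writing files in R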

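Reading Files in the R Programming Language

When a program terminates, all of its data is lost. Storing data in a file preserves it even after the program terminates. If we have to enter a large amount of data, typing it all in takes a lot of time. However, if we have a file containing all the data, we can easily access its contents with a few R commands. Files also make it easy to move data from one computer to another without any changes. These files can be stored in various formats: a .txt (tab-separated values) file, a tabular format such as a .csv (comma-separated values) file, or a file hosted on the Internet or in the cloud. R provides very simple methods for reading such files.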

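File Reading in R

One of the important formats for storing a file is the text file. R provides various methods for reading data from a text file.

read.delim(): This method is used for reading "tab-separated value" files (".txt"). By default, a point (".") is used as the decimal point.

Syntax: read.delim(file, header = TRUE, sep = "\t", dec = ".", ...)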

Parameters:

file: the path to the file containing the data to be read into R.
header: a logical value. If TRUE, read.delim() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.
sep: the field separator character. “\t” is used for a tab-delimited file.
dec: the character used in the file for decimal points.

Example:

R

# R program reading a text file
 
# Read a text file using read.delim()
myData = read.delim("geeksforgeeks.txt", header = FALSE)
print(myData)

Output:

1 A computer science portal for geeks.

Note: The R code above assumes that the file "geeksforgeeks.txt" is in your current working directory. To find out your current working directory, type getwd() in the R console.
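To make the working-directory note concrete, here is a minimal sketch that writes a small tab-separated file into a temporary directory and reads it back through an explicit path, so the result does not depend on where R was started. The file name "demo.txt" and its contents are made up for illustration.

```r
# Sketch: read a tab-separated file by explicit path instead of relying
# on the current working directory. "demo.txt" is a hypothetical file.
print(getwd())                       # shows the current working directory

demo_path <- file.path(tempdir(), "demo.txt")
writeLines(c("name\tscore", "Amiya\t18.5", "Niru\t23.0"), demo_path)

# Check that the file exists before trying to read it
stopifnot(file.exists(demo_path))

# header = TRUE (the default) uses the first row as column names
demo_data <- read.delim(demo_path, header = TRUE)
print(demo_data)
str(demo_data)
```

Building the path with file.path() keeps the example portable across operating systems.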

read.delim2(): This method is also used for reading "tab-separated value" files (".txt"). By default, a comma (",") is used as the decimal point.

Syntax: read.delim2(file, header = TRUE, sep = "\t", dec = ",", ...)

Parameters:

file: the path to the file containing the data to be read into R.
header: a logical value. If TRUE, read.delim2() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.
sep: the field separator character. “\t” is used for a tab-delimited file.
dec: the character used in the file for decimal points.

Example:

R

# R program reading a text file
 
# Read a text file using read.delim2
myData = read.delim2("geeksforgeeks.txt", header = FALSE)
print(myData)

Output:

1 A computer science portal for geeks.
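The sample file contains no numbers, so the difference between the two functions does not show in the output above. The following sketch, using a hypothetical file with decimal commas, illustrates how the dec default changes the parsed column type.

```r
# Sketch: the same numbers parsed with read.delim() and read.delim2().
# The file and its contents are hypothetical.
comma_path <- tempfile(fileext = ".txt")
writeLines(c("item\tprice", "pen\t1,5", "book\t12,25"), comma_path)

with_point <- read.delim(comma_path)    # dec = "." : "1,5" is not a number
with_comma <- read.delim2(comma_path)   # dec = "," : "1,5" becomes 1.5

print(class(with_point$price))
print(class(with_comma$price))
print(with_comma$price)
```

Only read.delim2() yields a numeric price column here, which is what arithmetic on the data would need.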

file.choose(): In R, you can also select a file interactively with the function file.choose(). This method is especially helpful if you are a beginner in R programming.

Example:

R

# R program reading a text file using file.choose()
 
myFile = read.delim(file.choose(), header = FALSE)
# If you use the code above in RStudio
# you will be asked to choose a file
print(myFile)

Output:

1 A computer science portal for geeks.
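Because file.choose() needs a user to pick a file, it is not suited to scripts that run unattended. One possible pattern, sketched below, is to open the dialog only in interactive sessions and fall back to a fixed path otherwise. The helper choose_or_default() and the fallback file are hypothetical names introduced for this example.

```r
# Sketch: open a file dialog only in interactive sessions.
# `fallback_path` is a hypothetical default for scripts run with Rscript.
choose_or_default <- function(fallback_path) {
  if (interactive()) {
    file.choose()      # opens the system file dialog
  } else {
    fallback_path      # no dialog available, use the given path
  }
}

fallback_path <- tempfile(fileext = ".txt")
writeLines("A computer science portal for geeks.", fallback_path)

myFile <- read.delim(choose_or_default(fallback_path), header = FALSE)
print(myFile)
```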

read_tsv(): This method, provided by the readr package, is also used for reading tab-separated ("\t") values.

Syntax: read_tsv(file, col_names = TRUE)

Parameters:

file: the path to the file containing the data to be read into R.
col_names: Either TRUE, FALSE, or a character vector specifying column names. If TRUE, the first row of the input will be used as the column names.

Example:

R

# R program to read text file
# using readr package
 
# Import the readr library
library(readr)
 
# Use read_tsv() to read text file
myData = read_tsv("geeksforgeeks.txt", col_names = FALSE)
print(myData)

Output:

# A tibble: 1 x 1
  X1                                  
  <chr>                               
1 A computer science portal for geeks.

Note: As before, you can also use file.choose() with read_tsv().

# Read a txt file
myData <- read_tsv(file.choose())

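Reading One Line at a Time

read_lines(): This method reads as many lines as you choose, whether one, two, or ten lines at a time. To use it, we have to import the readr package.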

Syntax: read_lines(file, skip = 0, n_max = -1L)

Parameters:

file: file path
skip: Number of lines to skip before reading data
n_max: Number of lines to read. If n_max is -1, all lines in the file will be read.

Example:

R

# R program to read one line at a time
 
# Import the readr library
library(readr)
 
# read_lines() to read one line at a time
myData = read_lines("geeksforgeeks.txt", n_max = 1)
print(myData)
 
# read_lines() to read the first two lines
myData = read_lines("geeksforgeeks.txt", n_max = 2)
print(myData)

Output:

[1] "A computer science portal for geeks."

[1] "A computer science portal for geeks."         
[2] "Geeksforgeeks is founded by Sandeep Jain Sir."
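The example above always starts at the first line. The skip parameter from the syntax can be combined with n_max to read a slice from the middle of a file, as in this minimal sketch. It assumes the readr package is installed, and the four-line file is hypothetical.

```r
# Sketch: combine skip and n_max to read a slice of a file.
# Assumes the readr package is installed; the file is hypothetical.
library(readr)

lines_path <- tempfile(fileext = ".txt")
writeLines(c("line one", "line two", "line three", "line four"), lines_path)

# Skip the first line, then read at most two lines
middle <- read_lines(lines_path, skip = 1, n_max = 2)
print(middle)
```

Here middle holds the second and third lines of the file as a character vector.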

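Reading the Whole File

read_file(): This method is used for reading the whole file into a single string. To use it, we have to import the readr package.

Syntax: read_file(file)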

file: the file path

Example:

R

# R program to read the whole file
 
# Import the readr library
library(readr)
 
# read_file() to read the whole file
myData = read_file("geeksforgeeks.txt")
print(myData)

Output:

[1] "A computer science portal for geeks.\r\nGeeksforgeeks is founded by Sandeep Jain Sir.\r\nI am an intern at this amazing platform."

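Reading a File in Table Format

Another popular format for storing a file is the tabular format. R provides various methods for reading data from a tabular data file.

read.table(): read.table() is a general function that can be used to read a file in table format. The data will be imported as a data frame.

Syntax: read.table(file, header = FALSE, sep = "", dec = ".")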

Parameters:

file: the path to the file containing the data to be imported into R.
header: logical value. If TRUE, read.table() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.
sep: the field separator character
dec: the character used in the file for decimal points.

Example:

R

# R program to read a file in table format
 
# Using read.table()
myData = read.table("basic.csv")
print(myData)

Output:

1 Name,Age,Qualification,Address
2 Amiya,18,MCA,BBS
3 Niru,23,Msc,BLS
4 Debi,23,BCA,SBP
5 Biku,56,ISC,JJP
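In the output above, every row ends up in a single column because read.table() splits on whitespace by default (sep = "") and does not treat the first row as a header. A minimal sketch of passing both arguments explicitly follows; the temporary file simply mirrors the first rows of the sample data.

```r
# Sketch: give read.table() the separator and header explicitly.
# The temporary file mirrors the first rows of the sample data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Age,Qualification,Address",
             "Amiya,18,MCA,BBS",
             "Niru,23,Msc,BLS"), csv_path)

one_column <- read.table(csv_path)                        # default sep = ""
split_cols <- read.table(csv_path, header = TRUE, sep = ",")

print(dim(one_column))   # 3 rows, 1 column
print(dim(split_cols))   # 2 rows, 4 columns
print(split_cols)
```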

read.csv(): read.csv() is used for reading "comma-separated value" files (".csv"). Here, too, the data will be imported as a data frame.

Syntax: read.csv(file, header = TRUE, sep = ",", dec = ".", ...)

Parameters:

file: the path to the file containing the data to be imported into R.
header: logical value. If TRUE, read.csv() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.
sep: the field separator character
dec: the character used in the file for decimal points.

Example:

R

# R program to read a file in table format
 
# Using read.csv()
myData = read.csv("basic.csv")
print(myData)

Output:

   Name Age Qualification Address
1 Amiya  18           MCA     BBS
2  Niru  23           Msc     BLS
3  Debi  23           BCA     SBP
4  Biku  56           ISC     JJP

read.csv2(): read.csv2() is a variant used in countries that use a comma (",") as the decimal point and a semicolon (";") as the field separator.

Syntax: read.csv2(file, header = TRUE, sep = ";", dec = ",", ...)

Parameters:

file: the path to the file containing the data to be imported into R.
header: logical value. If TRUE, read.csv2() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.
sep: the field separator character
dec: the character used in the file for decimal points.

Example:

R

# R program to read a file in table format
 
# Using read.csv2()
myData = read.csv2("basic.csv")
print(myData)

Output:

  Name.Age.Qualification.Address
1               Amiya,18,MCA,BBS
2                Niru,23,Msc,BLS
3                Debi,23,BCA,SBP
4                Biku,56,ISC,JJP
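Since "basic.csv" is comma-separated, read.csv2() finds no semicolons and keeps each row as one column, as the output above shows. For comparison, here is a small sketch with a hypothetical file in the format read.csv2() expects, with semicolons between fields and commas as decimal points.

```r
# Sketch: a semicolon-separated file with decimal commas.
# The file and its contents are hypothetical.
semi_path <- tempfile(fileext = ".csv")
writeLines(c("Name;Height", "Amiya;1,62", "Niru;1,75"), semi_path)

heights <- read.csv2(semi_path)
print(heights)

# Height was parsed as a number, so arithmetic works directly
print(mean(heights$Height))
```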

file.choose(): As before, you can also use file.choose() with read.csv().

Example:

R

# R program to read a file in table format
 
# Using file.choose() inside read.csv()
myData = read.csv(file.choose())
# If you use the code above in RStudio
# you will be asked to choose a file
print(myData)

Output:

   Name Age Qualification Address
1 Amiya  18           MCA     BBS
2  Niru  23           Msc     BLS
3  Debi  23           BCA     SBP
4  Biku  56           ISC     JJP

read_csv(): This method, provided by the readr package, is also used for reading comma-separated (",") values.

Syntax: read_csv(file, col_names = TRUE)

Parameters:

file: the path to the file containing the data to be read into R.
col_names: Either TRUE, FALSE, or a character vector specifying column names. If TRUE, the first row of the input will be used as the column names.

Example:

R

# R program to read a file in table format
# using readr package
 
# Import the readr library
library(readr)
 
# Using read_csv() method
myData = read_csv("basic.csv", col_names = TRUE)
print(myData)

Output:

Parsed with column specification:
cols(
  Name = col_character(),
  Age = col_double(),
  Qualification = col_character(),
  Address = col_character()
)
# A tibble: 4 x 4
  Name    Age Qualification Address
  <chr> <dbl> <chr>         <chr>  
1 Amiya    18 MCA           BBS    
2 Niru     23 Msc           BLS    
3 Debi     23 BCA           SBP    
4 Biku     56 ISC           JJP
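The "Parsed with column specification" message above reports the column types that read_csv() guessed. If you would rather state the types yourself, one option is the col_types argument, sketched below. This assumes the readr package is installed; the temporary file mirrors the first rows of the sample data, and reading Age as an integer is a choice made for this example.

```r
# Sketch: declare column types instead of letting read_csv() guess them.
# Assumes the readr package is installed; the file mirrors the sample data.
library(readr)

typed_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Age,Qualification,Address",
             "Amiya,18,MCA,BBS",
             "Niru,23,Msc,BLS"), typed_path)

typed <- read_csv(
  typed_path,
  col_types = cols(
    Name = col_character(),
    Age = col_integer(),          # integer instead of the guessed double
    Qualification = col_character(),
    Address = col_character()
  )
)
print(typed)
```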

Reading a File from the Internet

The functions read.delim(), read.csv(), and read.table() can also import files from the web.

Example:

R

# R program to read a file from the internet
 
# Using read.delim()
myData = read.delim("http://www.sthda.com/upload/boxplot_format.txt")
print(head(myData))

Output:

   Nom variable Group
1 IND1       10     A
2 IND2        7     A
3 IND3       20     A
4 IND4       14     A
5 IND5       14     A
6 IND6       12     A
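Reading from a URL can fail when the network or the server is unavailable. A minimal sketch of one way to handle that is to wrap the call in tryCatch() so the script continues with NULL instead of stopping. The helper read_remote() is a hypothetical name introduced here, and the 10-second timeout is an arbitrary choice.

```r
# Sketch: wrap a remote read in tryCatch() so a failed download
# returns NULL instead of stopping the script.
# `read_remote()` is a hypothetical helper, not part of base R.
read_remote <- function(location, ...) {
  tryCatch(
    suppressWarnings(read.delim(location, ...)),
    error = function(e) {
      message("Could not read ", location, ": ", conditionMessage(e))
      NULL
    }
  )
}

old_opts <- options(timeout = 10)   # keep a failed download from stalling for long
myData <- read_remote("http://www.sthda.com/upload/boxplot_format.txt")
options(old_opts)                   # restore the previous timeout

if (!is.null(myData)) {
  print(head(myData))
}
```

The same helper also accepts a local path, since read.delim() treats URLs and file paths alike.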