📅 Last modified: 2023-12-03 15:09:09.607000 🧑 Author: Mango
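Filtering a data set down to the rows of interest makes it easier to understand and analyze, and it is a routine step in data analysis and data mining. In R, the dplyr package makes it straightforward to filter a data frame by the values in its columns.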
# Install the package (only needed once)
install.packages("dplyr")
# Load the package
library(dplyr)
The sample data frame is as follows:
| Name | Age | Gender | City |
| ---- | --- | ------ | ---- |
| Tom | 25 | Male | Beijing |
| Jerry | 30 | Male | Shanghai |
| Alice | 20 | Female | Beijing |
| Bob | 28 | Male | Guangzhou |
| Cathy | 22 | Female | Shanghai |
| David | 24 | Male | Beijing |
df <- data.frame(
  Name = c("Tom", "Jerry", "Alice", "Bob", "Cathy", "David"),
  Age = c(25, 30, 20, 28, 22, 24),
  Gender = c("Male", "Male", "Female", "Male", "Female", "Male"),
  City = c("Beijing", "Shanghai", "Beijing", "Guangzhou", "Shanghai", "Beijing")
)
Use the filter() function to keep only the rows of a data frame that satisfy a condition.
# Keep people aged 25 or older
df %>% filter(Age >= 25)
Output:
| Name | Age | Gender | City |
| ---- | --- | ------ | ---- |
| Tom | 25 | Male | Beijing |
| Jerry | 30 | Male | Shanghai |
| Bob | 28 | Male | Guangzhou |
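A single call to filter() can also take several conditions. The following is a minimal sketch, assuming the same sample data frame as above; the variable names `in_range` and `in_range_between` are illustrative only. Conditions separated by commas are combined with a logical AND, and dplyr's `between()` helper expresses the same inclusive range more compactly.

```r
library(dplyr)

# Same sample data as in the article
df <- data.frame(
  Name = c("Tom", "Jerry", "Alice", "Bob", "Cathy", "David"),
  Age = c(25, 30, 20, 28, 22, 24),
  Gender = c("Male", "Male", "Female", "Male", "Female", "Male"),
  City = c("Beijing", "Shanghai", "Beijing", "Guangzhou", "Shanghai", "Beijing")
)

# Comma-separated conditions must all hold (logical AND)
in_range <- df %>% filter(Age >= 22, Age <= 25)

# between() is inclusive at both ends, so it selects the same rows
in_range_between <- df %>% filter(between(Age, 22, 25))

print(in_range)
```

Both forms keep the rows in their original order; which one to use is mostly a matter of readability.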
# Keep people who live in Beijing
df %>% filter(City == "Beijing")
Output:
| Name | Age | Gender | City |
| ---- | --- | ------ | ---- |
| Tom | 25 | Male | Beijing |
| Alice | 20 | Female | Beijing |
| David | 24 | Male | Beijing |
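For comparison, it may help to see roughly what filter() does in terms of base R. This is only a sketch under the assumption that the sample data frame above is used; the names `beijing_dplyr`, `beijing_index`, and `beijing_subset` are made up for illustration.

```r
library(dplyr)

# Same sample data as in the article
df <- data.frame(
  Name = c("Tom", "Jerry", "Alice", "Bob", "Cathy", "David"),
  Age = c(25, 30, 20, 28, 22, 24),
  Gender = c("Male", "Male", "Female", "Male", "Female", "Male"),
  City = c("Beijing", "Shanghai", "Beijing", "Guangzhou", "Shanghai", "Beijing")
)

# dplyr: keep rows where the condition is TRUE
beijing_dplyr <- df %>% filter(City == "Beijing")

# Base R: logical indexing on rows (note the trailing comma)
beijing_index <- df[df$City == "Beijing", ]

# Base R: subset() evaluates the condition inside the data frame
beijing_subset <- subset(df, City == "Beijing")

print(beijing_dplyr)
```

All three select the same people here. One visible difference is that the base R results keep the original row names (1, 3, 6), while filter() renumbers the rows from 1.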
# Keep men who live in Shanghai or Guangzhou
df %>% filter(Gender == "Male" & (City == "Shanghai" | City == "Guangzhou"))
Output:
| Name | Age | Gender | City |
| ---- | --- | ------ | ---- |
| Jerry | 30 | Male | Shanghai |
| Bob | 28 | Male | Guangzhou |
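When a column is compared against several allowed values, chaining `|` gets long quickly. One possible alternative, sketched here on the same sample data, is base R's `%in%` operator; the names `target_cities` and `men_in_cities` are illustrative and not part of the original example.

```r
library(dplyr)

# Same sample data as in the article
df <- data.frame(
  Name = c("Tom", "Jerry", "Alice", "Bob", "Cathy", "David"),
  Age = c(25, 30, 20, 28, 22, 24),
  Gender = c("Male", "Male", "Female", "Male", "Female", "Male"),
  City = c("Beijing", "Shanghai", "Beijing", "Guangzhou", "Shanghai", "Beijing")
)

# Cities to match (hypothetical helper vector)
target_cities <- c("Shanghai", "Guangzhou")

# %in% is TRUE when City equals any element of target_cities;
# the comma combines it with the Gender condition using AND
men_in_cities <- df %>% filter(Gender == "Male", City %in% target_cities)

print(men_in_cities)
```

This selects the same rows as the `&` and `|` version above, and adding another city only requires extending the vector.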
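The dplyr package provides a simple way to filter R data frames: pass one or more conditions on column values to filter(), and it returns only the rows that satisfy them.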