Last modified: 2023-12-03 15:04:45 · Author: Mango

R Code Mutate

Introduction

In R, the mutate() function creates new variables or modifies existing ones in a data frame. It is part of the dplyr package, which provides a collection of functions for data manipulation.

With mutate(), programmers can perform a variety of operations on variables, such as mathematical calculations, string manipulations, and logical operations.
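
As a minimal sketch of these three kinds of operation, the example below uses a small made-up data frame. The scores data and its column names are assumptions for illustration and do not come from the examples later in this article.

```r
library(dplyr)

# Hypothetical data for illustration only
scores <- data.frame(name = c("ann", "bob", "cy"), points = c(12, 7, 9))

scores_new <- mutate(
  scores,
  points_pct = points / sum(points) * 100,  # mathematical calculation
  name_upper = toupper(name),               # string manipulation
  above_8    = points > 8                   # logical operation
)

scores_new$name_upper  # "ANN" "BOB" "CY"
scores_new$above_8     # TRUE FALSE TRUE
```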

Syntax
mutate(data, new_variable = expression)
  • data: The input dataset, typically a data frame or tibble (dplyr names this argument .data).
  • new_variable: The name of the new variable to be created.
  • expression: The expression used to compute the values of the new variable.
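
The syntax above shows a single new_variable = expression pair, but a call can take several pairs separated by commas, and it is often written with the %>% pipe that dplyr makes available. A short sketch, using a hypothetical data frame df:

```r
library(dplyr)

# Hypothetical input for illustration
df <- data.frame(x = 1:3)

# Several new variables in one call, separated by commas
out <- mutate(df, double = x * 2, half = x / 2)

# The same call written with the pipe, which passes df as the first argument
out_piped <- df %>% mutate(double = x * 2, half = x / 2)

names(out)  # "x" "double" "half"
```

Both forms should give the same result; which one to use is mostly a matter of style.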

Examples
Create a new variable
library(dplyr)

# Create a dataset
data <- data.frame(id = 1:5, value = c(3, 5, 2, 6, 1))

# Add a new variable 'value2' as a square of 'value'
data_new <- mutate(data, value2 = value^2)

The above code returns a copy of data with an added column value2, which holds the squares of the values in the value column, and stores that copy in data_new.
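
mutate() does not change its input in place, which is easy to overlook. A quick check along these lines, rebuilding the same dataset so the snippet runs on its own, shows which object actually gains the column:

```r
library(dplyr)

data <- data.frame(id = 1:5, value = c(3, 5, 2, 6, 1))
data_new <- mutate(data, value2 = value^2)

names(data)      # "id" "value"           (the input is untouched)
names(data_new)  # "id" "value" "value2"
data_new$value2  # 9 25 4 36 1
```

To update the original object instead, assign the result back to the same name, for example data <- mutate(data, value2 = value^2).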

Modify an existing variable
# Modify the 'value' variable by adding 2 to each value
data_new <- mutate(data, value = value + 2)

The above code returns a copy of data in which every value in the value column is increased by 2. Because the result is assigned to data_new, the original data keeps its values.

Create a new variable using multiple variables
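# 'data' has no 'value2' column, so create it in the same call;
# mutate() evaluates its arguments in order, so 'average' can then use it
data_new <- mutate(data, value2 = value^2, average = (value + value2) / 2)

The above code adds two columns to a copy of data: value2, the square of value, and average, the row-wise mean of the value and value2 columns. Calling mutate(data, average = (value + value2)/2) on its own would fail, because value2 exists only in the result of the first example, not in data.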
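
An alternative way to make sure value2 exists before average is computed is to chain two mutate() calls. The sketch below compares that with a single call; the names two_step and one_step are introduced here for illustration only.

```r
library(dplyr)

data <- data.frame(id = 1:5, value = c(3, 5, 2, 6, 1))

# Two chained calls: the second can use 'value2' because the first created it
two_step <- data %>%
  mutate(value2 = value^2) %>%
  mutate(average = (value + value2) / 2)

# One call: arguments are evaluated in order, so 'value2' is already available
one_step <- mutate(data, value2 = value^2, average = (value + value2) / 2)

all.equal(two_step, one_step)  # TRUE
two_step$average               # 6 15 3 21 1
```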

Create a new variable using conditional statements
# Create a new variable 'status' based on 'value' (greater than 3 = 'pass', else 'fail')
data_new <- mutate(data, status = ifelse(value > 3, 'pass', 'fail'))

The above code adds a status column to a copy of data. ifelse() is applied element by element, so status is 'pass' in rows where value is greater than 3 and 'fail' in all other rows.
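
When there are more than two outcomes, nested ifelse() calls get hard to read. One option is dplyr's case_when(), which checks its conditions in order and uses the first one that matches. In the sketch below, the level column and its thresholds are assumptions chosen for illustration:

```r
library(dplyr)

data <- data.frame(id = 1:5, value = c(3, 5, 2, 6, 1))

data_new <- mutate(
  data,
  # Two outcomes: same rule as the example above
  status = ifelse(value > 3, "pass", "fail"),
  # Three outcomes: conditions are checked top to bottom
  level = case_when(
    value >= 5 ~ "high",
    value >= 3 ~ "medium",
    TRUE       ~ "low"     # fallback for everything else
  )
)

data_new$status  # "fail" "pass" "fail" "pass" "fail"
data_new$level   # "medium" "high" "low" "high" "low"
```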

Conclusion

The mutate() function is a core tool for data manipulation in R. It can create new variables, modify existing ones, perform calculations, and apply conditional logic, and it returns a new data frame rather than altering its input. It also combines readily with other dplyr functions, which makes it useful in most data analysis work in R.
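
As a rough illustration of how mutate() fits alongside other dplyr functions, the sketch below combines it with group_by(), filter(), and arrange(). The sales data frame and the 0.5 cutoff are made up for this example.

```r
library(dplyr)

# Hypothetical data for illustration only
sales <- data.frame(
  region = c("north", "north", "south", "south"),
  amount = c(10, 30, 20, 20)
)

result <- sales %>%
  group_by(region) %>%
  mutate(share = amount / sum(amount)) %>%  # share within each region
  ungroup() %>%
  filter(share >= 0.5) %>%                  # keep rows with at least half the regional total
  arrange(desc(amount))                     # largest amounts first

result$amount  # 30 20 20
```

After group_by(), the sum() inside mutate() is computed separately for each region, which is why the shares here are 0.25 and 0.75 for north and 0.5 and 0.5 for south.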