Last modified: 2023-12-03 14:46:51.843000 | Author: Mango
R mutate() Function
The mutate() function is a core tool in the dplyr package for R. It lets you create new variables or modify existing ones in a data frame, which makes it one of the most frequently used functions for data manipulation and transformation. This introduction walks through its syntax, its parameters, and some more advanced uses.
The general syntax of mutate() is as follows:
new_dataframe <- mutate(dataframe, new_variable = expression)
Parameters:
dataframe: The data frame you want to transform.
new_variable: The name of the new variable you want to create.
expression: The expression that defines the value of the new variable.
mutate() returns a new data frame, assigned here to new_dataframe, that contains the columns of the existing data frame plus the new variable new_variable. The expression defines how new_variable is calculated from the existing variables in the data frame. If new_variable matches the name of an existing column, that column is overwritten in the result. The original data frame is not modified.
For example, consider a data frame df with variables x and y. We can create a new variable z using mutate() as follows:
library(dplyr)
new_df <- mutate(df, z = x + y)
This creates a new data frame new_df with an additional variable z, which is the row-wise sum of x and y from the original data frame.
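The snippet above assumes df already exists. As a minimal self-contained sketch, the version below builds a small made-up data frame (the values are hypothetical; only the names x, y, and z come from the text) and also shows the "modify an existing variable" case, which works by reusing a column name:

```r
library(dplyr)

# Hypothetical sample data; the column names x and y follow the text above.
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Add a new column z computed row by row from x and y.
new_df <- mutate(df, z = x + y)

# Reusing an existing column name overwrites that column in the result.
doubled_df <- mutate(df, x = x * 2)

# df itself is untouched: it still has only the columns x and y.
names(df)
names(new_df)
```

Because mutate() returns a new data frame rather than changing df in place, the result has to be assigned to a name (or back to df) to be kept.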
Advanced Usage:
The mutate() function allows for more complex manipulations using various functions and operations. These include:
Mathematical operations: +, -, *, /, ^
Functions: log(), sqrt(), mean(), sum(), min(), max(), etc.
Comparison and logical operations: <, >, ==, !=, &, |, ifelse(), etc.
String manipulation: stringr package functions like str_extract(), str_replace(), etc.
Grouped calculations: group_by in combination with mutate.
Here's an example demonstrating some of these advanced features:
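new_df <- df %>%
  group_by(category) %>%
  mutate(
    total_sales = sum(sales),                # total sales within each category
    sales_percentage = sales / total_sales,  # each row's share of its category total
    # 10% off the price when a row accounts for more than half of its category
    discount_price = ifelse(sales_percentage > 0.5, price * 0.9, price)
  )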
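In this example, we calculate the total sales for each category, then each row's share of its category total (a proportion, despite the name sales_percentage), and finally apply a 10% discount to the price of any row whose share exceeds 0.5. Note that sales_percentage can use total_sales immediately, because columns created earlier in a mutate() call are available to later expressions in the same call.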
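The grouped example does not touch the math functions or the stringr helpers listed earlier, so here is a rough sketch of those on an ungrouped data frame. The products table and all of its column names are invented for illustration, and the code assumes the stringr package is installed alongside dplyr:

```r
library(dplyr)
library(stringr)

# Hypothetical product table, made up for this illustration.
products <- data.frame(
  name  = c("apple-red", "apple-green", "pear-green"),
  price = c(4, 9, 16),
  stock = c(0, 12, 3),
  stringsAsFactors = FALSE
)

result <- products %>%
  mutate(
    # Element-wise math functions
    price_sqrt = sqrt(price),
    price_log  = log(price),
    # A summary function returns one value, which is recycled to every row
    price_vs_mean = price - mean(price),
    # Comparison and logical operators produce logical columns
    in_stock = stock > 0,
    cheap_and_available = price < 10 & stock > 0,
    # String manipulation with stringr
    color = str_extract(name, "[a-z]+$"),  # trailing run of lowercase letters
    label = str_replace(name, "-", " ")    # replace the first hyphen with a space
  )

result
```

Without group_by(), mean(price) is computed over the whole column; after group_by() the same expression would be evaluated separately within each group, as in the sales example.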
Conclusion:
The mutate() function in the dplyr package is a powerful tool for creating new variables or modifying existing variables in a data frame. It supports a wide range of mathematical operations, functions, logical operations, and string manipulation. Its flexibility and simplicity make it an essential function for data manipulation tasks in R.