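Python | Pandas dataframe.replace()
Python is a great language for data analysis, largely because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.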
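The Pandas dataframe.replace() function replaces a string, regex, list, dictionary, Series, number, etc. in a data frame. It is a very rich function because it accepts so many forms of input.
Its most powerful feature is that it can work with Python regex (regular expressions).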
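To make the regex behavior concrete, here is a minimal sketch on a small hand-made frame (the column names and rows are placeholders, not data from nba.csv). It contrasts an exact-match replacement with two regex replacements.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({"Team": ["Boston Celtics", "Brooklyn Nets"],
                   "Number": [0, 8]})

# Without regex, only cells equal to "Boston" are replaced, so nothing changes here.
exact = df.replace(to_replace="Boston", value="Omega")

# With regex=True, the matched part of each string cell is substituted.
partial = df.replace(to_replace=r"^Boston", value="Omega", regex=True)

# A pattern that spans the whole cell replaces the entire value.
whole = df.replace(to_replace=r"^Boston.*$", value="Omega Warrior", regex=True)

print(exact["Team"].tolist())
print(partial["Team"].tolist())
print(whole["Team"].tolist())
```

Numeric columns such as Number are left alone by a regex replacement, since the pattern is only applied to string values.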
Syntax: DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad', axis=None)
Parameters:
to_replace : [str, regex, list, dict, Series, numeric, or None] pattern that we are trying to replace in dataframe.
value : Value to use to fill holes (e.g. 0), alternately a dict of values specifying which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.
inplace : If True, modify the data frame in place instead of returning a new object. Note: this will modify any other views on this object (e.g. a column from a DataFrame). In pandas 1.x and 2.x the call returns None when this is True.
limit : Maximum size gap to forward or backward fill. Deprecated in recent pandas releases.
regex : Whether to interpret to_replace and/or value as regular expressions. If this is True then to_replace must be a string. Alternatively, regex itself can be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.
method : Method to use for replacement when to_replace is a list. Deprecated in recent pandas releases.
Returns: filled : NDFrame
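As the parameter list notes, to_replace can also be a dict. The following sketch, again on made-up placeholder rows, shows the two dict forms: a flat mapping of old values to new values, and a nested mapping keyed by column name.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({"Team": ["Boston Celtics", "Utah Jazz"],
                   "College": ["Texas", "Utah Jazz"]})

# Flat dict: each key is replaced by its value wherever it appears in the frame.
mapped = df.replace({"Boston Celtics": "Omega Warrior", "Texas": "Unknown"})

# Nested dict: outer keys are column names, so only the "Team" column is touched.
team_only = df.replace({"Team": {"Utah Jazz": "Omega Warrior"}})

print(mapped)
print(team_only)
```

The nested form is useful when the same value appears in several columns but should be replaced in only one of them.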
The examples below read their data from a CSV file named nba.csv.
Example #1: Replace the team "Boston Celtics" with "Omega Warrior" in the nba.csv file
# importing pandas as pd
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# Printing the first 10 rows of the data frame for visualization
df[:10]
Output:
We are going to replace the team "Boston Celtics" with "Omega Warrior" in the "df" data frame.
# this will replace "Boston Celtics" with "Omega Warrior"
df.replace(to_replace="Boston Celtics",
           value="Omega Warrior")
Output:
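The effect of this call can be checked without the CSV file. The sketch below uses a small hand-made frame (the rows are placeholders, not data from nba.csv). It also shows that replace() returns a new data frame and leaves df unchanged unless the result is assigned back or inplace=True is passed.

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv
df = pd.DataFrame({"Name": ["Player A", "Player B"],
                   "Team": ["Boston Celtics", "Brooklyn Nets"]})

# replace() returns a new DataFrame with the substitution applied
replaced = df.replace(to_replace="Boston Celtics", value="Omega Warrior")

print(replaced["Team"].tolist())
print(df["Team"].tolist())  # the original frame still holds "Boston Celtics"
```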
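Example #2: Replace more than one value at a time, using a Python list as the argument
We are going to replace the values "Boston Celtics" and "Texas" with "Omega Warrior" in the "df" data frame.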
# importing pandas as pd
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# this will replace "Boston Celtics" and "Texas" with "Omega Warrior"
df.replace(to_replace=["Boston Celtics", "Texas"],
           value="Omega Warrior")
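Output:
Note that in the first row, the College column value "Texas" has been replaced with "Omega Warrior".
Example #3: Replace the NaN values in the data frame with the value -99999.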
# importing numpy as np (needed for np.nan) and pandas as pd
import numpy as np
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# will replace NaN values in the dataframe with the value -99999
df.replace(to_replace=np.nan, value=-99999)
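Output:
Note that all the NaN values in the data frame have been replaced with -99999. For practical purposes, though, we should be careful about which value we substitute for NaN.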
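One reason for that caution is that a sentinel such as -99999 is treated as ordinary data by later computations. A minimal sketch, using a made-up Salary column rather than the real nba.csv values:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({"Salary": [100.0, np.nan, 300.0]})

# Same call as in Example #3
filled = df.replace(to_replace=np.nan, value=-99999)

# mean() skips NaN by default, so the original average uses the two known values
print(df["Salary"].mean())

# after the replacement, the sentinel is counted as a real salary
print(filled["Salary"].mean())
```

The second mean is pulled far below zero by the sentinel, so any statistic computed after such a replacement has to exclude or otherwise account for the placeholder value.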