📅  Last modified: 2023-12-03 15:18:13.749000    🧑  Author: Mango
When working with data in pandas, it's often necessary to identify and handle missing values. In this tutorial, we'll explore how to select rows from a DataFrame that contain missing values.
df.where() Function
A first idea for selecting rows that contain missing values is the df.where() function. This function returns a DataFrame of the same shape as the original, in which every cell where a given condition is not true is replaced with NaN; cells where the condition is true keep their value. Let's start with a sample DataFrame:
import pandas as pd
# create sample dataframe
df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})
print(df)
     A    B
0  1.0  NaN
1  2.0  6.0
2  NaN  NaN
3  4.0  8.0
To try to select the rows in df where any column contains a missing value, we can pass pd.isna(df) to df.where() as the condition:
df_missing = df.where(pd.isna(df))
print(df_missing)
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
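In this example, we're using the pd.isna() function to build a Boolean mask that is True wherever df holds a NaN value. The df.where() function then returns a DataFrame with the same shape as df, keeping each cell where the mask is True and replacing every other cell with NaN. Because the only cells it keeps are the ones that were already NaN, the result is entirely NaN. In other words, df.where() masks individual cells; it does not select rows, so on its own it cannot tell us which rows had missing values.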
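To actually select rows, the same mask can be reduced to one Boolean per row and used for Boolean indexing. The following is a minimal sketch of that idea using the sample data from above; the variable names row_has_nan and rows_with_nan are our own, chosen for illustration.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})

# One Boolean per row: True if at least one column in that row is NaN
row_has_nan = df.isna().any(axis=1)

# Boolean indexing keeps only the rows flagged True (rows 0 and 2 here)
rows_with_nan = df[row_has_nan]
print(rows_with_nan)
```

Replacing any(axis=1) with all(axis=1) would instead keep only the rows in which every column is missing.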
The df.dropna() function is closely related, but it works in the opposite direction: instead of selecting the rows that contain missing values, it returns only the rows that contain none. Here's how that looks:
# dropna() keeps only the complete rows, so name the result accordingly
df_complete = df.dropna()
print(df_complete)
     A    B
1  2.0  6.0
3  4.0  8.0
In this example, the df.dropna() function removes every row that contains a missing value in any of its columns, so only the complete rows (1 and 3) remain.
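Since df.dropna() returns the complement of the rows we are after, one way to recover the rows it removed is to compare index labels. The sketch below assumes the index labels are unique (as they are in the sample data); the names complete_index, dropped_rows, and partly_filled are illustrative only. It also shows the how="all" option, which drops a row only when every column is missing.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({"A": [1, 2, None, 4], "B": [None, 6, None, 8]})

# Index labels of the rows that survive dropna(), i.e. the complete rows
complete_index = df.dropna().index

# Rows whose label is NOT among the survivors are the rows with a NaN
# (assumes unique index labels)
dropped_rows = df.loc[~df.index.isin(complete_index)]
print(dropped_rows)

# how="all" removes a row only if all of its columns are NaN (row 2 here)
partly_filled = df.dropna(how="all")
print(partly_filled)
```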
In this tutorial, we explored how to work with rows that contain missing values in a pandas DataFrame. We saw that df.where() combined with pd.isna() masks individual cells rather than selecting rows, and that df.dropna() returns only the rows without missing values, which is the complement of the rows that contain them.