📅 Last modified: 2023-12-03 15:00:02.546000 🧑 Author: Mango
When working with data in Pandas, it is common to come across missing or NaN values. NaN stands for "Not a Number" and represents the absence of a value. It is important to identify and handle NaN values properly as they can impact the accuracy of your analysis.
In this article, we will explore several methods to count the NaN values in a Pandas DataFrame.
isna() Method
The isna() method in Pandas returns a boolean DataFrame indicating which values are NaN. We can then use the sum() method to count the number of NaN values per column or row.
Here is an example:
import numpy as np
import pandas as pd
# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})
# count the number of NaN values per column
print(df.isna().sum())
# count the number of NaN values per row
print(df.isna().sum(axis=1))
Output:
A    1
B    1
C    1
dtype: int64
0    0
1    1
2    1
3    1
dtype: int64
Notice that the first sum() call counts the number of NaN values in each column, while the second call, with axis=1, counts the number of NaN values in each row.
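The per-column and per-row counts can also be collapsed into a single total, or expressed as a share of missing values. The following is a minimal sketch that reuses the same example data; the variable names total_nan and nan_fraction are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# same example data as above
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# total number of NaN values in the whole DataFrame:
# the first sum() adds up each column, the second adds the column totals
total_nan = int(df.isna().sum().sum())
print(total_nan)

# fraction of NaN values per column (the mean of a boolean column
# is the share of True values)
nan_fraction = df.isna().mean()
print(nan_fraction)
```

Using mean() on the boolean mask avoids dividing by the number of rows by hand, which can be convenient when comparing columns of a large DataFrame.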
isnull() Method
The isnull() method is an alias for isna(). It returns the same boolean DataFrame, with True where the values are NaN.
import numpy as np
import pandas as pd
# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})
# count the number of NaN values per column
print(df.isnull().sum())
# count the number of NaN values per row
print(df.isnull().sum(axis=1))
Output:
A    1
B    1
C    1
dtype: int64
0    0
1    1
2    1
3    1
dtype: int64
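Because isnull() is an alias for isna(), the two methods should always produce identical results. A quick way to check this on the example data is sketched below; the variable names same_mask and same_counts are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# compare the boolean masks element by element
same_mask = df.isna().equals(df.isnull())

# compare the per-column NaN counts
same_counts = df.isna().sum().equals(df.isnull().sum())

print(same_mask, same_counts)
```

In practice the choice between the two names is a matter of style; using one of them consistently within a project keeps the code easier to read.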
notna() Method
The notna() method is the opposite of isna(). It returns a boolean DataFrame with True where the values are not NaN.
import numpy as np
import pandas as pd
# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})
# count the number of non-NaN values per column
print(df.notna().sum())
# count the number of non-NaN values per row
print(df.notna().sum(axis=1))
Output:
A    3
B    3
C    3
dtype: int64
0    3
1    2
2    2
3    2
dtype: int64
Notice that the first sum() call counts the number of non-NaN values in each column, while the second call, with axis=1, counts the number of non-NaN values in each row.
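Since notna() marks the values that are present, it can also be used to count the rows that are fully populated. The sketch below combines it with all(axis=1); the variable names complete_rows and rows_with_nan are assumptions made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# True for each row in which every column holds a non-NaN value
row_is_complete = df.notna().all(axis=1)

# number of rows without any NaN
complete_rows = int(row_is_complete.sum())

# number of rows containing at least one NaN
rows_with_nan = int((~row_is_complete).sum())

print(complete_rows, rows_with_nan)
```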
count() Method
The count() method returns the number of non-NaN values in each column or row.
import numpy as np
import pandas as pd
# create a DataFrame with some NaN values
df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})
# count the number of non-NaN values per column
print(df.count())
# count the number of non-NaN values per row
print(df.count(axis=1))
Output:
A    3
B    3
C    3
dtype: int64
0    3
1    2
2    2
3    2
dtype: int64
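Notice that count() returns the same values as notna().sum(), since both count only the non-NaN values. It does not count NaN values directly; to get the number of NaN values, subtract the result from the number of rows (or from the number of columns when using axis=1).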
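Although count() reports non-NaN values, the number of NaN values can be recovered by subtracting its result from the size of the corresponding axis. The following is a minimal sketch on the example data; the variable names nan_per_column and nan_per_row are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, 7, 8],
                   'C': [9, 10, 11, np.nan]})

# NaN values per column = number of rows minus non-NaN values per column
nan_per_column = len(df) - df.count()

# NaN values per row = number of columns minus non-NaN values per row
nan_per_row = df.shape[1] - df.count(axis=1)

print(nan_per_column)
print(nan_per_row)
```

This should give the same counts as df.isna().sum() and df.isna().sum(axis=1), so it is mostly useful when count() has already been computed for another purpose.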
Counting NaN values in Pandas is an important task when working with data. In this article, we explored several methods to count the NaN values in a Pandas DataFrame, including isna(), isnull(), notna(), and count(). isna() and its alias isnull(), combined with sum(), count the missing values directly, while notna().sum() and count() report the values that are present.