pandas: Find rows/columns with NaN (missing values)

Modified: 2023-08-02 | Tags: Python, pandas

You can find rows/columns containing NaN in pandas.DataFrame using the isnull() or isna() method that checks if an element is a missing value.

Contents

Find rows/columns with NaN in specific columns/rows
Find rows/columns with at least one NaN

While this article primarily deals with NaN (Not a Number), it's important to note that in pandas, None is also treated as a missing value.

Missing values in pandas (nan, None, pd.NA)

Use the dropna() method to retain rows/columns where all elements are non-missing values, i.e., remove rows/columns containing missing values.

pandas: Remove NaN (missing values) with dropna()

The sample code in this article uses pandas version 2.0.3. As an example, read a CSV file with missing values.

sample_pandas_normal_nan.csv

import pandas as pd

print(pd.__version__)
# 2.0.3

df = pd.read_csv('data/src/sample_pandas_normal_nan.csv')
print(df)
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 1      NaN   NaN   NaN    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN

source: pandas_nan_extract.py

Find rows/columns with `NaN` in specific columns/rows

You can use the isnull() or isna() method of pandas.DataFrame and Series to check if each element is a missing value or not.

pandas: Detect and count NaN (missing values) with isnull(), isna()

print(df.isnull())
#     name    age  state  point  other
# 0  False  False  False   True   True
# 1   True   True   True   True   True
# 2  False   True  False   True   True
# 3  False  False  False  False   True
# 4  False   True  False  False   True
# 5  False  False   True   True   True

source: pandas_nan_extract.py

isnull() is an alias for isna(), and both are used interchangeably. isnull() is mainly used in this article, but you can replace it with isna().

If you want to find rows with NaN in a specific column, use the result of isnull() for that column.

print(df['point'].isnull())
# 0     True
# 1     True
# 2     True
# 3    False
# 4    False
# 5     True
# Name: point, dtype: bool

print(df[df['point'].isnull()])
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 1      NaN   NaN   NaN    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 5    Frank  30.0   NaN    NaN    NaN

source: pandas_nan_extract.py

The concept is the same when finding columns with NaN in a specific row. Use loc[] to select by name (label), and iloc[] to select by position.

pandas: Get/Set element values with at, iat, loc, iloc

print(df.iloc[2].isnull())
# name     False
# age       True
# state    False
# point     True
# other     True
# Name: 2, dtype: bool

print(df.loc[:, df.iloc[2].isnull()])
#     age  point  other
# 0  24.0    NaN    NaN
# 1   NaN    NaN    NaN
# 2   NaN    NaN    NaN
# 3  68.0   70.0    NaN
# 4   NaN   88.0    NaN
# 5  30.0    NaN    NaN

source: pandas_nan_extract.py

Find rows/columns with at least one `NaN`

To use as an example, remove rows and columns where all values are NaN.

df2 = df.dropna(how='all').dropna(how='all', axis=1)
print(df2)
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN

source: pandas_nan_extract.py

Use the any() method that returns True if there is at least one True in each row/column. By default, it is applied to columns. If axis=1, it is applied to rows.

pandas.DataFrame.any — pandas 2.0.3 documentation

By calling any() on the result of isnull(), you can check if each row and column contains at least one NaN.

Find rows that contain at least one NaN:

print(df2.isnull())
#     name    age  state  point
# 0  False  False  False   True
# 2  False   True  False   True
# 3  False  False  False  False
# 4  False   True  False  False
# 5  False  False   True   True

print(df2.isnull().any(axis=1))
# 0     True
# 2     True
# 3    False
# 4     True
# 5     True
# dtype: bool

print(df2[df2.isnull().any(axis=1)])
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN

source: pandas_nan_extract.py

Find columns that contain at least one NaN:

print(df2.isnull().any())
# name     False
# age       True
# state     True
# point     True
# dtype: bool

print(df2.loc[:, df2.isnull().any()])
#     age state  point
# 0  24.0    NY    NaN
# 2   NaN    CA    NaN
# 3  68.0    TX   70.0
# 4   NaN    CA   88.0
# 5  30.0   NaN    NaN

source: pandas_nan_extract.py

pandas: Find rows/columns with NaN (missing values)

Find rows/columns with `NaN` in specific columns/rows

Find rows/columns with at least one `NaN`

Related Categories

Related Articles

pandas: Find rows/columns with NaN (missing values)

Find rows/columns with NaN in specific columns/rows

Find rows/columns with at least one NaN

Related Categories

Related Articles

Find rows/columns with `NaN` in specific columns/rows

Find rows/columns with at least one `NaN`