NumPy: Remove NaN (np.nan) from an array

Modified: 2024-01-23 | Tags: Python, NumPy

In NumPy, to remove rows or columns containing NaN (np.nan) from an array (ndarray), use np.isnan() to identify NaN and methods like any() or all() to extract rows or columns that do not contain NaN.

Additionally, you can remove all NaN values from an array, but this will flatten the array.

Contents

Remove all NaN from an array
Remove rows containing NaN
Remove columns containing NaN

For basics on handling NaN in Python, refer to the following article.

What is nan in Python (float('nan'), math.nan, np.nan)

For replacing NaN with other values instead of removing them, refer to the following article.

NumPy: Replace NaN (np.nan) using np.nan_to_num() and np.isnan()

The NumPy version used in this article is as follows. Note that functionality may vary between versions. The examples below read the following CSV file, which contains missing data, using np.genfromtxt(); the missing fields are loaded as NaN.

import numpy as np

print(np.__version__)
# 1.26.1

a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. 33. 34.]]

source: numpy_nan_remove.py
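If the sample CSV file is not at hand, the same array can be reproduced from an in-memory string. This is a minimal sketch: the text in csv_text is an assumption about the file contents (empty fields where data is missing), chosen to match the output shown above.

```python
import io

import numpy as np

# Assumed file contents: empty fields stand in for the missing data.
csv_text = "11,12,,14\n21,,,24\n31,32,33,34\n"

# np.genfromtxt() accepts file-like objects as well as file paths.
a = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. 33. 34.]]
```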

Remove all `NaN` from an array

You can use np.isnan() to check if values in an ndarray are NaN.

numpy.isnan — NumPy v1.26 Manual

a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. 33. 34.]]

print(np.isnan(a))
# [[False False  True False]
#  [False  True  True False]
#  [False False False False]]

source: numpy_nan_remove.py

Applying the negation operator (~) to this boolean ndarray yields False at the positions of NaN and True everywhere else, so the result can be used as a mask to remove NaN (extract non-NaN values). Because the number of remaining elements changes, the result cannot keep the shape of the original ndarray and is instead flattened (converted to one-dimensional).

print(~np.isnan(a))
# [[ True  True False  True]
#  [ True False False  True]
#  [ True  True  True  True]]

print(a[~np.isnan(a)])
# [11. 12. 14. 21. 24. 31. 32. 33. 34.]

source: numpy_nan_remove.py
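The flattening described above can be confirmed by comparing shapes before and after masking. The sketch below builds the array inline with np.array() instead of reading the CSV file; the names mask and values are only illustrative.

```python
import numpy as np

# Same values as the sample data, built inline instead of read from CSV.
a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

mask = ~np.isnan(a)  # True where the element is not NaN
values = a[mask]     # boolean indexing with a 2D mask returns a 1D array

print(a.shape)
# (3, 4)

print(values.shape)
# (9,)
```

The 12 original elements minus 3 NaN leave 9 values, which no longer fit a rectangular two-dimensional layout.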

Remove rows containing `NaN`

To remove rows containing NaN, call the any() method on the ndarray generated by np.isnan(). The any() method returns True if there is at least one True in the ndarray.

numpy.ndarray.any — NumPy v1.26 Manual

By setting axis=1 in any(), it checks whether there is at least one True in each row, indicating the presence of NaN.

NumPy: Meaning of the axis parameter (0, 1, -1)

a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. 33. 34.]]

print(np.isnan(a))
# [[False False  True False]
#  [False  True  True False]
#  [False False False False]]

print(np.isnan(a).any(axis=1))
# [ True  True False]

source: numpy_nan_remove.py

Applying the negation operator (~) swaps True and False, so rows without any NaN become True.

print(~np.isnan(a).any(axis=1))
# [False False  True]

source: numpy_nan_remove.py

By applying this ndarray to the rows (the first dimension) of the original ndarray, you can remove rows with NaN (extract rows without NaN).

NumPy: Get and set values in an array using various indexing

print(a[~np.isnan(a).any(axis=1), :])
# [[31. 32. 33. 34.]]

source: numpy_nan_remove.py

You can omit the column specification (:) as shown below.

print(a[~np.isnan(a).any(axis=1)])
# [[31. 32. 33. 34.]]

source: numpy_nan_remove.py
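The row-removal steps above can be wrapped in a small function for reuse. This is only a sketch: drop_nan_rows is a hypothetical helper name, not a NumPy function, and it assumes a two-dimensional floating-point array.

```python
import numpy as np


def drop_nan_rows(arr):
    """Return a new 2D array without the rows that contain any NaN.

    Hypothetical helper for illustration; not part of NumPy.
    """
    return arr[~np.isnan(arr).any(axis=1)]


a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

b = drop_nan_rows(a)
print(b)
# [[31. 32. 33. 34.]]

# Boolean indexing returns a copy, so the original array is unchanged.
print(a.shape)
# (3, 4)
```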

To remove only rows where all elements are NaN, use all() instead of any().

numpy.ndarray.all — NumPy v1.26 Manual

Setting axis=1 checks if all elements in each row are True. For illustration, np.nan is assigned to the remaining elements of the second row so that the row consists entirely of NaN.

a[1, 0] = np.nan
a[1, 3] = np.nan
print(a)
# [[11. 12. nan 14.]
#  [nan nan nan nan]
#  [31. 32. 33. 34.]]

print(np.isnan(a).all(axis=1))
# [False  True False]

print(~np.isnan(a).all(axis=1))
# [ True False  True]

print(a[~np.isnan(a).all(axis=1)])
# [[11. 12. nan 14.]
#  [31. 32. 33. 34.]]

source: numpy_nan_remove.py
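To make the difference between any() and all() concrete, the following sketch applies both to the same array, in which the second row is entirely NaN. The variable names row_has_nan and row_all_nan are illustrative.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [np.nan, np.nan, np.nan, np.nan],
              [31, 32, 33, 34]])

row_has_nan = np.isnan(a).any(axis=1)  # at least one NaN in the row
row_all_nan = np.isnan(a).all(axis=1)  # every element in the row is NaN

print(row_has_nan)
# [ True  True False]

print(row_all_nan)
# [False  True False]

# any() drops two rows, all() drops only the fully-NaN row.
print(a[~row_has_nan].shape)
# (1, 4)

print(a[~row_all_nan].shape)
# (2, 4)
```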

Remove columns containing `NaN`

The process to remove columns containing NaN is similar to that used for rows.

Using any() with axis=0 checks if there is at least one True in each column, indicating the presence of NaN. Apply the negation operator (~) to convert columns without any NaN to True.

a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. 33. 34.]]

print(np.isnan(a))
# [[False False  True False]
#  [False  True  True False]
#  [False False False False]]

print(np.isnan(a).any(axis=0))
# [False  True  True False]

print(~np.isnan(a).any(axis=0))
# [ True False False  True]

source: numpy_nan_remove.py

By applying this ndarray to the columns (the second dimension) of the original ndarray, you can remove columns with NaN (extract columns without NaN).

print(a[:, ~np.isnan(a).any(axis=0)])
# [[11. 14.]
#  [21. 24.]
#  [31. 34.]]

source: numpy_nan_remove.py
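As an alternative to a boolean mask, the indices of the columns containing NaN can be collected and passed to np.delete(). This is a different technique from the one described above, shown here only to cross-check the result; the variable names are illustrative.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# Indices of the columns that contain at least one NaN.
nan_cols = np.flatnonzero(np.isnan(a).any(axis=0))
print(nan_cols)
# [1 2]

# np.delete() returns a new array with those columns removed.
deleted = np.delete(a, nan_cols, axis=1)
masked = a[:, ~np.isnan(a).any(axis=0)]

print(deleted)
# [[11. 14.]
#  [21. 24.]
#  [31. 34.]]

print(np.array_equal(deleted, masked))
# True
```

The boolean mask is usually the more direct choice; the index form can be useful when the positions of the removed columns need to be reported.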

To remove only columns where all elements are NaN, use all() instead of any().

a[2, 2] = np.nan
print(a)
# [[11. 12. nan 14.]
#  [21. nan nan 24.]
#  [31. 32. nan 34.]]

print(np.isnan(a).all(axis=0))
# [False False  True False]

print(~np.isnan(a).all(axis=0))
# [ True  True False  True]

print(a[:, ~np.isnan(a).all(axis=0)])
# [[11. 12. 14.]
#  [21. nan 24.]
#  [31. 32. 34.]]

source: numpy_nan_remove.py
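The row and column cases, each with any() or all(), can be combined into one function. The following is a sketch under stated assumptions: drop_nan and its axis and how parameters are hypothetical names introduced here (axis=0 drops rows, axis=1 drops columns), and the input is assumed to be a two-dimensional floating-point array.

```python
import numpy as np


def drop_nan(arr, axis=0, how='any'):
    """Drop rows (axis=0) or columns (axis=1) of a 2D array based on NaN.

    Hypothetical helper for illustration; not part of NumPy.
    how='any' drops a row/column with at least one NaN,
    how='all' drops it only if every element is NaN.
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    nan_mask = np.isnan(arr)
    # To judge rows, reduce across columns (axis=1), and vice versa.
    reduce_axis = 1 - axis
    if how == 'any':
        drop = nan_mask.any(axis=reduce_axis)
    elif how == 'all':
        drop = nan_mask.all(axis=reduce_axis)
    else:
        raise ValueError("how must be 'any' or 'all'")
    return arr[~drop] if axis == 0 else arr[:, ~drop]


a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, np.nan, 34]])

print(drop_nan(a, axis=1, how='any'))
# [[11. 14.]
#  [21. 24.]
#  [31. 34.]]

print(drop_nan(a, axis=1, how='all'))
# [[11. 12. 14.]
#  [21. nan 24.]
#  [31. 32. 34.]]

# Every row contains a NaN here, so no rows remain.
print(drop_nan(a, axis=0, how='any').shape)
# (0, 4)
```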
