NumPy: Functions ignoring NaN (np.nansum, np.nanmean, etc.)
In NumPy, functions like np.sum() and np.mean() return NaN if the array (ndarray) contains any NaN values. To perform calculations that ignore NaN, use functions such as np.nansum() and np.nanmean().
For basics on handling NaN in Python, refer to the following article.
To replace or remove NaN in ndarray, see the following articles.
- NumPy: Replace NaN (np.nan) using np.nan_to_num() and np.isnan()
- NumPy: Remove NaN (np.nan) from an array
The NumPy version used in this article is as follows. Note that functionality may vary between versions. For example, consider reading the following CSV file, which contains missing data, using np.genfromtxt().
import numpy as np
print(np.__version__)
# 1.26.1
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
# [21. nan nan 24.]
# [31. 32. 33. 34.]]
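Since the contents of sample_nan.csv are not reproduced here, the following minimal sketch builds an equivalent array from an in-memory string. The CSV text is an assumption, reconstructed from the printed output above with empty fields wherever nan appears; by default, np.genfromtxt() fills empty fields with nan for floating-point data.

```python
import io

import numpy as np

# Assumed file contents: empty fields stand in for the missing data.
csv_text = "11,12,,14\n21,,,24\n31,32,33,34\n"

# np.genfromtxt() accepts a file-like object as well as a file path.
a = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(a)

# Count the missing entries.
print(np.isnan(a).sum())
# 3
```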
Calculate the sum ignoring NaN: np.nansum()
If the ndarray contains NaN, calculating the sum using the np.sum() function or the sum() method of ndarray returns NaN.
print(np.sum(a))
# nan
print(a.sum())
# nan
To calculate the sum ignoring NaN, use the np.nansum() function.
print(np.nansum(a))
# 212.0
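One way to see what np.nansum() computes is to compare it with an ordinary sum taken after replacing NaN with zero. The sketch below rebuilds the sample array by hand so that it runs on its own; the np.where() version is only an illustration of the behavior, not a claim about NumPy's internals.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# Treat NaN as zero, then sum as usual.
zero_filled = np.where(np.isnan(a), 0, a)
print(np.sum(zero_filled))
# 212.0

# The result matches np.nansum().
print(np.nansum(a) == np.sum(zero_filled))
# True
```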
Similar to np.sum(), setting the axis argument allows calculation of sums by row or column. The keepdims argument can also be specified.
- NumPy: Sum, mean, max, min for entire array, column/row-wise
- NumPy: Meaning of the axis parameter (0, 1, -1)
print(np.nansum(a, axis=0))
# [63. 44. 33. 72.]
print(np.nansum(a, axis=1))
# [ 37. 45. 130.]
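As a small sketch of the keepdims argument, the example below keeps the reduced axis as a dimension of length 1. It also shows one edge case worth knowing: a slice that contains only NaN sums to 0.0. The array b is a hypothetical example introduced only for this illustration.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# keepdims=True keeps the reduced axis with length 1.
row_sums = np.nansum(a, axis=1, keepdims=True)
print(row_sums.shape)
# (3, 1)

# Hypothetical array whose second column contains only NaN.
b = np.array([[1.0, np.nan],
              [2.0, np.nan]])
print(np.nansum(b, axis=0))
# [3. 0.]
```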
There is no nansum() method for ndarray.
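If a method-style call is preferred, one possible workaround is to select the non-NaN elements with a boolean mask and call the regular sum() method on the result. This is only a sketch: boolean indexing flattens the array, so it suits a sum over all elements rather than a row-wise or column-wise sum.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# ndarray has no nansum attribute.
print(hasattr(a, 'nansum'))
# False

# Keep only the non-NaN elements (a 1D result), then sum them.
valid = a[~np.isnan(a)]
print(valid.sum())
# 212.0
```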
Functions ignoring NaN: np.nanmean(), np.nanmax(), np.nanmin(), etc.
For functions like np.mean(), np.max(), and np.min(), there are alternatives that ignore NaN. These include np.nanmean(), np.nanmax(), and np.nanmin(), among others.
- numpy.nanmean — NumPy v1.26 Manual
- numpy.nanmax — NumPy v1.26 Manual
- numpy.nanmin — NumPy v1.26 Manual
- numpy.nanstd — NumPy v1.26 Manual
- numpy.nanvar — NumPy v1.26 Manual
print(np.nanmean(a))
# 23.555555555555557
print(np.nanmax(a))
# 34.0
print(np.nanmin(a))
# 11.0
print(np.nanstd(a))
# 8.908312112367753
print(np.nanvar(a))
# 79.35802469135803
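To check the results above, the statistics can be recomputed over only the non-NaN elements. The sketch below rebuilds the sample array by hand and compares both approaches with np.isclose(); it also passes ddof=1 to np.nanvar(), which takes a ddof argument in the same way as np.var().

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# The nine non-NaN elements as a 1D array.
valid = a[~np.isnan(a)]
print(valid.size)
# 9

# The NaN-ignoring functions agree with statistics of the valid elements.
print(np.isclose(np.nanmean(a), valid.mean()))
# True
print(np.isclose(np.nanvar(a), valid.var()))
# True

# Sample variance (divides by the number of valid elements minus 1).
print(np.isclose(np.nanvar(a, ddof=1), valid.var(ddof=1)))
# True
```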
All these functions allow specifying arguments such as axis or keepdims.
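As a brief sketch of these arguments, the example below applies axis and keepdims to the sample array, rebuilt by hand so that it runs on its own. Each column mean is divided by the number of non-NaN values in that column, not by the number of rows.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# Column-wise mean over the non-NaN values in each column.
print(np.nanmean(a, axis=0))
# [21. 22. 33. 24.]

# Row-wise maximum.
print(np.nanmax(a, axis=1))
# [14. 24. 34.]

# keepdims=True keeps the reduced axis with length 1.
print(np.nanmin(a, axis=1, keepdims=True).shape)
# (3, 1)
```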