numpy.where(): Manipulate elements depending on conditions
With numpy.where(), you can replace or manipulate the elements of a NumPy array (ndarray) that satisfy a condition.
This article covers the following topics.
- Overview of np.where()
- np.where() with multiple conditions
- Replace the elements that satisfy the condition
- Manipulate the elements that satisfy the condition
- Get the indices of the elements that satisfy the condition
If you want to extract or delete elements, rows, and columns that satisfy the conditions, see the following article.
Overview of np.where()
numpy.where(condition[, x, y])
Return elements, either from x or y, depending on condition.
If only condition is given, return condition.nonzero().
numpy.where — NumPy v1.14 Manual
np.where() is a function that returns an ndarray whose elements are taken from x where condition is True and from y where it is False. x, y, and condition need to be broadcastable to the same shape.
If x and y are omitted, the indices of the elements that satisfy the condition are returned. Details are described later.
import numpy as np
a = np.arange(9).reshape((3, 3))
print(a)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
print(np.where(a < 4, -1, 100))
# [[ -1 -1 -1]
# [ -1 100 100]
# [100 100 100]]
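Because x, y, and condition only need to be broadcastable to a common shape, they do not have to share the same shape. The following is a minimal sketch of that behavior; the names row and col and their values are hypothetical and chosen only for illustration.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# Hypothetical example values: a 1-D array with one entry per column
# and a (3, 1) array with one entry per row.
row = np.array([10, 20, 30])
col = np.array([[-1], [-2], [-3]])

# condition has shape (3, 3); row is broadcast down the rows and
# col is broadcast across the columns.
result = np.where(a < 4, row, col)
print(result)
# [[10 20 30]
#  [10 -2 -2]
#  [-3 -3 -3]]
```

Scalars such as -1 and 100 are the simplest case of this: they are broadcast to every position.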
You can get a boolean ndarray from a condition on an ndarray without using np.where().
print(a < 4)
# [[ True True True]
# [ True False False]
# [False False False]]
np.where() with multiple conditions
You can apply multiple conditions with np.where() by enclosing each condition in () and combining them with & or |.
print(np.where((a > 2) & (a < 6), -1, 100))
# [[100 100 100]
# [ -1 -1 -1]
# [100 100 100]]
print(np.where((a > 2) & (a < 6) | (a == 7), -1, 100))
# [[100 100 100]
# [ -1 -1 -1]
# [100 -1 100]]
See the following article for why you must use & and | instead of and and or, and why the parentheses are necessary.
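As a rough illustration of what goes wrong otherwise, the sketch below catches the errors raised by and and by missing parentheses. The variable names and_error and precedence_error are hypothetical, and np.logical_and() is shown only as an equivalent spelling of &, not as something the original examples use.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# `and` asks for the truth value of a whole array, which is ambiguous.
try:
    (a > 2) and (a < 6)
except ValueError as e:
    and_error = e
print(type(and_error).__name__)
# ValueError

# Without parentheses, & binds more tightly than the comparisons, so the
# expression is parsed as the chained comparison a > (2 & a) < 6,
# which fails in the same way.
try:
    a > 2 & a < 6
except ValueError as e:
    precedence_error = e
print(type(precedence_error).__name__)
# ValueError

# np.logical_and() gives the same result as & on boolean arrays.
same = np.array_equal(np.logical_and(a > 2, a < 6), (a > 2) & (a < 6))
print(same)
# True
```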
Even with multiple conditions, it is not necessary to use np.where() to get the boolean ndarray.
print((a > 2) & (a < 6))
# [[False False False]
# [ True True True]
# [False False False]]
print((a > 2) & (a < 6) | (a == 7))
# [[False False False]
# [ True True True]
# [False True False]]
Replace the elements that satisfy the condition
It is also possible to replace elements with a given value only when the condition is satisfied or not satisfied.
If you pass the original ndarray as x or y, the original values are used as they are at those positions.
print(np.where(a < 4, -1, a))
# [[-1 -1 -1]
# [-1 4 5]
# [ 6 7 8]]
print(np.where(a < 4, a, 100))
# [[ 0 1 2]
# [ 3 100 100]
# [100 100 100]]
Note that np.where() returns a new ndarray, and the original ndarray is unchanged.
a_org = np.arange(9).reshape((3, 3))
print(a_org)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
a_new = np.where(a_org < 4, -1, a_org)
print(a_new)
# [[-1 -1 -1]
# [-1 4 5]
# [ 6 7 8]]
print(a_org)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
If you want to update the original ndarray itself, you can write:
a_org[a_org < 4] = -1
print(a_org)
# [[-1 -1 -1]
# [-1 4 5]
# [ 6 7 8]]
Manipulate the elements that satisfy the condition
Instead of the original ndarray, you can also specify expressions for x and y.
print(np.where(a < 4, a * 10, a))
# [[ 0 10 20]
# [30 4 5]
# [ 6 7 8]]
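If the result should be written into an existing array rather than returned as a new one, the same manipulation can be sketched with a boolean mask. This is one possible approach, not the only one; the names b and mask are introduced here for illustration.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

b = a.copy()  # work on a copy so that a stays unchanged
mask = b < 4  # boolean ndarray marking the elements to change

# Only the selected elements are read, multiplied, and written back.
b[mask] = b[mask] * 10
print(b)
# [[ 0 10 20]
#  [30  4  5]
#  [ 6  7  8]]
```

The result matches np.where(a < 4, a * 10, a), but b is modified in place.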
Get the indices of the elements that satisfy the condition
If x and y are omitted, the indices of the elements satisfying the condition are returned.
The result is a tuple of index arrays, one per dimension. For a two-dimensional array, the first array holds the row indices and the second holds the column indices of the matching elements.
print(np.where(a < 4))
# (array([0, 0, 0, 1]), array([0, 1, 2, 0]))
print(type(np.where(a < 4)))
# <class 'tuple'>
In this case, it means that the elements at [0, 0], [0, 1], [0, 2], and [1, 0] satisfy the condition.
It is also possible to obtain a list of coordinates by using list(), zip(), and * as follows:
print(list(zip(*np.where(a < 4))))
# [(0, 0), (0, 1), (0, 2), (1, 0)]
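The tuple returned by np.where() can also be used directly as an index, and np.argwhere() is a related NumPy function that returns the coordinates as a single two-dimensional ndarray. A small sketch, with idx and coords as illustrative names:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

idx = np.where(a < 4)

# The tuple of index arrays can index the original array directly.
print(a[idx])
# [0 1 2 3]

# np.argwhere() returns the same coordinates as one 2-D ndarray,
# with one row per matching element.
coords = np.argwhere(a < 4)
print(coords)
# [[0 0]
#  [0 1]
#  [0 2]
#  [1 0]]
```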
The same applies to multi-dimensional arrays of three or more dimensions.
a_3d = np.arange(24).reshape(2, 3, 4)
print(a_3d)
# [[[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
#
# [[12 13 14 15]
# [16 17 18 19]
# [20 21 22 23]]]
print(np.where(a_3d < 5))
# (array([0, 0, 0, 0, 0]), array([0, 0, 0, 0, 1]), array([0, 1, 2, 3, 0]))
print(list(zip(*np.where(a_3d < 5))))
# [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 1, 0)]
The same applies to one-dimensional arrays. Note that when you use list(), zip(), and *, each element in the resulting list is a tuple with one element.
a_1d = np.arange(6)
print(a_1d)
# [0 1 2 3 4 5]
print(np.where(a_1d < 3))
# (array([0, 1, 2]),)
print(list(zip(*np.where(a_1d < 3))))
# [(0,), (1,), (2,)]
If you know the array is one-dimensional, you can use the first element of the result of np.where() as it is. In this case, it is an ndarray of integers (int), not a list of one-element tuples. If you want to convert it to a list, use tolist().
print(np.where(a_1d < 3)[0])
# [0 1 2]
print(np.where(a_1d < 3)[0].tolist())
# [0, 1, 2]
You can get the number of dimensions with the ndim attribute.
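As a short sketch of how ndim relates to the result of np.where(), the example below rebuilds the one-dimensional and three-dimensional arrays used earlier and compares the length of the returned tuple with ndim.

```python
import numpy as np

a_1d = np.arange(6)
a_3d = np.arange(24).reshape(2, 3, 4)

print(a_1d.ndim)
# 1
print(a_3d.ndim)
# 3

# np.where() with only a condition returns one index array per dimension.
print(len(np.where(a_1d < 3)) == a_1d.ndim)
# True
print(len(np.where(a_3d < 5)) == a_3d.ndim)
# True
```

Checking ndim first is one way to decide whether to take np.where(...)[0] directly or to build coordinate tuples with zip().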