note.nkmk.me

NumPy: Extract or delete elements, rows and columns that satisfy the conditions

Posted: 2019-05-31 / Tags: Python, NumPy

This article describes how to extract or delete elements, rows, and columns that satisfy a condition from a NumPy array (ndarray), with sample code.

  • Extract elements that satisfy the conditions
  • Extract rows and columns that satisfy the conditions
    • All elements satisfy the condition: numpy.all()
    • At least one element satisfies the condition: numpy.any()
  • Delete elements, rows and columns that satisfy the conditions
    • Use ~ (NOT)
    • Use numpy.delete() and numpy.where()
  • Multiple conditions

If you want to replace or count elements that satisfy the conditions, see the following article.


Extract elements that satisfy the conditions

If you want to extract elements that meet the condition, you can use ndarray[conditional expression].

Even if the original ndarray is a multidimensional array, a flattened one-dimensional array is returned.

import numpy as np

a = np.arange(12).reshape((3, 4))
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(a[a < 5])
# [0 1 2 3 4]

print(a < 10)
# [[ True  True  True  True]
#  [ True  True  True  True]
#  [ True  True False False]]

print(a[a < 10])
# [0 1 2 3 4 5 6 7 8 9]

A new ndarray is returned and the original ndarray is unchanged. The same is true for the following examples.

b = a[a < 10]
print(b)
# [0 1 2 3 4 5 6 7 8 9]

print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
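
As a quick check of the statement above, the following sketch modifies the extracted array and confirms that the original is untouched. It assumes the same a as in the other examples; np.shares_memory() is used here only to show that the two arrays do not overlap in memory.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# Boolean indexing copies the selected elements into a new array.
b = a[a < 10]
b[0] = 100  # modify the extracted array only

print(a[0, 0])
# 0

# np.shares_memory() reports whether two arrays overlap in memory.
print(np.shares_memory(a, b))
# False
```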

It is possible to calculate the sum, average, maximum value, minimum value, standard deviation, etc. of elements that satisfy the condition.

print(a[a < 5].sum())
# 10

print(a[a < 5].mean())
# 2.0

print(a[a < 5].max())
# 4

print(a[a < 10].min())
# 0

print(a[a < 10].std())
# 2.8722813232690143

Extract rows and columns that satisfy the conditions

In the example of extracting elements, a one-dimensional array is returned, but if you use np.all() and np.any(), you can extract rows and columns while keeping the original ndarray dimension.

All elements satisfy the condition: numpy.all()

np.all() is a function that returns True when all elements of the ndarray passed as the first argument are True, and returns False otherwise.

If you specify the parameter axis, the check is performed along that axis and an ndarray of results is returned. In the case of a two-dimensional array, you get one result per column when axis=0 and one result per row when axis=1.

print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(np.all(a < 5))
# False

print(np.all(a < 5, axis=0))
# [False False False False]

print(np.all(a < 5, axis=1))
# [ True False False]

print(a < 10)
# [[ True  True  True  True]
#  [ True  True  True  True]
#  [ True  True False False]]

print(np.all(a < 10, axis=0))
# [ True  True False False]

print(np.all(a < 10, axis=1))
# [ True  True False]

Rows and columns are extracted by giving each result to [rows, :] or [:, columns]. For [rows, :], the trailing , : can be omitted.

print(a[:, np.all(a < 10, axis=0)])
# [[0 1]
#  [4 5]
#  [8 9]]

print(a[np.all(a < 10, axis=1), :])
# [[0 1 2 3]
#  [4 5 6 7]]

print(a[np.all(a < 10, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]
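
If you need to apply a row condition and a column condition at the same time, one possible approach is sketched below. The names row_mask and col_mask are only for illustration. Note that a[row_mask, col_mask] would pair the indices element by element instead of selecting a sub-block, so the sketch uses either two indexing steps or np.ix_().

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

row_mask = np.all(a < 10, axis=1)  # [ True  True False]
col_mask = np.all(a < 10, axis=0)  # [ True  True False False]

# Option 1: select rows first, then columns.
chained = a[row_mask][:, col_mask]
print(chained)
# [[0 1]
#  [4 5]]

# Option 2: np.ix_() builds an open mesh from the two boolean masks.
meshed = a[np.ix_(row_mask, col_mask)]
print(np.array_equal(chained, meshed))
# True
```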

If no row or column satisfies the condition, an empty ndarray is returned.

print(a[:, np.all(a < 5, axis=0)])
# []
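
The printed [] hides the fact that the result is still two-dimensional. A minimal sketch, assuming the same a as above, that inspects the shape and size of the empty result (the name empty is only for illustration):

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# No column has all elements < 5, so zero columns are selected.
empty = a[:, np.all(a < 5, axis=0)]

print(empty.shape)
# (3, 0)

print(empty.size)
# 0
```

Checking size == 0 is one simple way to detect that nothing matched.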

Even if only one row or one column is extracted, the number of dimensions does not change.

print(a[np.all(a < 5, axis=1)])
# [[0 1 2 3]]

print(a[np.all(a < 5, axis=1)].ndim)
# 2

print(a[np.all(a < 5, axis=1)].shape)
# (1, 4)

At least one element satisfies the condition: numpy.any()

np.any() is a function that returns True when the ndarray passed as the first argument contains at least one True element, and returns False otherwise.

If you specify the parameter axis, the check is performed along that axis and an ndarray of results is returned. In the case of a two-dimensional array, you get one result per column when axis=0 and one result per row when axis=1.

print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(np.any(a < 5))
# True

print(np.any(a < 5, axis=0))
# [ True  True  True  True]

print(np.any(a < 5, axis=1))
# [ True  True False]

You can extract rows and columns that match the conditions in the same way as np.all().

print(a[:, np.any(a < 5, axis=0)])
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a[np.any(a < 5, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]

Delete elements, rows and columns that satisfy the conditions

If you want to delete elements, rows, or columns instead of extracting them depending on conditions, there are the following two methods.

Use ~ (NOT)

If you add the negation operator ~ to a condition, elements, rows and columns that do not satisfy the condition are extracted. This is equivalent to deleting an element, row or column that satisfies the condition.

print(a[~(a < 5)])
# [ 5  6  7  8  9 10 11]

print(a[:, np.all(a < 10, axis=0)])
# [[0 1]
#  [4 5]
#  [8 9]]

print(a[:, ~np.all(a < 10, axis=0)])
# [[ 2  3]
#  [ 6  7]
#  [10 11]]

print(a[np.any(a < 5, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]

print(a[~np.any(a < 5, axis=1)])
# [[ 8  9 10 11]]
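
The sketch below illustrates two points about ~ that are easy to trip over. On a boolean ndarray it behaves like np.logical_not(), while Python's not keyword cannot be used for element-wise negation. The variable names are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
mask = a < 5

# On a boolean ndarray, ~ is element-wise NOT, the same as np.logical_not().
print(np.array_equal(~mask, np.logical_not(mask)))
# True

# Python's `not` keyword does not work element-wise; on an array with more
# than one element it raises ValueError.
try:
    not mask
    keyword_not_failed = False
except ValueError:
    keyword_not_failed = True

print(keyword_not_failed)
# True
```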

Use numpy.delete() and numpy.where()

Rows and columns can also be deleted using np.delete() and np.where().

In np.delete(), set the target ndarray, the index to delete and the target axis.

In the case of a two-dimensional array, rows are deleted if axis=0 and columns are deleted if axis=1.

print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.delete(a, [0, 2], axis=0))
# [[4 5 6 7]]

print(np.delete(a, [0, 2], axis=1))
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]]
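
Like the extraction examples, np.delete() returns a new ndarray and leaves the original as it is. A small sketch to confirm this, assuming the same a as above (the name result is only for illustration):

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# Delete the row at index 1; the result is a new array.
result = np.delete(a, 1, axis=0)

print(result.shape)
# (2, 4)

# The original array still has all three rows.
print(a.shape)
# (3, 4)
```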

See also the following post for np.delete().

np.where(condition) returns the indices of the elements that satisfy the condition.

In the case of a multidimensional array, it returns a tuple with one index array per dimension. For a two-dimensional array, the first array holds the row numbers and the second holds the column numbers of the matching elements.

print(a < 2)
# [[ True  True False False]
#  [False False False False]
#  [False False False False]]

print(np.where(a < 2))
# (array([0, 0]), array([0, 1]))

print(np.where(a < 2)[0])
# [0 0]

print(np.where(a < 2)[1])
# [0 1]
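
To read the result as (row, column) pairs, the two index arrays can be zipped together. The sketch below is only an illustration, with rows, cols and pairs as made-up names; np.argwhere() is shown as an alternative that returns the same coordinates as one array.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# Unpack the tuple: one index array per dimension.
rows, cols = np.where(a < 2)

# Pair them up to get the position of each matching element.
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)
# [(0, 0), (0, 1)]

# np.argwhere() returns the same coordinates, one row per match.
print(np.argwhere(a < 2))
# [[0 0]
#  [0 1]]
```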

See also the following post for np.where().

By combining these two functions, you can delete the rows and columns that satisfy the condition.

print(np.delete(a, np.where(a < 2)[0], axis=0))
# [[ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.delete(a, np.where(a < 2)[1], axis=1))
# [[ 2  3]
#  [ 6  7]
#  [10 11]]

print(a == 6)
# [[False False False False]
#  [False False  True False]
#  [False False False False]]

print(np.where(a == 6))
# (array([1]), array([2]))

print(np.delete(a, np.where(a == 6)))
# [ 0  3  4  5  6  7  8  9 10 11]

print(np.delete(a, np.where(a == 6)[0], axis=0))
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

print(np.delete(a, np.where(a == 6)[1], axis=1))
# [[ 0  1  3]
#  [ 4  5  7]
#  [ 8  9 11]]

As in the example above, every row or column that has at least one element satisfying the condition is deleted. The result is the same as extracting with the negation of np.any(), that is, ~np.any().
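
The equivalence between the two approaches can be checked with np.array_equal(). This is a minimal sketch using the same condition a == 6; the variable names are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
cond = a == 6

# Rows: delete by index vs. keep rows where no element matches.
deleted_rows = np.delete(a, np.where(cond)[0], axis=0)
kept_rows = a[~np.any(cond, axis=1)]
print(np.array_equal(deleted_rows, kept_rows))
# True

# Columns: the same comparison along the other axis.
deleted_cols = np.delete(a, np.where(cond)[1], axis=1)
kept_cols = a[:, ~np.any(cond, axis=0)]
print(np.array_equal(deleted_cols, kept_cols))
# True
```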

Multiple conditions

If you want to combine multiple conditions, enclose each conditional expression in () and combine them with & (AND) or | (OR).

print(a[(a < 10) & (a % 2 == 1)])
# [1 3 5 7 9]

print(a[np.any((a == 2) | (a == 10), axis=1)])
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

print(a[:, ~np.any((a == 2) | (a == 10), axis=0)])
# [[ 0  1  3]
#  [ 4  5  7]
#  [ 8  9 11]]
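
The parentheses matter because & and | bind more tightly than comparison operators such as < and ==. The following sketch shows what happens when they are omitted, and also shows np.logical_and() as a function form of &. The variable names are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# np.logical_and() is the function form of & for boolean arrays.
with_operator = a[(a < 10) & (a % 2 == 1)]
with_function = a[np.logical_and(a < 10, a % 2 == 1)]
print(np.array_equal(with_operator, with_function))
# True

# Without parentheses, & is evaluated before < and ==, so the expression
# becomes a chained comparison of arrays and raises ValueError.
try:
    a[a < 10 & a % 2 == 1]
    raised = False
except ValueError:
    raised = True

print(raised)
# True
```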