# NumPy: Extract or delete elements, rows and columns that satisfy the conditions

Posted: 2019-05-31 / Tags: Python, NumPy

This article describes how to extract or delete elements, rows, and columns that satisfy a condition from a NumPy array `ndarray`, with sample code.

• Extract elements that satisfy the conditions
• Extract rows and columns that satisfy the conditions
  • All elements satisfy the condition: `numpy.all()`
  • At least one element satisfies the condition: `numpy.any()`
• Delete elements, rows and columns that satisfy the conditions
  • Use `~` (NOT)
  • Use `numpy.delete()` and `numpy.where()`
• Multiple conditions

## Extract elements that satisfy the conditions

If you want to extract elements that meet the condition, you can use `ndarray[conditional expression]`.

Even if the original `ndarray` is a multidimensional array, a flattened one-dimensional array is returned.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(a[a < 5])
# [0 1 2 3 4]

print(a < 10)
# [[ True  True  True  True]
#  [ True  True  True  True]
#  [ True  True False False]]

print(a[a < 10])
# [0 1 2 3 4 5 6 7 8 9]
```

A new `ndarray` is returned and the original `ndarray` is unchanged. The same is true for the following examples.

```python
b = a[a < 10]
print(b)
# [0 1 2 3 4 5 6 7 8 9]

print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
```
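As a quick check of the point above, the following sketch modifies the extracted array and confirms that the original is untouched. The variable names follow the earlier examples; `np.shares_memory()` is used here only as an additional check and is not part of the original sample.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# Boolean indexing returns a copy, not a view of the original array
b = a[a < 10]
b[0] = 100

print(b[0])
# 100

print(a[0, 0])
# 0

print(np.shares_memory(a, b))
# False
```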

It is possible to calculate the sum, average, maximum value, minimum value, standard deviation, etc. of elements that satisfy the condition.

```python
print(a[a < 5].sum())
# 10

print(a[a < 5].mean())
# 2.0

print(a[a < 5].max())
# 4

print(a[a < 10].min())
# 0

print(a[a < 10].std())
# 2.8722813232690143
```

## Extract rows and columns that satisfy the conditions

The element extraction above returns a one-dimensional array, but with `np.all()` and `np.any()` you can extract rows and columns while keeping the number of dimensions of the original `ndarray`.

### All elements satisfy the condition: numpy.all()

`np.all()` is a function that returns `True` when all elements of `ndarray` passed to the first parameter are `True`, and returns `False` otherwise.

If you specify the parameter `axis`, it returns `True` if all elements are `True` for each axis. In the case of a two-dimensional array, the result is for columns when `axis=0` and for rows when `axis=1`.

```python
print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(np.all(a < 5))
# False

print(np.all(a < 5, axis=0))
# [False False False False]

print(np.all(a < 5, axis=1))
# [ True False False]

print(a < 10)
# [[ True  True  True  True]
#  [ True  True  True  True]
#  [ True  True False False]]

print(np.all(a < 10, axis=0))
# [ True  True False False]

print(np.all(a < 10, axis=1))
# [ True  True False]
```
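To make the meaning of `axis` concrete, the sketch below compares `np.all()` with an explicit loop over columns and rows. The names `mask`, `per_column`, and `per_row` are introduced only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
mask = a < 10

# axis=0 collapses the rows, leaving one result per column
per_column = np.array([mask[:, j].all() for j in range(mask.shape[1])])
print(np.array_equal(np.all(mask, axis=0), per_column))
# True

# axis=1 collapses the columns, leaving one result per row
per_row = np.array([mask[i, :].all() for i in range(mask.shape[0])])
print(np.array_equal(np.all(mask, axis=1), per_row))
# True

print(np.all(mask, axis=0).shape, np.all(mask, axis=1).shape)
# (4,) (3,)
```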

Rows and columns are extracted by giving each result to `[rows, :]` or `[:, columns]`. For `[rows, :]`, the trailing `, :` can be omitted.

```python
print(a[:, np.all(a < 10, axis=0)])
# [[0 1]
#  [4 5]
#  [8 9]]

print(a[np.all(a < 10, axis=1), :])
# [[0 1 2 3]
#  [4 5 6 7]]

print(a[np.all(a < 10, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]
```

If no row or column satisfies the condition, an empty `ndarray` is returned.

```python
print(a[:, np.all(a < 5, axis=0)])
# []
```
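Although the empty result prints as `[]`, it still has two dimensions. The sketch below inspects its shape and shows one possible way to guard against an empty result; the variable name `empty` and the message text are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# No column has all of its elements below 5, so every column is dropped
empty = a[:, np.all(a < 5, axis=0)]

print(empty.shape)
# (3, 0)

print(empty.size)
# 0

# Check size before computing statistics on the result
if empty.size == 0:
    print('no column matched')
# no column matched
```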

Even if only one row or one column is extracted, the number of dimensions does not change.

```python
print(a[np.all(a < 5, axis=1)])
# [[0 1 2 3]]

print(a[np.all(a < 5, axis=1)].ndim)
# 2

print(a[np.all(a < 5, axis=1)].shape)
# (1, 4)
```
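For comparison, selecting the same row with an integer index drops a dimension. The sketch below contrasts the two and uses `ravel()` as one way to get a one-dimensional result from the boolean selection; `row_mask` is a name introduced here for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
row_mask = np.all(a < 5, axis=1)

# Boolean row selection keeps two dimensions
print(a[row_mask].shape)
# (1, 4)

# Integer indexing of a single row returns a one-dimensional array
print(a[0].shape)
# (4,)

# Flatten the boolean selection if a one-dimensional array is needed
print(a[row_mask].ravel())
# [0 1 2 3]
```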

### At least one element satisfies the condition: numpy.any()

`np.any()` is a function that returns `True` when the `ndarray` passed to the first parameter contains at least one `True` element, and returns `False` otherwise.

If you specify the parameter `axis`, it returns `True` if at least one element is `True` for each axis. In the case of a two-dimensional array, the result is for columns when `axis=0` and for rows when `axis=1`.

```python
print(a < 5)
# [[ True  True  True  True]
#  [ True False False False]
#  [False False False False]]

print(np.any(a < 5))
# True

print(np.any(a < 5, axis=0))
# [ True  True  True  True]

print(np.any(a < 5, axis=1))
# [ True  True False]
```

You can extract rows and columns that match the conditions in the same way as `np.all()`.

```python
print(a[:, np.any(a < 5, axis=0)])
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a[np.any(a < 5, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]
```

## Delete elements, rows and columns that satisfy the conditions

If you want to delete elements, rows, or columns instead of extracting them depending on conditions, there are the following two methods.

### Use ~ (NOT)

If you add the negation operator `~` to a condition, elements, rows and columns that do not satisfy the condition are extracted. This is equivalent to deleting an element, row or column that satisfies the condition.

```python
print(a[~(a < 5)])
# [ 5  6  7  8  9 10 11]

print(a[:, np.all(a < 10, axis=0)])
# [[0 1]
#  [4 5]
#  [8 9]]

print(a[:, ~np.all(a < 10, axis=0)])
# [[ 2  3]
#  [ 6  7]
#  [10 11]]

print(a[np.any(a < 5, axis=1)])
# [[0 1 2 3]
#  [4 5 6 7]]

print(a[~np.any(a < 5, axis=1)])
# [[ 8  9 10 11]]
```
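The sketch below illustrates how `~` behaves. On a boolean array it matches `np.logical_not()`, and negating `np.any()` or `np.all()` follows De Morgan's laws. On an integer array `~` is a bitwise inversion, so it should only be applied to boolean masks. The name `cond` is introduced here for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
cond = a < 5

# On a boolean array, ~ is element-wise logical NOT
print(np.array_equal(~cond, np.logical_not(cond)))
# True

# "No element satisfies the condition" equals "every element fails it"
print(np.array_equal(~np.any(cond, axis=1), np.all(~cond, axis=1)))
# True

# "Not all elements satisfy the condition" equals "at least one fails it"
print(np.array_equal(~np.all(cond, axis=1), np.any(~cond, axis=1)))
# True

# On an integer array, ~ inverts bits instead
print(~np.array([0, 1]))
# [-1 -2]
```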

### Use numpy.delete() and numpy.where()

Rows and columns can also be deleted using `np.delete()` and `np.where()`.

Pass the target `ndarray`, the index (or indices) to delete, and the target axis to `np.delete()`.

In the case of a two-dimensional array, rows are deleted if `axis=0` and columns are deleted if `axis=1`.

```python
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.delete(a, [0, 2], axis=0))
# [[4 5 6 7]]

print(np.delete(a, [0, 2], axis=1))
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]]
```
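Two details of `np.delete()` are worth checking in a small sketch: it returns a new array and leaves the original as is, and when `axis` is omitted the array is flattened before deletion. The variable `b` is used only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# A single integer index is also accepted
b = np.delete(a, 1, axis=0)
print(b)
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

# The original array keeps its shape
print(a.shape)
# (3, 4)

# Without axis, the indices refer to the flattened array
print(np.delete(a, [0, 2]))
# [ 1  3  4  5  6  7  8  9 10 11]
```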

`np.where()` returns the indices of the elements that satisfy the condition.

For a multidimensional array, it returns a tuple of index arrays, one per dimension. For a two-dimensional array, the first array holds the row numbers and the second holds the column numbers of the elements that satisfy the condition.

```python
print(a < 2)
# [[ True  True False False]
#  [False False False False]
#  [False False False False]]

print(np.where(a < 2))
# (array([0, 0]), array([0, 1]))

print(np.where(a < 2)[0])
# [0 0]

print(np.where(a < 2)[1])
# [0 1]
```
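The tuple returned by `np.where()` can be unpacked into row and column index arrays. The sketch below pairs them into coordinates and feeds them back into the array; the names `rows`, `cols`, and `coords` are introduced here for illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# Unpack the tuple into one index array per dimension
rows, cols = np.where(a < 2)
print(rows)
# [0 0]
print(cols)
# [0 1]

# Pair the indices up as (row, column) coordinates
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)
# [(0, 0), (0, 1)]

# Indexing with both arrays selects the same elements as the boolean mask
print(a[rows, cols])
# [0 1]
print(np.array_equal(a[rows, cols], a[a < 2]))
# True
```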

By combining these two functions, you can delete the rows and columns that satisfy the condition.

```python
print(np.delete(a, np.where(a < 2)[0], axis=0))
# [[ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.delete(a, np.where(a < 2)[1], axis=1))
# [[ 2  3]
#  [ 6  7]
#  [10 11]]

print(a == 6)
# [[False False False False]
#  [False False  True False]
#  [False False False False]]

print(np.where(a == 6))
# (array([1]), array([2]))

print(np.delete(a, np.where(a == 6)))
# [ 0  3  4  5  6  7  8  9 10 11]

print(np.delete(a, np.where(a == 6)[0], axis=0))
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

print(np.delete(a, np.where(a == 6)[1], axis=1))
# [[ 0  1  3]
#  [ 4  5  7]
#  [ 8  9 11]]
```

As in the example above, rows and columns that contain at least one element satisfying the condition are deleted. This gives the same result as negating `np.any()` with `~`.
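The equivalence can be checked directly. In this sketch, the row and column index arrays from `np.where()` are passed to `np.delete()`, and the results are compared with boolean masks built from `np.any()`. The variable names are introduced only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
cond = a == 6

# Delete rows containing a match: index-based vs. mask-based
rows_by_delete = np.delete(a, np.where(cond)[0], axis=0)
rows_by_mask = a[~np.any(cond, axis=1)]
print(np.array_equal(rows_by_delete, rows_by_mask))
# True

# Delete columns containing a match: index-based vs. mask-based
cols_by_delete = np.delete(a, np.where(cond)[1], axis=1)
cols_by_mask = a[:, ~np.any(cond, axis=0)]
print(np.array_equal(cols_by_delete, cols_by_mask))
# True
```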

## Multiple conditions

If you want to combine multiple conditions, enclose each conditional expression in `()` and combine them with `&` (AND) or `|` (OR).

```python
print(a[(a < 10) & (a % 2 == 1)])
# [1 3 5 7 9]

print(a[np.any((a == 2) | (a == 10), axis=1)])
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

print(a[:, ~np.any((a == 2) | (a == 10), axis=0)])
# [[ 0  1  3]
#  [ 4  5  7]
#  [ 8  9 11]]
```
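The parentheses are needed because `&` and `|` bind more tightly than comparison operators such as `<` and `==`. As a sketch of two related points, `np.logical_and()` gives the same result as `&` on boolean arrays, while Python's `and` keyword does not work element-wise and raises an error. The variable `raised` is used only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

# np.logical_and() is a function-call alternative to &
print(a[np.logical_and(a < 10, a % 2 == 1)])
# [1 3 5 7 9]

# The and keyword needs a single truth value, which an array cannot provide
raised = False
try:
    a[(a < 10) and (a % 2 == 1)]
except ValueError:
    raised = True

print(raised)
# True
```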