NumPy: Views and copies of arrays

Modified: 2024-02-04 | Tags: Python, NumPy

This article explains views and copies of NumPy arrays (ndarray).

To create a copy of an ndarray, use the copy() method. To determine whether an ndarray is a view, check its base attribute. To determine whether two arrays share memory, use the np.shares_memory() or np.may_share_memory() function.

Contents

Views and copies of NumPy arrays
- Example of creating a view
- Example of creating a copy
Create a copy of an ndarray: copy()
Check if an ndarray is a view: base
Check if memory is shared: np.shares_memory()
- Basic usage
- np.may_share_memory()

For views and copies in pandas, see the following article.

pandas: Views and copies in DataFrame

The NumPy version used in this article is as follows. Note that functionality may vary between versions.

import numpy as np

print(np.__version__)
# 1.26.1

source: numpy_select_view_copy.py

Views and copies of NumPy arrays

An ndarray derived from another ndarray is either a view or a copy.

Copies and views — NumPy v1.26 Manual

When generating one ndarray from another, an ndarray that shares memory with the original is called a view, while an ndarray that allocates new memory, separate from the original, is called a copy.

Example of creating a view

For example, slices create views.

NumPy: Slicing ndarray

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

a_slice = a[:, :2]
print(a_slice)
# [[0 1]
#  [3 4]]

source: numpy_select_view_copy.py

Since a view shares memory with the original array, changing a value in one also changes it in the other.

a_slice[0, 0] = 100
print(a_slice)
# [[100   1]
#  [  3   4]]

print(a)
# [[100   1   2]
#  [  3   4   5]]

a[0, 0] = 0
print(a)
# [[0 1 2]
#  [3 4 5]]

print(a_slice)
# [[0 1]
#  [3 4]]

source: numpy_select_view_copy.py

In addition to slices, some functions and methods, such as reshape(), also return views whenever possible.
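As a quick check of this behavior, the following sketch compares a few operations on a small C-contiguous array using np.shares_memory() (covered later in this article). The choice of the transpose and ravel() as additional examples is an assumption for illustration, not part of the original script; whether a given call returns a view depends on the memory layout of the input.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # C-contiguous 2x3 array

# For a C-contiguous array, these operations return views.
candidates = {
    "reshape(3, 2)": a.reshape(3, 2),
    "T": a.T,
    "ravel()": a.ravel(),
}
results = {name: np.shares_memory(a, arr) for name, arr in candidates.items()}
print(results)

# Flattening the transposed array in C order cannot be expressed with strides
# alone, so reshape() falls back to a copy here.
t_flat = a.T.reshape(-1)
print(np.shares_memory(a, t_flat))
# False
```

This is why "whenever possible" matters: the same method can return a view for one input and a copy for another.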

Example of creating a copy

Boolean indexing and fancy indexing create copies.

NumPy: Get and set values in an array using various indexing

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

a_boolean_index = a[:, [True, False, True]]
print(a_boolean_index)
# [[0 2]
#  [3 5]]

source: numpy_select_view_copy.py

Since they do not share memory, changing the value in one object does not affect the value in the other.

a_boolean_index[0, 0] = 100
print(a_boolean_index)
# [[100   2]
#  [  3   5]]

print(a)
# [[0 1 2]
#  [3 4 5]]

source: numpy_select_view_copy.py
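One point worth separating from the above: the copy is made only when an indexing expression is read. When the same kind of index appears on the left-hand side of an assignment, the original array is modified in place. A minimal sketch, assuming integer-list (fancy) indexing on the same array as above:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Reading with a fancy index returns a copy, so a is unaffected.
a_fancy = a[:, [0, 2]]
a_fancy[0, 0] = 100
print(a.tolist())
# [[0, 1, 2], [3, 4, 5]]

# Assigning through a fancy index writes into a itself.
a[:, [0, 2]] = -1
print(a.tolist())
# [[-1, 1, -1], [-1, 4, -1]]
```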

Create a copy of an `ndarray`: `copy()`

To create a copy of an ndarray, use the copy() method. It is also possible to create a copy from a view.

numpy.ndarray.copy — NumPy v1.26 Manual

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

a_slice_copy = a[:, :2].copy()
print(a_slice_copy)
# [[0 1]
#  [3 4]]

source: numpy_select_view_copy.py

For example, to process a sub-array selected by a slice separately from the original array, you can use copy().

a_slice_copy[0, 0] = 100
print(a_slice_copy)
# [[100   1]
#  [  3   4]]

print(a)
# [[0 1 2]
#  [3 4 5]]

source: numpy_select_view_copy.py

Note that there is also a view() method, but it only creates a view of the object it is called on.

numpy.ndarray.view — NumPy v1.26 Manual

Executing view() on an object created with boolean indexing or fancy indexing generates a view of that copy, not of the original object.

a_boolean_index_view = a[:, [True, False, True]].view()
print(a_boolean_index_view)
# [[0 2]
#  [3 5]]

a_boolean_index_view[0, 0] = 100
print(a_boolean_index_view)
# [[100   2]
#  [  3   5]]

print(a)
# [[0 1 2]
#  [3 4 5]]

source: numpy_select_view_copy.py
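For contrast, here is a small sketch of view() called directly on an array rather than on an indexing result. The variable names are illustrative and do not come from the original script.

```python
import numpy as np

a = np.arange(6)

# view() on the array itself shares memory with that array.
a_view = a.view()
a_view[0] = 100
print(a[0])
# 100

# view() on a fancy-indexing result shares memory only with the temporary copy.
a_fancy_view = a[[1, 2, 3]].view()
a_fancy_view[0] = -1
print(a[1])
# 1
```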

Check if an `ndarray` is a view: `base`

To determine whether an ndarray is a view, check its base attribute.

numpy.ndarray.base — NumPy v1.26 Manual

If the ndarray is a view, the base attribute points to the original ndarray.

Consider slices and reshape() as examples. reshape() returns a view whenever possible.

NumPy: reshape() to change the shape of an array

a = np.arange(10)
print(a)
# [0 1 2 3 4 5 6 7 8 9]

a_0 = a[:6]
print(a_0)
# [0 1 2 3 4 5]

print(a_0.base)
# [0 1 2 3 4 5 6 7 8 9]

a_1 = a_0.reshape(2, 3)
print(a_1)
# [[0 1 2]
#  [3 4 5]]

print(a_1.base)
# [0 1 2 3 4 5 6 7 8 9]

source: numpy_ndarray_base.py

Newly created arrays or copies have None as their base attribute.

a = np.arange(10)
print(a)
# [0 1 2 3 4 5 6 7 8 9]

print(a.base)
# None

a_copy = a.copy()
print(a_copy)
# [0 1 2 3 4 5 6 7 8 9]

print(a_copy.base)
# None

source: numpy_ndarray_base.py

If the base attribute is not None, the array can be identified as a view. Use the is operator to compare it with None.

None in Python

print(a_0.base is None)
# False

print(a_copy.base is None)
# True

print(a.base is None)
# True

source: numpy_ndarray_base.py
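One consequence that can be surprising: an array built in a single expression may already be a view. In the sketch below (an illustrative example, not from the original script), np.arange(6).reshape(2, 3) has a non-None base because reshape() returns a view of the temporary one-dimensional array created by np.arange().

```python
import numpy as np

a_2d = np.arange(6).reshape(2, 3)

# reshape() returned a view, so base is the 1-D array made by np.arange().
print(a_2d.base is None)
# False
print(a_2d.base.shape)
# (6,)

# copy() produces an array that owns its data.
print(a_2d.copy().base is None)
# True
```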

By comparing the base attribute with the original ndarray or the base of another view, you can also verify that memory is shared.

print(a_0.base is a)
# True

print(a_0.base is a_1.base)
# True

source: numpy_ndarray_base.py

However, np.shares_memory(), explained next, is a more convenient way to determine whether memory is shared.

Check if memory is shared: `np.shares_memory()`

The np.shares_memory() function determines if two arrays share memory.

numpy.shares_memory — NumPy v1.26 Manual

Basic usage

np.shares_memory() returns True if two specified arrays share memory.

a = np.arange(6)
print(a)
# [0 1 2 3 4 5]

a_reshape = a.reshape(2, 3)
print(a_reshape)
# [[0 1 2]
#  [3 4 5]]

print(np.shares_memory(a, a_reshape))
# True

source: numpy_shares_memory.py

It also returns True for views generated from the same ndarray.

a_slice = a[2:5]
print(a_slice)
# [2 3 4]

print(np.shares_memory(a_reshape, a_slice))
# True

source: numpy_shares_memory.py

In the case of copies, False is returned.

a_reshape_copy = a.reshape(2, 3).copy()
print(a_reshape_copy)
# [[0 1 2]
#  [3 4 5]]

print(np.shares_memory(a, a_reshape_copy))
# False

source: numpy_shares_memory.py

`np.may_share_memory()`

There is also the np.may_share_memory() function.

As "may" in the function name suggests, np.may_share_memory() is not as strict as np.shares_memory().

np.may_share_memory() checks only whether the memory bounds of the two arrays overlap, not whether any elements actually occupy the same memory.

For example, in the following case, the two slices are views of the same ndarray and span overlapping address ranges, but no element of one occupies the same memory as an element of the other.

a = np.arange(10)
print(a)
# [0 1 2 3 4 5 6 7 8 9]

a_0 = a[::2]
print(a_0)
# [0 2 4 6 8]

a_1 = a[1::2]
print(a_1)
# [1 3 5 7 9]

source: numpy_shares_memory.py

np.shares_memory() returns False because it performs an exact check, whereas np.may_share_memory() returns True because the bounds overlap.

print(np.shares_memory(a_0, a_1))
# False

print(np.may_share_memory(a_0, a_1))
# True

source: numpy_shares_memory.py

In the following example, since the two slices do not overlap in the range of the original ndarray, np.may_share_memory() also returns False.

a_2 = a[:5]
print(a_2)
# [0 1 2 3 4]

a_3 = a[5:]
print(a_3)
# [5 6 7 8 9]

print(np.shares_memory(a_2, a_3))
# False

print(np.may_share_memory(a_2, a_3))
# False

source: numpy_shares_memory.py
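To make "overlapping bounds" concrete, the sketch below computes the address range each slice spans and tests whether the ranges intersect. byte_range() and ranges_overlap() are hypothetical helpers written for this illustration; they handle only non-empty one-dimensional arrays with a positive stride and are not how NumPy implements the check internally.

```python
import numpy as np


def byte_range(arr):
    """Return (start, end) addresses spanned by a non-empty 1-D array
    with a positive stride. Hypothetical helper for illustration only."""
    start = arr.__array_interface__["data"][0]  # address of the first element
    end = start + (arr.size - 1) * arr.strides[0] + arr.itemsize
    return start, end


def ranges_overlap(x, y):
    """Return True if the half-open address ranges of x and y intersect."""
    x_start, x_end = byte_range(x)
    y_start, y_end = byte_range(y)
    return x_start < y_end and y_start < x_end


a = np.arange(10)

# Interleaved slices: ranges overlap even though no element is shared.
a_0, a_1 = a[::2], a[1::2]
print(ranges_overlap(a_0, a_1), np.may_share_memory(a_0, a_1))
# True True

# Adjacent halves: the ranges touch but do not overlap.
a_2, a_3 = a[:5], a[5:]
print(ranges_overlap(a_2, a_3), np.may_share_memory(a_2, a_3))
# False False
```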

np.shares_memory() takes longer because of its exact analysis. The following code uses the Jupyter Notebook magic command %%timeit; note that it does not work when run as a plain Python script.

Measure execution time with timeit in Python

%%timeit
np.shares_memory(a_0, a_1)
# 200 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%%timeit
np.may_share_memory(a_0, a_1)
# 123 ns ± 0.284 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

source: numpy_shares_memory.py

Although the difference is negligible in the example above, the official documentation warns that np.shares_memory() can be exponentially slow for some inputs.

Warning
This function can be exponentially slow for some inputs, unless max_work is set to a finite number or MAY_SHARE_BOUNDS. If in doubt, use numpy.may_share_memory instead.

numpy.shares_memory — NumPy v1.26 Manual

np.may_share_memory() might return True erroneously when elements do not actually share memory. However, it will never mistakenly return False when the memory is indeed shared. Thus, if you just need to check if memory could potentially be shared, np.may_share_memory() is an appropriate choice.
