NumPy: Flatten an array with ravel() and flatten()

Tags: Python, NumPy

In NumPy, to flatten an array (ndarray), use the np.ravel() function, or the ravel() and flatten() methods of ndarray.

For flattening a multi-dimensional list (Python's built-in list type), refer to the following article.

The NumPy version used in this article is as follows. Note that functionality may vary between versions.

import numpy as np

print(np.__version__)
# 1.26.1

Flatten a NumPy array with the np.ravel() function

Specifying an ndarray as the first argument to np.ravel() returns a flattened ndarray.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.ravel(a))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(type(np.ravel(a)))
# <class 'numpy.ndarray'>

The argument can be any array-like object, including Python's built-in list type. The return value is always an ndarray.

print(np.ravel([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(type(np.ravel([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])))
# <class 'numpy.ndarray'>
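As a small supplementary sketch (not part of the original examples), other array-like inputs such as nested tuples and plain scalars appear to be handled the same way. The variable names here are illustrative only.

```python
import numpy as np

# Nested tuples are array-like, just like nested lists.
from_tuple = np.ravel(((0, 1), (2, 3)))
print(from_tuple)
# [0 1 2 3]

# A scalar becomes a one-element, one-dimensional ndarray.
from_scalar = np.ravel(5)
print(from_scalar)
# [5]

print(from_scalar.shape)
# (1,)
```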

Starting with NumPy 1.24, passing a nested list whose inner lists have different lengths raises a ValueError, whereas earlier versions returned an ndarray of list objects.

# print(np.ravel([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]))
# ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.

To flatten such a list, see the following article.
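One possible approach, shown here only as a minimal sketch and not necessarily the method described in the linked article, is to flatten the ragged list in pure Python with itertools.chain.from_iterable() and convert the result to an ndarray afterwards. The names ragged and flat are illustrative.

```python
import itertools

import numpy as np

ragged = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Chain the inner lists together first, then build a regular 1D ndarray.
flat = np.array(list(itertools.chain.from_iterable(ragged)))
print(flat)
# [0 1 2 3 4 5 6 7 8 9]
```

This only removes one level of nesting, which is sufficient for a two-dimensional list like the one above.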

Flatten a NumPy array with the ndarray.ravel() method

ravel() is also provided as a method of ndarray.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a.ravel())
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

Flatten a NumPy array with the ndarray.flatten() method

flatten() is another method available for ndarray.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a.flatten())
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

While ravel() returns a view whenever possible, flatten() always returns a copy. Since flatten() allocates new memory, it is slower than ravel(). More details are provided below.

As of version 1.26, flatten() is only available as a method of ndarray, and there is no function like np.flatten().
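The absence of a function form can be checked directly. The sketch below also shows one way to use flatten() on a list, assuming it is acceptable to convert the list with np.asarray() first. The names l_2d and flat are illustrative.

```python
import numpy as np

# There is no module-level flatten function.
print(hasattr(np, 'flatten'))
# False

# To use flatten() on a list, convert it to an ndarray first.
l_2d = [[0, 1, 2], [3, 4, 5]]
flat = np.asarray(l_2d).flatten()
print(flat)
# [0 1 2 3 4 5]
```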

Flatten a NumPy array with reshape(-1)

You can also flatten a NumPy array using the reshape() method or function. Passing -1 as the new shape lets NumPy infer the length from the number of elements, so reshape(-1) yields a one-dimensional array.

reshape() is provided as a method of ndarray.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a.reshape(-1))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]
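To illustrate how -1 is inferred, the following sketch compares a few shapes. Note that only reshape(-1) produces a one-dimensional result, while reshape(1, -1) keeps two dimensions.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# -1 alone: the length is inferred as the total number of elements.
print(a.reshape(-1).shape)
# (12,)

# -1 combined with another dimension: 12 / 2 = 6 is inferred.
print(a.reshape(2, -1).shape)
# (2, 6)

# This is a 2D array with a single row, not a flattened array.
print(a.reshape(1, -1).shape)
# (1, 12)
```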

The np.reshape() function is also provided. Like np.ravel(), np.reshape() accepts array-like objects such as lists.

print(np.reshape(a, -1))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(np.reshape([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], -1))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

Difference between ravel() and flatten()

ravel() and reshape() return a view whenever possible, while flatten() always returns a copy.

For more details on views and copies in NumPy, refer to the following article.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.shares_memory(a, a.ravel()))
# True

print(np.shares_memory(a, np.ravel(a)))
# True

print(np.shares_memory(a, a.flatten()))
# False

print(np.shares_memory(a, a.reshape(-1)))
# True

print(np.shares_memory(a, np.reshape(a, -1)))
# True

In the case of a view, the original ndarray shares memory with the view, so changing the value of one affects the other.

a_ravel = a.ravel()
print(a_ravel)
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

a_ravel[0] = 100
print(a_ravel)
# [100   1   2   3   4   5   6   7   8   9  10  11]

print(a)
# [[100   1   2   3]
#  [  4   5   6   7]
#  [  8   9  10  11]]

In the case of a copy, each array has its own allocated memory, so changing the value of one does not affect the other.
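The copy case can be confirmed in the same way as the view case. This sketch mirrors the ravel() example with flatten(), using a fresh array and the illustrative name a_flatten.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

a_flatten = a.flatten()
a_flatten[0] = 100
print(a_flatten)
# [100   1   2   3   4   5   6   7   8   9  10  11]

# The original array is unchanged because flatten() returned a copy.
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
```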

ravel() and reshape() return views whenever possible, but in some cases they return copies. For example, when a slice with a step produces an irregular memory stride in the flattened result, a copy is returned instead of a view.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a[:, ::3])
# [[ 0  3]
#  [ 4  7]
#  [ 8 11]]

print(np.shares_memory(a[:, ::3], np.ravel(a[:, ::3])))
# False

print(np.shares_memory(a[:, ::3], np.reshape(a[:, ::3], -1)))
# False

ravel() and reshape() do not always behave the same way. For a slice with a step whose flattened elements are still evenly spaced in memory (a constant stride), ravel() returns a copy, while reshape() returns a view.

print(a[:, ::2])
# [[ 0  2]
#  [ 4  6]
#  [ 8 10]]

print(np.shares_memory(a[:, ::2], np.ravel(a[:, ::2])))
# False

print(np.shares_memory(a[:, ::2], np.reshape(a[:, ::2], -1)))
# True
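The difference between the two slices can be seen in their strides. The sketch below is one way to reason about it, not a description of NumPy's internal implementation: a flat view is only possible if stepping to the next row equals the number of columns times the column stride. The dtype is fixed to int64 here so the byte counts do not depend on the platform, and the names results and uniform are illustrative.

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)

results = []
for step in (2, 3):
    s = a[:, ::step]
    row_stride, col_stride = s.strides
    # Elements are evenly spaced in memory only if moving to the next row
    # equals (number of columns) * (column stride) bytes.
    uniform = row_stride == s.shape[1] * col_stride
    is_view = np.shares_memory(s, s.reshape(-1))
    results.append((step, s.strides, uniform, is_view))
    print(step, s.strides, uniform, is_view)
# 2 (32, 16) True True
# 3 (32, 24) False False
```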

The official documentation for np.ravel() makes the following recommendation.

When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.
numpy.ravel — NumPy v1.26 Manual

Speed comparison between ravel() and flatten()

Since flatten() allocates new memory, it is slower than ravel().

The following examples use the Jupyter Notebook magic command %%timeit. Note that these will not work if run as Python scripts.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

%%timeit
a.ravel()
# 43.6 ns ± 0.298 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%%timeit
a.flatten()
# 249 ns ± 0.971 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%%timeit
a.reshape(-1)
# 80.2 ns ± 0.145 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
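Outside Jupyter Notebook, a similar measurement can be sketched with the standard library's timeit module. The call count n is an arbitrary choice, the name timings is illustrative, and the lambda adds a small call overhead of its own, so the numbers are only roughly comparable to the %%timeit results above. No expected output is shown because timings depend on the environment.

```python
import timeit

import numpy as np

a = np.arange(12).reshape(3, 4)
n = 100_000  # number of calls per measurement (arbitrary choice)

# timeit.timeit() returns the total time in seconds for n calls.
timings = {
    'ravel': timeit.timeit(a.ravel, number=n),
    'flatten': timeit.timeit(a.flatten, number=n),
    'reshape(-1)': timeit.timeit(lambda: a.reshape(-1), number=n),
}

for name, total in timings.items():
    print(f'{name}: {total / n * 1e9:.1f} ns per call')
```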

For a small ndarray like the one in the above example, the difference may not be significant, but for larger arrays, flatten() can be significantly slower. Memory usage is also higher for flatten() because it always allocates a new array.

If a view is sufficient, using ravel() is recommended.

a_large = np.arange(1000000).reshape(100, 100, 100)

%%timeit
a_large.ravel()
# 43.6 ns ± 0.118 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%%timeit
a_large.flatten()
# 423 µs ± 25.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit
a_large.reshape(-1)
# 80 ns ± 0.0587 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Note that ravel() also allocates new memory when it has to return a copy, so in that case its speed and memory usage are comparable to those of flatten().
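As a minimal sketch of such a case, the transposed array below is not C-contiguous, so the default ravel() cannot return a view and allocates a new array of the same size. The names a_t and r are illustrative.

```python
import numpy as np

a_large = np.arange(1000000).reshape(100, 100, 100)
a_t = a_large.T  # transposed view, no longer C-contiguous

print(a_t.flags.c_contiguous)
# False

r = a_t.ravel()

# ravel() had to copy, so the result does not share memory with a_t.
print(np.shares_memory(a_t, r))
# False

# The copy occupies as many bytes as the source data.
print(r.nbytes == a_t.nbytes)
# True
```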

The order argument

The order argument can be specified for ravel(), flatten(), and reshape().

The default is order='C', which flattens in C-like row-major order, but specifying order='F' results in Fortran-like column-major order.

a = np.arange(12).reshape(3, 4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a.ravel())
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(a.ravel('F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(np.ravel(a, 'F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(a.flatten('F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(a.reshape(-1, order='F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(np.reshape(a, -1, order='F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

order can be 'C', 'F', 'A' for reshape(), and 'C', 'F', 'A', 'K' for ravel() and flatten().

See the official documentation for details on each.

Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used.
numpy.ravel — NumPy v1.26 Manual

The differences by order are shown below, along with ndarray information available through np.info().

For example, if fortran is True, the results of 'A' and 'F' are equal, and if fortran is False, the results of 'A' and 'C' are equal.

np.info(a)
# class:  ndarray
# shape:  (3, 4)
# strides:  (32, 8)
# itemsize:  8
# aligned:  True
# contiguous:  True
# fortran:  False
# data pointer: 0x6000004bc000
# byteorder:  little
# byteswap:  False
# type: int64

print(a.ravel('C'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(a.ravel('F'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(a.ravel('A'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(a.ravel('K'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

When transposed with the T attribute:

print(a.T)
# [[ 0  4  8]
#  [ 1  5  9]
#  [ 2  6 10]
#  [ 3  7 11]]

np.info(a.T)
# class:  ndarray
# shape:  (4, 3)
# strides:  (8, 32)
# itemsize:  8
# aligned:  True
# contiguous:  False
# fortran:  True
# data pointer: 0x6000004bc000
# byteorder:  little
# byteswap:  False
# type: int64

print(a.T.ravel('C'))
# [ 0  4  8  1  5  9  2  6 10  3  7 11]

print(a.T.ravel('F'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(a.T.ravel('A'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

print(a.T.ravel('K'))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

Slice with negative step:

print(a.T[::-1])
# [[ 3  7 11]
#  [ 2  6 10]
#  [ 1  5  9]
#  [ 0  4  8]]

np.info(a.T[::-1])
# class:  ndarray
# shape:  (4, 3)
# strides:  (-8, 32)
# itemsize:  8
# aligned:  True
# contiguous:  False
# fortran:  False
# data pointer: 0x6000004bc018
# byteorder:  little
# byteswap:  False
# type: int64

print(a.T[::-1].ravel('C'))
# [ 3  7 11  2  6 10  1  5  9  0  4  8]

print(a.T[::-1].ravel('F'))
# [ 3  2  1  0  7  6  5  4 11 10  9  8]

print(a.T[::-1].ravel('A'))
# [ 3  7 11  2  6 10  1  5  9  0  4  8]

print(a.T[::-1].ravel('K'))
# [ 3  2  1  0  7  6  5  4 11 10  9  8]
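The rule for 'A' can also be checked without np.info(), using the flags attribute. In this sketch, np.asfortranarray() is used to create an array with the same values and shape as a but a column-major memory layout. The name a_f is illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
a_f = np.asfortranarray(a)  # same values and shape, column-major memory

# Fortran-contiguous: 'A' behaves like 'F'.
print(a_f.flags.f_contiguous)
# True
print(np.array_equal(a_f.ravel('A'), a_f.ravel('F')))
# True

# Not Fortran-contiguous: 'A' behaves like 'C'.
print(a.flags.f_contiguous)
# False
print(np.array_equal(a.ravel('A'), a.ravel('C')))
# True
```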

For 3D or higher-dimensional arrays

The examples so far are for 2D arrays, but flattening can be done in the same way for 3D or higher-dimensional arrays.

a_3d = np.arange(24).reshape(2, 3, 4)
print(a_3d)
# [[[ 0  1  2  3]
#   [ 4  5  6  7]
#   [ 8  9 10 11]]
# 
#  [[12 13 14 15]
#   [16 17 18 19]
#   [20 21 22 23]]]

print(a_3d.ravel())
# [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

print(a_3d.ravel('F'))
# [ 0 12  4 16  8 20  1 13  5 17  9 21  2 14  6 18 10 22  3 15  7 19 11 23]

As shown in the example above, the effect of the order argument can be complicated for multi-dimensional arrays, so it is recommended to test with simple examples before using it.
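One way to make sense of order='F' in higher dimensions is that the first index changes fastest and the last index changes slowest. The following sketch reproduces ravel('F') with explicit loops and also shows that it matches a C-order ravel() of the transposed array. The name manual is illustrative.

```python
import numpy as np

a_3d = np.arange(24).reshape(2, 3, 4)
i_max, j_max, k_max = a_3d.shape

# In a comprehension the last "for" is the innermost loop,
# so i (the first axis) changes fastest and k (the last axis) slowest.
manual = [a_3d[i, j, k]
          for k in range(k_max)
          for j in range(j_max)
          for i in range(i_max)]

print(np.array_equal(a_3d.ravel('F'), manual))
# True

# Equivalent view of the same rule: reverse the axes, then flatten in C order.
print(np.array_equal(a_3d.ravel('F'), a_3d.T.ravel()))
# True
```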
