NumPy: Save and load arrays in npy and npz files

Tags: Python, NumPy

In NumPy, arrays can be saved as npy and npz files. These are NumPy-specific binary formats that store the data type (dtype) and shape along with the data, so a loaded array matches the one that was saved.

To read and write text files such as CSV (comma-separated values) or TSV (tab-separated values), use functions such as np.loadtxt() and np.savetxt() instead.

The NumPy version used in this article is as follows. Note that functionality may vary between versions.

import numpy as np

print(np.__version__)
# 1.26.1

Pros and cons of npy and npz files

Pros

npy and npz files preserve information such as data type (dtype) and shape.

For example, arrays with three or more dimensions cannot be saved directly to a CSV file without first being reshaped to two dimensions or fewer. In contrast, npy and npz files store such arrays as is, keeping both their shape and their exact values, with no rounding caused by text formatting.
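The difference can be checked with a small sketch. The temporary directory and file names below are assumptions made for illustration, not part of the article's examples.

```python
import tempfile
from pathlib import Path

import numpy as np

# A 3D float array whose values are not exactly representable in short decimals.
a = np.arange(24, dtype=np.float32).reshape(2, 3, 4) / 7

with tempfile.TemporaryDirectory() as tmp:
    # np.savetxt() accepts only 1D or 2D arrays, so the 3D array is rejected.
    try:
        np.savetxt(Path(tmp) / 'a.csv', a, delimiter=',')
        csv_ok = True
    except ValueError:
        csv_ok = False

    # np.save() stores the same array as is.
    np.save(Path(tmp) / 'a.npy', a)
    a_load = np.load(Path(tmp) / 'a.npy')

print(csv_ok)
# False

print(a_load.shape, a_load.dtype)
# (2, 3, 4) float32

print(np.array_equal(a, a_load))
# True
```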

Cons

While the npy and npz formats are publicly documented, their use is primarily limited to NumPy.

Unlike CSV files, these files are binary, so they cannot be opened in a text editor or spreadsheet application to quickly review or edit their contents.
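As a rough illustration of the binary layout, the sketch below reads the first bytes of an npy file directly. The file name is hypothetical; the six-byte magic string and the two version bytes that follow it are part of the documented npy format.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'a.npy'
    np.save(path, np.arange(3))

    # Read the raw bytes instead of going through np.load().
    raw = path.read_bytes()

magic = raw[:6]             # magic string that identifies an npy file
major, minor = raw[6], raw[7]  # format version stored right after it

print(magic)
# b'\x93NUMPY'
```

The rest of the file is a short text header describing the dtype and shape, followed by the raw array data, which is why a text editor shows mostly unreadable content.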

Load npy and npz files: np.load()

To load npy and npz files, use np.load().

Normally, you only need to specify the file path as an argument. The return value depends on the file type: an npy file (storing a single array) is returned as an ndarray, while an npz file (storing multiple arrays) is returned as an NpzFile object.

The specific usage for each case will be explained together with np.save(), np.savez(), and np.savez_compressed() in the sections below.
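As a quick preview, here is a minimal sketch of the two cases. The file names and array names (x, y) are placeholders chosen for this example. An NpzFile can also be used as a context manager so that the underlying file is closed when the block ends.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / 'single.npy', np.arange(3))
    np.savez(Path(tmp) / 'multi.npz', x=np.arange(3), y=np.ones(2))

    # An npy file is returned directly as an ndarray.
    single = np.load(Path(tmp) / 'single.npy')
    is_ndarray = isinstance(single, np.ndarray)

    # An npz file is returned as an NpzFile; arrays are read when accessed by name.
    with np.load(Path(tmp) / 'multi.npz') as multi:
        names = list(multi.files)
        x = multi['x']

print(is_ndarray)
# True

print(names)
# ['x', 'y']

print(x)
# [0 1 2]
```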

Save an array to an npy file: np.save()

np.save() saves a single array to an npy file.

Consider the following ndarray as an example.

a = np.arange(6, dtype=np.int8).reshape(1, 2, 3)
print(a)
# [[[0 1 2]
#   [3 4 5]]]

print(a.shape)
# (1, 2, 3)

print(a.dtype)
# int8

Specify the file path, either as a string or as a pathlib.Path object, as the first argument and the ndarray to be saved as the second.

np.save('data/temp/np_save', a)

The file is saved under the specified path with a .npy extension. If the path already ends with .npy, it remains unchanged.
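The extension handling can be confirmed with a short sketch. Note that np.save() does not create missing directories, so the example creates the output directory first. The directory and file names here are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

a = np.arange(6, dtype=np.int8).reshape(1, 2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Create the target directory beforehand; saving into a missing one fails.
    out_dir = Path(tmp) / 'data' / 'temp'
    out_dir.mkdir(parents=True, exist_ok=True)

    np.save(out_dir / 'no_ext', a)        # '.npy' is appended
    np.save(out_dir / 'with_ext.npy', a)  # path is used as given

    names = sorted(p.name for p in out_dir.iterdir())

print(names)
# ['no_ext.npy', 'with_ext.npy']
```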

Loading an npy file with np.load() returns the saved array as an ndarray, preserving its original data type and shape.

a_load = np.load('data/temp/np_save.npy')
print(a_load)
# [[[0 1 2]
#   [3 4 5]]]

print(a_load.shape)
# (1, 2, 3)

print(a_load.dtype)
# int8

Save multiple arrays to an npz file: np.savez()

np.savez() saves multiple arrays into a single npz file, preserving the data type and shape, similar to npy.

Consider the following two arrays as an example.

a1 = np.arange(5)
print(a1)
# [0 1 2 3 4]

a2 = np.arange(5, 10)
print(a2)
# [5 6 7 8 9]

Specify the file path as a string or pathlib.Path object, followed by arrays to save, separated by commas.

While this example demonstrates saving two arrays, you can specify three or more.

np.savez('data/temp/np_savez', a1, a2)

The file is saved under the specified path with a .npz extension. If the path already ends with .npz, it remains unchanged.

Like npy files, npz files are also loaded using np.load(), but it returns an NpzFile object.

npz = np.load('data/temp/np_savez.npz')
print(type(npz))
# <class 'numpy.lib.npyio.NpzFile'>

Access the stored arrays by specifying their names within []. The names of each array can be checked using the files attribute.

print(npz.files)
# ['arr_0', 'arr_1']

print(npz['arr_0'])
# [0 1 2 3 4]

print(npz['arr_1'])
# [5 6 7 8 9]

By default, the names arr_0, arr_1, ..., are assigned in the order of the arrays specified during saving.
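If all arrays are needed at once, one option is to copy them into a regular dictionary, as in the sketch below. The file name is hypothetical. Because the arrays are read inside the with block, they remain usable after the NpzFile has been closed.

```python
import tempfile
from pathlib import Path

import numpy as np

a1 = np.arange(5)
a2 = np.arange(5, 10)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'np_savez.npz'
    np.savez(path, a1, a2)

    # Read every stored array into an ordinary dict, then close the file.
    with np.load(path) as npz:
        arrays = {name: npz[name] for name in npz.files}

print(sorted(arrays))
# ['arr_0', 'arr_1']

print(arrays['arr_1'])
# [5 6 7 8 9]
```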

Using keyword arguments with np.savez() allows assigning custom names to arrays.

np.savez('data/temp/np_savez_kw', x=a1, y=a2)

npz_kw = np.load('data/temp/np_savez_kw.npz')
print(npz_kw.files)
# ['x', 'y']

print(npz_kw['x'])
# [0 1 2 3 4]

print(npz_kw['y'])
# [5 6 7 8 9]
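Since the names are ordinary keyword arguments, a dictionary of arrays can be saved by unpacking it with **. The following is a minimal sketch with a hypothetical file name and dictionary. It assumes the keys are strings that do not clash with np.savez()'s own parameter names, such as file.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical dictionary mapping names to arrays.
arrays = {'x': np.arange(5), 'y': np.arange(5, 10)}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'np_savez_dict.npz'

    # Each key becomes the name of the corresponding array in the npz file.
    np.savez(path, **arrays)

    with np.load(path) as npz:
        names = sorted(npz.files)
        y = npz['y']

print(names)
# ['x', 'y']

print(y)
# [5 6 7 8 9]
```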

Though it might be less common, it is also possible to name only some of the arrays using keyword arguments.

np.savez('data/temp/np_savez_kw2', a1, y=a2)

npz_kw2 = np.load('data/temp/np_savez_kw2.npz')
print(npz_kw2.files)
# ['y', 'arr_0']

print(npz_kw2['arr_0'])
# [0 1 2 3 4]

print(npz_kw2['y'])
# [5 6 7 8 9]

Save multiple arrays to a compressed npz file: np.savez_compressed()

np.savez_compressed() works like np.savez() but compresses arrays to reduce file size.

Files saved with np.savez_compressed() have the same .npz extension as those saved with np.savez(), and they are loaded with np.load() in the same way.

np.savez_compressed('data/temp/np_savez_comp', a1, a2)

npz_comp = np.load('data/temp/np_savez_comp.npz')
print(type(npz_comp))
# <class 'numpy.lib.npyio.NpzFile'>

print(npz_comp.files)
# ['arr_0', 'arr_1']

print(npz_comp['arr_0'])
# [0 1 2 3 4]

print(npz_comp['arr_1'])
# [5 6 7 8 9]

Keyword arguments are also supported.

np.savez_compressed('data/temp/np_savez_comp_kw', x=a1, y=a2)

npz_comp_kw = np.load('data/temp/np_savez_comp_kw.npz')
print(npz_comp_kw.files)
# ['x', 'y']

print(npz_comp_kw['x'])
# [0 1 2 3 4]

print(npz_comp_kw['y'])
# [5 6 7 8 9]
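To get a feel for the effect of compression, the sketch below saves the same array with both functions and compares the file sizes. The array and file names are assumptions for illustration. An array of zeros is highly compressible, so the size reduction here is much larger than what typical data would show.

```python
import tempfile
from pathlib import Path

import numpy as np

# A highly compressible array: one million float64 zeros (about 8 MB of data).
a = np.zeros((1000, 1000))

with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / 'plain.npz'
    comp = Path(tmp) / 'comp.npz'

    np.savez(plain, a=a)
    np.savez_compressed(comp, a=a)

    size_plain = plain.stat().st_size
    size_comp = comp.stat().st_size

    # The compressed file still restores the original array exactly.
    with np.load(comp) as npz:
        restored = npz['a']

print(size_comp < size_plain)
# True

print(np.array_equal(a, restored))
# True
```

Compression takes extra time when saving and loading, so the uncompressed np.savez() may be preferable when speed matters more than file size.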
