NumPy: Save and load arrays in npy and npz files
In NumPy, arrays can be saved as npy and npz files, which are NumPy-specific binary formats preserving essential information like data type (dtype) and shape during both saving and loading processes.
This article does not cover text files such as CSV (comma-separated values) or TSV (tab-separated values); for those, NumPy provides np.loadtxt() and np.savetxt().
The NumPy version used in this article is as follows. Note that functionality may vary between versions.
import numpy as np
print(np.__version__)
# 1.26.1
Pros and cons of npy and npz files
Pros
npy and npz files preserve information such as data type (dtype) and shape.
For example, arrays with three or more dimensions cannot be saved directly to a CSV file without first being reshaped. In contrast, npy and npz files save such arrays as is, preserving their structure and storing values at full precision rather than as rounded decimal text.
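As a minimal sketch of this difference, the following compares np.savetxt() and np.save() on a 3D array. The file names and the temporary directory are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

# A 3D float array; the values are arbitrary illustration data.
a = np.linspace(0, 1, 24, dtype=np.float64).reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp:
    # np.savetxt() accepts only 1D or 2D arrays, so a 3D array raises ValueError.
    try:
        np.savetxt(Path(tmp) / 'a.csv', a, delimiter=',')
        csv_saved = True
    except ValueError:
        csv_saved = False

    # np.save() stores the array as is, including shape and dtype.
    np.save(Path(tmp) / 'a.npy', a)
    a_load = np.load(Path(tmp) / 'a.npy')

print(csv_saved)
# False
print(a_load.shape, a_load.dtype)
# (2, 3, 4) float64
print(np.array_equal(a, a_load))
# True
```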
Cons
While the npy and npz formats are publicly documented, their use is primarily limited to NumPy.
Unlike CSV files, npy and npz files cannot be opened in a text editor or spreadsheet application to quickly review or edit their contents.
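To illustrate that an npy file is binary rather than text, the sketch below reads the first six bytes of a saved file, which hold the magic string defined by the npy format. The file name is hypothetical.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'a.npy')  # hypothetical file name
    np.save(path, np.arange(3))

    # Read the raw bytes instead of going through np.load().
    with open(path, 'rb') as f:
        magic = f.read(6)

print(magic)
# b'\x93NUMPY'
```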
Load npy and npz files: np.load()
To load npy and npz files, use np.load().
Normally, you only need to specify the file path as an argument, but the return value differs: an npy file (storing a single array) is returned as an ndarray, while an npz file (storing multiple arrays) is returned as an NpzFile object.
The specific usage for each case will be explained together with np.save(), np.savez(), and np.savez_compressed() in the sections below.
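As a quick preview, the following sketch saves the same array in both formats and checks what np.load() returns for each. The file names are hypothetical. Because an NpzFile keeps the underlying archive open and reads arrays on access, a with statement is used here to close it when done.

```python
import tempfile
from pathlib import Path

import numpy as np

arr = np.arange(3)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = Path(tmp) / 'single.npy'  # hypothetical file names
    npz_path = Path(tmp) / 'multi.npz'
    np.save(npy_path, arr)
    np.savez(npz_path, arr)

    # npy: the array itself is returned.
    loaded_npy = np.load(npy_path)
    is_ndarray = isinstance(loaded_npy, np.ndarray)

    # npz: a dict-like NpzFile is returned; arrays are read when accessed.
    with np.load(npz_path) as loaded_npz:
        is_ndarray_npz = isinstance(loaded_npz, np.ndarray)
        names = list(loaded_npz.files)
        first = loaded_npz['arr_0']

print(is_ndarray, is_ndarray_npz)
# True False
print(names)
# ['arr_0']
print(first)
# [0 1 2]
```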
Save an array to an npy file: np.save()
np.save() saves a single array to an npy file.
Consider the following ndarray as an example.
a = np.arange(6, dtype=np.int8).reshape(1, 2, 3)
print(a)
# [[[0 1 2]
#   [3 4 5]]]
print(a.shape)
# (1, 2, 3)
print(a.dtype)
# int8
Specify the file path, either as a string or as a pathlib.Path object, as the first argument and the ndarray to be saved as the second.
np.save('data/temp/np_save', a)
The file is saved under the specified path with a .npy extension. If the path already ends with .npy, it remains unchanged.
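The extension handling can be checked with a small sketch like the following, which saves to a temporary directory and lists the resulting file names. The names no_ext, with_ext.npy, and other.dat are hypothetical. Note that a path ending in a different extension still gets .npy appended.

```python
import os
import tempfile

import numpy as np

a = np.arange(3)

with tempfile.TemporaryDirectory() as tmp:
    np.save(os.path.join(tmp, 'no_ext'), a)        # .npy is appended
    np.save(os.path.join(tmp, 'with_ext.npy'), a)  # left unchanged
    np.save(os.path.join(tmp, 'other.dat'), a)     # .npy is appended after .dat
    saved = sorted(os.listdir(tmp))

print(saved)
# ['no_ext.npy', 'other.dat.npy', 'with_ext.npy']
```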
Loading an npy file with np.load() returns the saved array as an ndarray, preserving its original data type and shape.
a_load = np.load('data/temp/np_save.npy')
print(a_load)
# [[[0 1 2]
#   [3 4 5]]]
print(a_load.shape)
# (1, 2, 3)
print(a_load.dtype)
# int8
Save multiple arrays to an npz file: np.savez()
np.savez() saves multiple arrays into a single npz file, preserving the data type and shape, similar to npy.
Consider the following two arrays as an example.
a1 = np.arange(5)
print(a1)
# [0 1 2 3 4]
a2 = np.arange(5, 10)
print(a2)
# [5 6 7 8 9]
Specify the file path as a string or pathlib.Path object, followed by arrays to save, separated by commas.
While this example demonstrates saving two arrays, you can specify three or more.
np.savez('data/temp/np_savez', a1, a2)
The file is saved under the specified path with a .npz extension. If the path already ends with .npz, it remains unchanged.
Like npy files, npz files are also loaded using np.load(), but it returns an NpzFile object.
npz = np.load('data/temp/np_savez.npz')
print(type(npz))
# <class 'numpy.lib.npyio.NpzFile'>
Access the stored arrays by specifying their names within []. The names of each array can be checked using the files attribute.
print(npz.files)
# ['arr_0', 'arr_1']
print(npz['arr_0'])
# [0 1 2 3 4]
print(npz['arr_1'])
# [5 6 7 8 9]
By default, the names arr_0, arr_1, ..., are assigned in the order of the arrays specified during saving.
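When the names are not known in advance, one possible approach is to loop over the files attribute and collect every array into a regular dictionary. The sketch below assumes a hypothetical file name and uses a with statement so that the NpzFile is closed after reading.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'arrays.npz')  # hypothetical file name
    np.savez(path, np.arange(5), np.arange(5, 10))

    # Read every stored array into memory, keyed by its name.
    with np.load(path) as npz:
        arrays = {name: npz[name] for name in npz.files}

for name, arr in arrays.items():
    print(name, arr)
# arr_0 [0 1 2 3 4]
# arr_1 [5 6 7 8 9]
```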
Using keyword arguments with np.savez() allows assigning custom names to arrays.
np.savez('data/temp/np_savez_kw', x=a1, y=a2)
npz_kw = np.load('data/temp/np_savez_kw.npz')
print(npz_kw.files)
# ['x', 'y']
print(npz_kw['x'])
# [0 1 2 3 4]
print(npz_kw['y'])
# [5 6 7 8 9]
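Because the names are ordinary keyword arguments, arrays held in a dictionary can be passed with ** unpacking. The following is a minimal sketch; the dictionary contents and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical dictionary mapping names to arrays.
data = {'x': np.arange(5), 'y': np.arange(5, 10)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'from_dict.npz')
    # Each dictionary key becomes the name of the stored array.
    np.savez(path, **data)

    with np.load(path) as npz:
        names = sorted(npz.files)
        y_load = npz['y']

print(names)
# ['x', 'y']
print(y_load)
# [5 6 7 8 9]
```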
Though it might be less common, it is also possible to name only some of the arrays using keyword arguments.
np.savez('data/temp/np_savez_kw2', a1, y=a2)
npz_kw2 = np.load('data/temp/np_savez_kw2.npz')
print(npz_kw2.files)
# ['y', 'arr_0']
print(npz_kw2['arr_0'])
# [0 1 2 3 4]
print(npz_kw2['y'])
# [5 6 7 8 9]
Save multiple arrays to a compressed npz file: np.savez_compressed()
np.savez_compressed() works like np.savez() but compresses arrays to reduce file size.
Files written by np.savez_compressed() use the same .npz extension as those written by np.savez() and can be loaded with np.load() in the same way.
np.savez_compressed('data/temp/np_savez_comp', a1, a2)
npz_comp = np.load('data/temp/np_savez_comp.npz')
print(type(npz_comp))
# <class 'numpy.lib.npyio.NpzFile'>
print(npz_comp.files)
# ['arr_0', 'arr_1']
print(npz_comp['arr_0'])
# [0 1 2 3 4]
print(npz_comp['arr_1'])
# [5 6 7 8 9]
Keyword arguments are also supported.
np.savez_compressed('data/temp/np_savez_comp_kw', x=a1, y=a2)
npz_comp_kw = np.load('data/temp/np_savez_comp_kw.npz')
print(npz_comp_kw.files)
# ['x', 'y']
print(npz_comp_kw['x'])
# [0 1 2 3 4]
print(npz_comp_kw['y'])
# [5 6 7 8 9]
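To get a rough sense of the size reduction, the sketch below saves the same array with both functions and compares the file sizes. An all-zeros array is assumed here because repetitive data compresses well; for other data, such as random values, the difference may be much smaller. The file names are hypothetical.

```python
import os
import tempfile

import numpy as np

# Highly repetitive data, chosen so that compression has a clear effect.
zeros = np.zeros(100_000, dtype=np.float64)

with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, 'plain.npz')
    comp = os.path.join(tmp, 'comp.npz')
    np.savez(plain, zeros)
    np.savez_compressed(comp, zeros)

    size_plain = os.path.getsize(plain)
    size_comp = os.path.getsize(comp)

    # The compressed file still restores the original array exactly.
    with np.load(comp) as npz:
        restored = npz['arr_0']

print(size_comp < size_plain)
# True
print(np.array_equal(zeros, restored))
# True
```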