Load, Parse, Serialize JSON Files and Strings in Python

Tags: Python, JSON

In Python, the json module allows you to parse JSON files or strings into Python objects, such as dictionaries, and save Python objects as JSON files or strings.

This article doesn't cover every argument of the functions in the json module. For a complete reference, see the official documentation of the json module.

The sample code in this article imports the json module. It is included in the standard library, so no additional installation is necessary.

import json

Parse JSON strings to Python objects: json.loads()

You can use json.loads() to convert JSON-formatted strings into Python objects, such as dictionaries.

When a string is passed as the first argument to json.loads(), it is converted to a Python object, like a dictionary.

s = '{"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null, NaN, Infinity]}'

d = json.loads(s)
print(d)
# {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None, nan, inf]}

print(type(d))
# <class 'dict'>

Here's the default correspondence between JSON and Python objects:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None

While NaN and Infinity are not part of the official JSON specification, Python's json module can still convert them into their corresponding float values, nan and inf.
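If you would rather reject these non-standard constants while parsing, one option is the parse_constant argument of json.loads(), which is called with the string 'NaN', 'Infinity', or '-Infinity' whenever one of them is encountered. The following is a minimal sketch; the helper name reject_constant is made up for this example.

```python
import json


def reject_constant(name):
    # Hypothetical helper: receives 'NaN', 'Infinity', or '-Infinity'.
    raise ValueError(f'Non-standard JSON constant: {name}')


s = '{"B": [true, false, null, NaN, Infinity]}'

try:
    json.loads(s, parse_constant=reject_constant)
    error_message = None
except ValueError as e:
    error_message = str(e)

print(error_message)
# Non-standard JSON constant: NaN

# Strictly valid JSON is parsed as usual.
strict = json.loads('{"B": [true, false, null]}', parse_constant=reject_constant)
print(strict)
# {'B': [True, False, None]}
```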

Keep in mind that true, false, null, NaN, and Infinity are case sensitive. Using the wrong case, such as True instead of true, raises an error.

s = '{"A": True}'

# d = json.loads(s)
# JSONDecodeError: Expecting value: line 1 column 7 (char 6)
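When the input comes from an untrusted or unknown source, you will probably want to catch this error rather than let it stop the program. The sketch below assumes that falling back to None is acceptable for your use case. It also reads the msg, lineno, colno, and pos attributes of json.JSONDecodeError to locate the problem.

```python
import json

s = '{"A": True}'

try:
    d = json.loads(s)
    error_info = None
except json.JSONDecodeError as e:
    # Keep the details, because e is not available after the except block.
    error_info = (e.msg, e.lineno, e.colno, e.pos)
    d = None  # assumed fallback value for this example

print(error_info)
# ('Expecting value', 1, 7, 6)

print(d)
# None
```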

Note that, starting from Python 3.6, the first argument to json.loads() can also be a bytes or bytearray object, not just a string (str). The input should be encoded in UTF-8, UTF-16, or UTF-32. An example with bytes appears in the section on decoding byte sequences at the end of this article.

Load JSON files as Python objects: json.load()

You can use json.load() to load JSON files into Python objects, such as dictionaries.

Pass a file object, obtained using the built-in open() function, as the first argument. The rest of the usage is similar to json.loads().

Consider a JSON file containing the same string as in the json.loads() example above.

with open('data/src/test.json') as f:
    print(f.read())
# {"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null, NaN, Infinity]}

You can read this file using json.load().

with open('data/src/test.json') as f:
    d = json.load(f)

print(d)
# {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None, nan, inf]}

print(type(d))
# <class 'dict'>
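The examples above call open() without an encoding argument, so they rely on the platform's default encoding. The sketch below pins the encoding to UTF-8 instead. So that it runs on its own, it writes a throwaway file in a temporary directory first; the file name and its contents are assumptions for this example only.

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.json')  # hypothetical file name

    # Prepare a small JSON file to read back.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('{"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null]}')

    # Pin the encoding instead of relying on the platform default.
    with open(path, encoding='utf-8') as f:
        d = json.load(f)

print(d)
# {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
```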

Dump Python objects to JSON strings: json.dumps()

You can use json.dumps() to convert Python objects, like dictionaries, into JSON-formatted strings.

When a dictionary is passed as the first argument to json.dumps(), it is converted into a JSON-formatted string.

d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}

s = json.dumps(d)
print(s)
# {"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null]}

print(type(s))
# <class 'str'>

Here's the default correspondence between Python objects and JSON:

Python                                    JSON
dict                                      object
list, tuple                               array
str                                       string
int, float, int- & float-derived Enums    number
True                                      true
False                                     false
None                                      null
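One consequence of this mapping is that a round trip through JSON does not always return the original object. Tuples come back as lists, and non-string dictionary keys such as integers come back as strings. A minimal sketch with made-up sample data:

```python
import json

d = {1: 'a', 'T': (1, 2)}

s = json.dumps(d)
print(s)
# {"1": "a", "T": [1, 2]}

restored = json.loads(s)
print(restored)
# {'1': 'a', 'T': [1, 2]}

print(restored == d)
# False
```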

By default, non-ASCII characters are output in Unicode escape format. You can prevent this by setting the ensure_ascii argument to False. More on this later.

Separators: separators

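By default, keys and values are separated by ': ' (colon and space), and elements are separated by ', ' (comma and space). When indent is specified, the default element separator becomes ',' without the trailing space. You can customize these with the separators argument, passed as an (item_separator, key_separator) tuple, to remove the spaces or use other characters.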

d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
print(json.dumps(d, separators=(',', ':')))
# {"A":{"X":1,"Y":1.0,"Z":"abc"},"B":[true,false,null]}

print(json.dumps(d, separators=(' / ', '->')))
# {"A"->{"X"->1 / "Y"->1.0 / "Z"->"abc"} / "B"->[true / false / null]}
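To illustrate the trade-off, the sketch below compares the default output with the compact form, then checks whether each variant can still be parsed by json.loads(). The compact form remains valid JSON, while arbitrary separators such as ' / ' and '->' do not.

```python
import json

d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}

default = json.dumps(d)
compact = json.dumps(d, separators=(',', ':'))

# The compact form is shorter but represents the same data.
print(len(compact) < len(default))
# True

print(json.loads(compact) == json.loads(default))
# True

# Custom separators produce text that is no longer valid JSON.
custom = json.dumps(d, separators=(' / ', '->'))
try:
    json.loads(custom)
    custom_is_valid = True
except json.JSONDecodeError:
    custom_is_valid = False

print(custom_is_valid)
# False
```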

Indent level: indent

Specifying the indent level with the indent argument makes each element have its own line, with indentation.

d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
print(json.dumps(d, indent=4))
# {
#     "A": {
#         "X": 1,
#         "Y": 1.0,
#         "Z": "abc"
#     },
#     "B": [
#         true,
#         false,
#         null
#     ]
# }

The default setting is indent=None, which doesn't include line breaks. When indent=0, there are line breaks but no indentation.
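The following short sketch shows indent=0 in practice. It also passes a string as indent, in which case that string is used to indent each level. The sample dictionary and the '--' marker are chosen only to make the output easy to read.

```python
import json

d = {'A': [1, 2]}

zero = json.dumps(d, indent=0)
print(zero)
# {
# "A": [
# 1,
# 2
# ]
# }

dashes = json.dumps(d, indent='--')
print(dashes)
# {
# --"A": [
# ----1,
# ----2
# --]
# }
```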

Sort by key: sort_keys

Setting the sort_keys argument to True sorts the elements of each dictionary by key. By default, keys are output in the dictionary's own order, without sorting.

d = {'B': {'Y': 2, 'X': 1}, 'A': [3, 1, 2]}
print(json.dumps(d))
# {"B": {"Y": 2, "X": 1}, "A": [3, 1, 2]}

print(json.dumps(d, sort_keys=True))
# {"A": [3, 1, 2], "B": {"X": 1, "Y": 2}}
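One practical use of sort_keys is producing the same string for dictionaries that hold the same data in a different order, for example when comparing or hashing serialized output. A minimal sketch with made-up data:

```python
import json

d1 = {'B': 2, 'A': 1}
d2 = {'A': 1, 'B': 2}

# Same content, different insertion order: the default output differs.
print(json.dumps(d1) == json.dumps(d2))
# False

# With sort_keys=True, the output is identical.
print(json.dumps(d1, sort_keys=True) == json.dumps(d2, sort_keys=True))
# True
```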

nan and inf: allow_nan

By default, nan (not a number) and inf (infinity) are converted to NaN and Infinity, and -inf is converted to -Infinity. Setting the allow_nan argument to False raises a ValueError if any of these values is present.

d = {'A': [float('nan'), float('inf')]}
print(json.dumps(d))
# {"A": [NaN, Infinity]}

# print(json.dumps(d, allow_nan=False))
# ValueError: Out of range float values are not JSON compliant
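If you need strictly compliant JSON but your data may contain such values, one possible approach is to replace them before serializing. The sketch below assumes that null is an acceptable substitute. The helper replace_non_finite is hypothetical and not part of the json module.

```python
import json
import math


def replace_non_finite(obj):
    # Hypothetical helper: recursively swap nan, inf, and -inf for None.
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {k: replace_non_finite(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(v) for v in obj]
    return obj


d = {'A': [float('nan'), float('inf'), 1.5]}

s = json.dumps(replace_non_finite(d), allow_nan=False)
print(s)
# {"A": [null, null, 1.5]}
```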

Unicode escape: ensure_ascii

By default, non-ASCII characters are output in Unicode escape format (\uXXXX). Setting the ensure_ascii argument to False outputs these characters as-is, without escaping.

d = {'A': 'あいうえお', 'B': 'abc'}
print(json.dumps(d))
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}

print(json.dumps(d, ensure_ascii=False))
# {"A": "あいうえお", "B": "abc"}
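Both forms represent the same data. The short check below confirms that the escaped output contains only ASCII characters, the unescaped output does not, and json.loads() restores the same dictionary from either one.

```python
import json

d = {'A': 'あいうえお', 'B': 'abc'}

s_escaped = json.dumps(d)
s_raw = json.dumps(d, ensure_ascii=False)

print(s_escaped.isascii())
# True

print(s_raw.isascii())
# False

print(json.loads(s_escaped) == json.loads(s_raw) == d)
# True
```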

Save Python objects as JSON files: json.dump()

You can use json.dump() to save Python objects, such as dictionaries, as JSON files.

A file object, obtained with the built-in open() function, should be passed as the second argument to json.dump(). The rest of the usage is the same as json.dumps().

Use the open() function to open the file in write mode ('w').

d = {
    'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'},
    'B': [True, False, None, float('nan'), float('inf')]
}

with open('data/temp/test.json', 'w') as f:
    json.dump(d, f, indent=2)

with open('data/temp/test.json') as f:
    print(f.read())
# {
#   "A": {
#     "X": 1,
#     "Y": 1.0,
#     "Z": "abc"
#   },
#   "B": [
#     true,
#     false,
#     null,
#     NaN,
#     Infinity
#   ]
# }

Opening a file in write mode ('w') overwrites it if the path already exists and creates it if it does not. However, if the specified directory does not exist, open() raises a FileNotFoundError. Make sure the directory exists before calling open() and json.dump().
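One way to make sure the directory exists is to create it with pathlib before opening the file. The sketch below works inside a temporary directory so that it can run anywhere; the nested path is an assumption for illustration.

```python
import json
import tempfile
from pathlib import Path

d = {'A': 100}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical nested path whose parent directories do not exist yet.
    path = Path(tmp_dir) / 'data' / 'temp' / 'test.json'

    # Create missing parent directories; exist_ok avoids an error if present.
    path.parent.mkdir(parents=True, exist_ok=True)

    with open(path, 'w') as f:
        json.dump(d, f, indent=2)

    with open(path) as f:
        loaded = json.load(f)

print(loaded)
# {'A': 100}
```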

Create and update JSON files

Create a new JSON file

Use open() to create a new file object and json.dump() to save a Python object.

d_new = {'A': 100, 'B': 'abc', 'C': [True, False]}

with open('data/temp/test_new.json', 'w') as f:
    json.dump(d_new, f, indent=2)

with open('data/temp/test_new.json') as f:
    print(f.read())
# {
#   "A": 100,
#   "B": "abc",
#   "C": [
#     true,
#     false
#   ]
# }

Update a JSON file

Read an existing JSON file with open() and json.load().

with open('data/temp/test_new.json') as f:
    d_update = json.load(f)

print(d_update)
# {'A': 100, 'B': 'abc', 'C': [True, False]}

Update the Python object (dictionary in this case).

d_update['A'] = 200
d_update.pop('B')
d_update['D'] = 'new value'

print(d_update)
# {'A': 200, 'C': [True, False], 'D': 'new value'}

Use open() to create a new file object with a new path and json.dump() to save the Python object. If you specify the existing file path instead, the file will be overwritten.

with open('data/temp/test_new_update.json', 'w') as f:
    json.dump(d_update, f, indent=2)

with open('data/temp/test_new_update.json') as f:
    print(f.read())
# {
#   "A": 200,
#   "C": [
#     true,
#     false
#   ],
#   "D": "new value"
# }
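The read, modify, and write steps above can be wrapped in a small function if you update files often. The following is only a sketch: update_json_file is a hypothetical helper that merges top-level keys and overwrites the file in place, and the temporary file exists solely to make the example self-contained.

```python
import json
import os
import tempfile


def update_json_file(path, changes):
    # Hypothetical helper: load, merge top-level keys, and overwrite the file.
    with open(path, encoding='utf-8') as f:
        data = json.load(f)
    data.update(changes)
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2)
    return data


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test_new.json')  # hypothetical file name

    # Create the initial file.
    with open(path, 'w', encoding='utf-8') as f:
        json.dump({'A': 100, 'B': 'abc'}, f)

    update_json_file(path, {'A': 200, 'D': 'new value'})

    with open(path, encoding='utf-8') as f:
        result = json.load(f)

print(result)
# {'A': 200, 'B': 'abc', 'D': 'new value'}
```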

Points to note when handling JSON files and strings

When using the json module, these points are usually taken care of. However, when processing JSON files or strings without using the json module, you need to pay attention to the following points.

Quotation marks

In JSON, quotation marks enclosing keys or strings must be double quotes ". If the quotation marks are single quotes ', an error will occur in json.load() and json.loads().

For example, if you convert a dictionary object to a string with str(), single quotes ' are used as quotation marks, and json.loads() will result in an error.

d = {'A': 100, 'B': 'abc', 'C': [True, False]}

s = str(d)
print(s)
# {'A': 100, 'B': 'abc', 'C': [True, False]}

# print(json.loads(s))
# JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

A string converted with json.dumps() uses double quotes " for quotation, and can be correctly processed by json.loads().

s = json.dumps(d)
print(s)
# {"A": 100, "B": "abc", "C": [true, false]}

print(json.loads(s))
# {'A': 100, 'B': 'abc', 'C': [True, False]}
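If you are handed a string that was produced with str() on a dictionary, it is a Python literal rather than JSON. One way to recover the object is ast.literal_eval() from the standard library, after which json.dumps() can produce proper JSON. This sketch assumes the string contains only Python literals.

```python
import ast
import json

d = {'A': 100, 'B': 'abc', 'C': [True, False]}
s = str(d)

# Parse the Python literal safely (no arbitrary code is executed).
restored = ast.literal_eval(s)
print(restored == d)
# True

print(json.dumps(restored))
# {"A": 100, "B": "abc", "C": [true, false]}
```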

Unicode escape for non-ASCII strings

Some JSON, such as responses returned by Web APIs, may have non-ASCII characters Unicode-escaped for security reasons.

Read files

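When reading a Unicode-escaped text file with open(), the \uXXXX sequences remain as literal text unless the encoding argument is set to 'unicode-escape'. As a result, non-ASCII strings show up as escape sequences rather than readable characters when a JSON file is read as plain text. Note that 'unicode-escape' treats unescaped bytes as Latin-1, so it is only suitable when the file contains nothing but ASCII characters and escape sequences.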

with open('data/src/test_u.json') as f:
    print(f.read())
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}

with open('data/src/test_u.json', encoding='unicode-escape') as f:
    print(f.read())
# {"A": "あいうえお", "B": "abc"}

json.load() appropriately processes Unicode escape sequences \uXXXX and loads the JSON file as a Python object.

with open('data/src/test_u.json') as f:
    print(json.load(f))
# {'A': 'あいうえお', 'B': 'abc'}

Decode byte sequences

The same applies when decoding a Unicode escaped byte sequence to a string. If you don't specify 'unicode-escape' for the encoding argument of the decode() method, the string will remain Unicode escaped.

s = r'{"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}'
b = s.encode()
print(b)
# b'{"A": "\\u3042\\u3044\\u3046\\u3048\\u304a", "B": "abc"}'

print(b.decode())
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}

print(b.decode(encoding='unicode-escape'))
# {"A": "あいうえお", "B": "abc"}

json.loads() appropriately processes Unicode escape sequences \uXXXX and converts the byte sequence into a Python object.

print(json.loads(b))
# {'A': 'あいうえお', 'B': 'abc'}
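Decoding with 'unicode-escape' is only safe when the data contains no raw non-ASCII characters, because that codec interprets unescaped bytes as Latin-1. The sketch below uses made-up data that mixes a raw UTF-8 character with an escaped one. json.loads() handles both, while decode() with 'unicode-escape' corrupts the raw character.

```python
import json

# 'あ' is stored as raw UTF-8 bytes, 'い' as the escape sequence \u3044.
b = '{"A": "あ", "B": "\\u3044"}'.encode('utf-8')

loaded = json.loads(b)
print(loaded)
# {'A': 'あ', 'B': 'い'}

mangled = b.decode('unicode-escape')

# The escaped character is decoded, but the raw one is corrupted.
print('い' in mangled)
# True

print('あ' in mangled)
# False
```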
