Load, Parse, Serialize JSON Files and Strings in Python
In Python, the json module allows you to parse JSON files or strings into Python objects, such as dictionaries, and save Python objects as JSON files or strings.
This article doesn't cover all possible arguments for the functions in the json module. For a comprehensive understanding of these, refer to the official documentation.
The sample code in this article imports the json module. It is included in the standard library, so no additional installation is necessary.
import json
Parse JSON strings to Python objects: json.loads()
You can use json.loads() to convert JSON-formatted strings into Python objects, such as dictionaries.
Pass the string as the first argument to json.loads(). A JSON object is returned as a dictionary.
s = '{"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null, NaN, Infinity]}'
d = json.loads(s)
print(d)
# {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None, nan, inf]}
print(type(d))
# <class 'dict'>
Here's the default correspondence between JSON and Python objects:
JSON | Python |
---|---|
object | dict |
array | list |
string | str |
number (int) | int |
number (real) | float |
true | True |
false | False |
null | None |
While NaN and Infinity are not part of the official JSON specification, Python's json module can still convert them into their corresponding float values, nan and inf.
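If you would rather reject these non-standard values, json.loads() accepts a parse_constant argument, a callable that receives the name of the constant ('NaN', 'Infinity', or '-Infinity'). The following is a minimal sketch; the function name reject_constant and its error message are made up for illustration.

```python
import json


def reject_constant(name):
    # Hypothetical policy: treat non-standard constants as an error.
    # name is one of 'NaN', 'Infinity', or '-Infinity'.
    raise ValueError(f'Non-standard JSON constant: {name}')


s = '{"B": [NaN, Infinity]}'

try:
    json.loads(s, parse_constant=reject_constant)
    rejected = False
except ValueError as e:
    rejected = True
    print(e)
# Non-standard JSON constant: NaN
```

Standard values such as true, false, and null are not passed to parse_constant, so strictly valid JSON is parsed as usual.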
Keep in mind that true, false, null, NaN, and Infinity are case sensitive. Using a different case, such as True instead of true, results in an error.
s = '{"A": True}'
# d = json.loads(s)
# JSONDecodeError: Expecting value: line 1 column 7 (char 6)
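When the input may be malformed, you can catch json.JSONDecodeError instead of letting the program stop. The sketch below reuses the string from the example above and reads the msg, lineno, colno, and pos attributes of the exception; falling back to None is only an assumption for illustration.

```python
import json

s = '{"A": True}'

try:
    d = json.loads(s)
except json.JSONDecodeError as e:
    # Fall back to None and report where parsing failed.
    d = None
    print(e.msg)
    print(e.lineno, e.colno, e.pos)
# Expecting value
# 1 7 6
```

JSONDecodeError is a subclass of ValueError, so an existing except ValueError clause also catches it.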
Note that, starting from Python 3.6, you can also pass a byte string (bytes) as the first argument to json.loads(), not just a string (str). For the sake of brevity, examples are not included here.
Load JSON files as Python objects: json.load()
You can use json.load() to load JSON files into Python objects, such as dictionaries.
Pass a file object, obtained using the built-in open() function, as the first argument. The rest of the usage is similar to json.loads().
Consider a JSON file containing the same string as in the json.loads() example above.
with open('data/src/test.json') as f:
    print(f.read())
# {"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null, NaN, Infinity]}
You can read this file using json.load().
with open('data/src/test.json') as f:
    d = json.load(f)
print(d)
# {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None, nan, inf]}
print(type(d))
# <class 'dict'>
Dump Python objects to JSON Strings: json.dumps()
You can use json.dumps() to convert Python objects, like dictionaries, into JSON-formatted strings.
Pass the object, such as a dictionary, as the first argument to json.dumps().
d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
s = json.dumps(d)
print(s)
# {"A": {"X": 1, "Y": 1.0, "Z": "abc"}, "B": [true, false, null]}
print(type(s))
# <class 'str'>
Here's the default correspondence between Python objects and JSON:
Python | JSON |
---|---|
dict | object |
list, tuple | array |
str | string |
int, float, int- & float-derived Enums | number |
True | true |
False | false |
None | null |
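Because the two tables are not exact mirrors of each other, a round trip through JSON does not always return the original object. The following sketch, using made-up sample data, shows that a tuple comes back as a list and an integer key comes back as a string.

```python
import json

d = {1: 'one', 'T': (1, 2)}

s = json.dumps(d)
print(s)
# {"1": "one", "T": [1, 2]}

restored = json.loads(s)
print(restored)
# {'1': 'one', 'T': [1, 2]}

print(restored == d)
# False
```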
By default, non-ASCII characters are output in Unicode escape format. You can prevent this by setting the ensure_ascii argument to False. More on this later.
Separators: separators
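By default, keys and values are separated by ': ' (a colon followed by a space), and elements are separated by ', ' (a comma followed by a space). You can customize these separators using the separators argument, which takes a tuple of (item separator, key separator), allowing you to remove the spaces or use other characters.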
d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
print(json.dumps(d, separators=(',', ':')))
# {"A":{"X":1,"Y":1.0,"Z":"abc"},"B":[true,false,null]}
print(json.dumps(d, separators=(' / ', '->')))
# {"A"->{"X"->1 / "Y"->1.0 / "Z"->"abc"} / "B"->[true / false / null]}
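A common reason to remove the spaces is to make the output smaller. The following sketch compares the default and compact forms of the same dictionary and confirms that the compact string still parses back to the same data.

```python
import json

d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}

default = json.dumps(d)
compact = json.dumps(d, separators=(',', ':'))

print(len(compact) < len(default))
# True

print(json.loads(compact) == d)
# True
```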
Indent level: indent
Specifying the indent level with the indent argument puts each element on its own line, with indentation.
d = {'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'}, 'B': [True, False, None]}
print(json.dumps(d, indent=4))
# {
#     "A": {
#         "X": 1,
#         "Y": 1.0,
#         "Z": "abc"
#     },
#     "B": [
#         true,
#         false,
#         null
#     ]
# }
The default setting is indent=None, which doesn't include line breaks. When indent=0, there are line breaks but no indentation.
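As a quick illustration of these settings, the sketch below prints a small made-up dictionary with indent=0. It also passes a string to indent, which the json module uses as the indentation for each level.

```python
import json

d = {'A': 1, 'B': [1, 2]}

print(json.dumps(d, indent=0))
# {
# "A": 1,
# "B": [
# 1,
# 2
# ]
# }

print(json.dumps(d, indent='--'))
# {
# --"A": 1,
# --"B": [
# ----1,
# ----2
# --]
# }
```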
Sort by key: sort_keys
By setting the sort_keys argument to True, the elements of the dictionary are sorted by key. By default, they are not sorted and appear in the dictionary's insertion order.
d = {'B': {'Y': 2, 'X': 1}, 'A': [3, 1, 2]}
print(json.dumps(d))
# {"B": {"Y": 2, "X": 1}, "A": [3, 1, 2]}
print(json.dumps(d, sort_keys=True))
# {"A": [3, 1, 2], "B": {"X": 1, "Y": 2}}
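One practical use of sort_keys is producing the same string for dictionaries that are equal but were built in a different order, for example when comparing or hashing JSON output. A minimal sketch with made-up data:

```python
import json

d1 = {'B': 2, 'A': 1}
d2 = {'A': 1, 'B': 2}

print(d1 == d2)
# True

print(json.dumps(d1) == json.dumps(d2))
# False

print(json.dumps(d1, sort_keys=True) == json.dumps(d2, sort_keys=True))
# True
```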
nan and inf: allow_nan
By default, nan (not a number) and inf (infinity) are converted to NaN and Infinity. Setting the allow_nan argument to False will cause an error if nan or inf is present.
d = {'A': [float('nan'), float('inf')]}
print(json.dumps(d))
# {"A": [NaN, Infinity]}
# print(json.dumps(d, allow_nan=False))
# ValueError: Out of range float values are not JSON compliant
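If the output must be strictly valid JSON, one possible approach is to catch the ValueError and replace the offending values before trying again. The sketch below assumes a flat dictionary of lists and replaces non-finite floats with None; this replacement policy is an assumption for illustration, not something the json module does for you.

```python
import json
import math

d = {'A': [float('nan'), float('inf'), 1.5]}

try:
    s = json.dumps(d, allow_nan=False)
except ValueError:
    # Hypothetical policy: replace nan and inf with None (null in JSON).
    cleaned = {
        k: [x if math.isfinite(x) else None for x in v]
        for k, v in d.items()
    }
    s = json.dumps(cleaned, allow_nan=False)

print(s)
# {"A": [null, null, 1.5]}
```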
Unicode escape: ensure_ascii
By default, non-ASCII characters are output in Unicode escape format. By setting the ensure_ascii argument to False, these characters are output without Unicode escaping.
d = {'A': 'あいうえお', 'B': 'abc'}
print(json.dumps(d))
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}
print(json.dumps(d, ensure_ascii=False))
# {"A": "あいうえお", "B": "abc"}
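When you write such unescaped output to a file with json.dump(), it is safer to pass an explicit encoding to open(), since the platform default may not be UTF-8. The following sketch uses a temporary directory and a made-up file name so that it can run anywhere.

```python
import json
import os
import tempfile

d = {'A': 'あいうえお', 'B': 'abc'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test_utf8.json')  # hypothetical file name

    # Write without Unicode escaping, using an explicit encoding.
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(d, f, ensure_ascii=False)

    with open(path, encoding='utf-8') as f:
        text = f.read()

    with open(path, encoding='utf-8') as f:
        loaded = json.load(f)

print('あいうえお' in text)
# True

print('\\u3042' in text)
# False

print(loaded == d)
# True
```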
Save Python Objects as JSON Files: json.dump()
You can use json.dump() to save Python objects, such as dictionaries, as JSON files.
A file object, obtained with the built-in open() function, should be passed as the second argument to json.dump(). The rest of the usage is the same as json.dumps().
Use the open() function to open the file in write mode ('w').
d = {
    'A': {'X': 1, 'Y': 1.0, 'Z': 'abc'},
    'B': [True, False, None, float('nan'), float('inf')]
}
with open('data/temp/test.json', 'w') as f:
    json.dump(d, f, indent=2)
with open('data/temp/test.json') as f:
    print(f.read())
# {
#   "A": {
#     "X": 1,
#     "Y": 1.0,
#     "Z": "abc"
#   },
#   "B": [
#     true,
#     false,
#     null,
#     NaN,
#     Infinity
#   ]
# }
If you specify an existing file path, the file is overwritten. If the file doesn't exist, a new file is created. However, if the specified directory does not exist, open() raises a FileNotFoundError. Ensure the directory exists before opening the file for json.dump().
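One way to make sure the directory exists is to call os.makedirs() with exist_ok=True before opening the file. The sketch below works inside a temporary directory, and the nested path is made up for illustration.

```python
import json
import os
import tempfile

d = {'A': 100, 'B': 'abc'}

with tempfile.TemporaryDirectory() as base_dir:
    # Hypothetical nested path whose parent directories do not exist yet.
    path = os.path.join(base_dir, 'temp', 'nested', 'test.json')

    # Create missing parent directories; no error if they already exist.
    os.makedirs(os.path.dirname(path), exist_ok=True)

    with open(path, 'w') as f:
        json.dump(d, f, indent=2)

    with open(path) as f:
        loaded = json.load(f)

print(loaded)
# {'A': 100, 'B': 'abc'}
```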
Create and update JSON files
Create a new JSON file
Use open() to create a new file object and json.dump() to save a Python object.
d_new = {'A': 100, 'B': 'abc', 'C': [True, False]}
with open('data/temp/test_new.json', 'w') as f:
    json.dump(d_new, f, indent=2)
with open('data/temp/test_new.json') as f:
    print(f.read())
# {
#   "A": 100,
#   "B": "abc",
#   "C": [
#     true,
#     false
#   ]
# }
Update a JSON file
Read an existing JSON file with open() and json.load().
with open('data/temp/test_new.json') as f:
    d_update = json.load(f)
print(d_update)
# {'A': 100, 'B': 'abc', 'C': [True, False]}
Update the Python object (dictionary in this case).
d_update['A'] = 200
d_update.pop('B')
d_update['D'] = 'new value'
print(d_update)
# {'A': 200, 'C': [True, False], 'D': 'new value'}
Use open() to create a new file object with a new path and json.dump() to save the Python object. If you specify the existing file path instead, the file will be overwritten.
with open('data/temp/test_new_update.json', 'w') as f:
    json.dump(d_update, f, indent=2)
with open('data/temp/test_new_update.json') as f:
    print(f.read())
# {
#   "A": 200,
#   "C": [
#     true,
#     false
#   ],
#   "D": "new value"
# }
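The load, modify, and save steps above can be wrapped in a small function. The following is only a sketch: update_json_file and its parameters are hypothetical names, it assumes the top-level JSON value is an object, and it works in a temporary directory so that it can run anywhere.

```python
import json
import os
import tempfile


def update_json_file(src_path, dst_path, updates, remove_keys=()):
    """Hypothetical helper: load a JSON object, modify it, and save it."""
    with open(src_path) as f:
        d = json.load(f)
    d.update(updates)          # change existing keys or add new ones
    for key in remove_keys:
        d.pop(key, None)       # ignore keys that are not present
    with open(dst_path, 'w') as f:
        json.dump(d, f, indent=2)
    return d


with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, 'test_new.json')
    dst = os.path.join(tmp_dir, 'test_new_update.json')

    with open(src, 'w') as f:
        json.dump({'A': 100, 'B': 'abc', 'C': [True, False]}, f)

    result = update_json_file(
        src, dst, {'A': 200, 'D': 'new value'}, remove_keys=['B']
    )

    with open(dst) as f:
        saved = json.load(f)

print(result)
# {'A': 200, 'C': [True, False], 'D': 'new value'}
```

Passing the same path for src_path and dst_path overwrites the original file.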
Points to note when handling JSON files and strings
When using the json module, these points are usually taken care of. However, when processing JSON files or strings without using the json module, you need to pay attention to the following points.
Quotation marks
In JSON, keys and strings must be enclosed in double quotes ". If single quotes ' are used instead, json.load() and json.loads() raise an error.
For example, if you convert a dictionary to a string with str(), single quotes ' are used as quotation marks, and json.loads() raises an error.
d = {'A': 100, 'B': 'abc', 'C': [True, False]}
s = str(d)
print(s)
# {'A': 100, 'B': 'abc', 'C': [True, False]}
# print(json.loads(s))
# JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
A string converted with json.dumps() uses double quotes " for quotation, and can be correctly processed by json.loads().
s = json.dumps(d)
print(s)
# {"A": 100, "B": "abc", "C": [true, false]}
print(json.loads(s))
# {'A': 100, 'B': 'abc', 'C': [True, False]}
Unicode escape for non-ASCII strings
In some JSON, such as responses from Web APIs, non-ASCII characters are Unicode escaped, sometimes for security reasons.
Read files
When reading a Unicode escaped text file using open(), if the encoding argument is not set to 'unicode-escape', the string will remain in Unicode escape format. As a result, non-ASCII strings often appear as unreadable escape sequences when a JSON file is read as plain text.
with open('data/src/test_u.json') as f:
    print(f.read())
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}
with open('data/src/test_u.json', encoding='unicode-escape') as f:
    print(f.read())
# {"A": "あいうえお", "B": "abc"}
json.load() appropriately processes Unicode escape sequences \uXXXX and loads the JSON file as a Python object.
with open('data/src/test_u.json') as f:
    print(json.load(f))
# {'A': 'あいうえお', 'B': 'abc'}
Decode byte sequences
The same applies when decoding a Unicode escaped byte sequence to a string. If you don't specify 'unicode-escape' for the encoding argument of the decode() method, the string will remain Unicode escaped.
s = r'{"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}'
b = s.encode()
print(b)
# b'{"A": "\\u3042\\u3044\\u3046\\u3048\\u304a", "B": "abc"}'
print(b.decode())
# {"A": "\u3042\u3044\u3046\u3048\u304a", "B": "abc"}
print(b.decode(encoding='unicode-escape'))
# {"A": "あいうえお", "B": "abc"}
json.loads() appropriately processes Unicode escape sequences \uXXXX and converts the byte sequence into a Python object.
print(json.loads(b))
# {'A': 'あいうえお', 'B': 'abc'}
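One caveat worth knowing: the 'unicode-escape' codec reads any non-escaped bytes as Latin-1, so it can garble input that mixes escape sequences with literal UTF-8 characters. The sketch below builds such a mixed byte sequence from made-up data and compares the two approaches; json.loads() handles both forms.

```python
import json

# Mixed input: one escaped character (\u3042) and one literal character.
s = r'{"A": "\u3042", "B": "' + 'い' + '"}'
b = s.encode('utf-8')

# The literal character's UTF-8 bytes are misread as Latin-1.
via_escape = b.decode('unicode-escape')
print('い' in via_escape)
# False

# json.loads() decodes the bytes as UTF-8 and then resolves the escape.
d = json.loads(b)
print(d == {'A': 'あ', 'B': 'い'})
# True
```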