Sort a list of dictionaries by the value of a specific key in Python

Modified: | Tags: Python, List, Dictionary

In Python, sorting a list of dictionaries with the sort() method or the sorted() function raises a TypeError by default.

By specifying the key argument of sort() or sorted(), you can sort a list of dictionaries according to the value of a specific key.

Lists of dictionaries are frequently encountered when reading JSON.

Note that a list of dictionaries can be converted to pandas.DataFrame.

Consider the following list of dictionaries with common keys. The pprint module is used to make the output easier to read.

import pprint

l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

Sorting a list of dictionaries raises an error by default

Sorting a list of dictionaries (dict) with the sort() method or the sorted() function raises a TypeError by default.

This occurs because dictionaries do not support comparison operations like < or >.

# sorted(l)
# TypeError: '<' not supported between instances of 'dict' and 'dict'
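As a minimal sketch, the error can be observed safely by wrapping the call in try/except. The variable name error_message is only illustrative.

```python
l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

# sorted() compares elements with <, which dict does not support.
try:
    sorted(l)
    error_message = None
except TypeError as e:
    error_message = str(e)

print(error_message)
# '<' not supported between instances of 'dict' and 'dict'

# Equality comparison, unlike ordering, is supported for dictionaries.
print(l[0] == l[1])
# False
```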

Specify lambda expressions for the key argument

To sort a list of dictionaries according to the value of a specific key, specify the key argument of the sort() method or the sorted() function.

The key argument takes a function that is applied to each element of the list; the list is then sorted according to the results of that function.

Here, the function should return the value of a specific key from each dictionary.

You can define a function with def, but it is convenient to use lambda expressions in such a case.

pprint.pprint(sorted(l, key=lambda x: x['Age']))
# [{'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80}]

pprint.pprint(sorted(l, key=lambda x: x['Name']))
# [{'Age': 40, 'Name': 'Alice', 'Point': 80},
#  {'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70}]
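For comparison, here is a sketch of the same sort using a function defined with def instead of a lambda expression. The name get_age is a hypothetical helper chosen for this illustration.

```python
import pprint

l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

def get_age(d):
    """Return the value used as the sort key (hypothetical helper)."""
    return d['Age']

# Pass the function object itself; sorted() calls it once per element.
sorted_by_age = sorted(l, key=get_age)

pprint.pprint(sorted_by_age)
# [{'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80}]
```

A named function can be worth the extra lines when the key logic is reused or needs a docstring; for a one-off lookup, the lambda form is shorter.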

Use the reverse argument to specify whether the sorting should be in descending or ascending order.

pprint.pprint(sorted(l, key=lambda x: x['Age'], reverse=True))
# [{'Age': 40, 'Name': 'Alice', 'Point': 80},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 20, 'Name': 'Bob'}]

The examples so far use sorted(), but you can specify key and reverse in the same way with the sort() method of list.

The difference between them is that sort() sorts the original list in place and returns None, whereas sorted() returns a new sorted list and leaves the original unchanged.
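The following sketch contrasts the two, using the same key and reverse arguments. Only the names of the dictionaries are printed to keep the output short.

```python
l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

# sorted() returns a new list; the original list keeps its order.
l_sorted = sorted(l, key=lambda x: x['Age'])
print([d['Name'] for d in l_sorted])
# ['Bob', 'Charlie', 'Alice']
print([d['Name'] for d in l])
# ['Alice', 'Bob', 'Charlie']

# sort() reorders the list in place and returns None.
result = l.sort(key=lambda x: x['Age'], reverse=True)
print(result)
# None
print([d['Name'] for d in l])
# ['Alice', 'Charlie', 'Bob']
```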

When the specified key does not exist

If the specified key is missing from any dictionary in the list, a KeyError is raised.

# sorted(l, key=lambda x: x['Point'])
# KeyError: 'Point'

In such a case, use the get() method of dict, which returns the default value for non-existent keys.

By default, get() returns None for non-existent keys. Since None cannot be compared with a number or a string, an error will be raised.

# sorted(l, key=lambda x: x.get('Point'))
# TypeError: '<' not supported between instances of 'NoneType' and 'int'

You can specify the value to return for a non-existent key as the second argument of get(). Dictionaries without the key are then sorted as if they had that value.

pprint.pprint(sorted(l, key=lambda x: x.get('Point', 75)))
# [{'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 20, 'Name': 'Bob'},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80}]

Positive infinity, float('inf'), compares greater than any other number, and negative infinity compares less, so you can use inf and -inf to always place elements without the key at the end or the beginning.

pprint.pprint(sorted(l, key=lambda x: x.get('Point', float('inf'))))
# [{'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80},
#  {'Age': 20, 'Name': 'Bob'}]

pprint.pprint(sorted(l, key=lambda x: x.get('Point', -float('inf'))))
# [{'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80}]
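The inf approach only works when the values are numbers. As an alternative sketch (not part of the approach above), the key function can return a tuple whose first item records whether the key is missing. The placeholder 0 is an arbitrary assumption; it only needs to be comparable with the other placeholders.

```python
l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

# Tuples are compared item by item. False sorts before True, so
# dictionaries that have 'Point' come first, ordered by its value.
sorted_missing_last = sorted(
    l, key=lambda x: ('Point' not in x, x.get('Point', 0))
)

print([d['Name'] for d in sorted_missing_last])
# ['Charlie', 'Alice', 'Bob']
```

For string values, the same pattern should work with an empty string as the placeholder instead of 0.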

Specify operator.itemgetter() for the key argument

You can also use itemgetter() from the operator module of the standard library. It is generally somewhat faster than an equivalent lambda expression.

import operator

pprint.pprint(sorted(l, key=operator.itemgetter('Age')))
# [{'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80}]

pprint.pprint(sorted(l, key=operator.itemgetter('Name')))
# [{'Age': 40, 'Name': 'Alice', 'Point': 80},
#  {'Age': 20, 'Name': 'Bob'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70}]

As with a lambda expression that uses x['Point'], a KeyError is raised if the specified key does not exist.

# sorted(l, key=operator.itemgetter('Point'))
# KeyError: 'Point'
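To see why, it may help to call the object returned by itemgetter() directly. The sketch below shows that it behaves like indexing with [], so a missing key fails in the same way. The names get_age, raised, and missing_key are illustrative.

```python
import operator

d = {'Name': 'Bob', 'Age': 20}

# itemgetter('Age') returns a callable equivalent to lambda x: x['Age'].
get_age = operator.itemgetter('Age')
print(get_age(d))
# 20

# A missing key raises KeyError, just like d['Point'] would.
raised = False
missing_key = None
try:
    operator.itemgetter('Point')(d)
except KeyError as e:
    raised = True
    missing_key = e.args[0]

print(raised, missing_key)
# True Point
```

Since itemgetter() has no default-value option, get() inside a lambda expression remains the simpler choice when keys may be missing.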

Sort by multiple keys

The following is an example of a case where dictionaries have the same value for a common key. Two dictionaries have the value 'CA' for the key 'State'.

l_dup = [{'Name': 'Alice', 'Age': 40, 'Point': 80, 'State': 'CA'},
         {'Name': 'Bob', 'Age': 20, 'State': 'NY'},
         {'Name': 'Charlie', 'Age': 30, 'Point': 70, 'State': 'CA'}]

Python's sort is stable: if the values are equal, the original order is preserved.

pprint.pprint(sorted(l_dup, key=operator.itemgetter('State')))
# [{'Age': 40, 'Name': 'Alice', 'Point': 80, 'State': 'CA'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70, 'State': 'CA'},
#  {'Age': 20, 'Name': 'Bob', 'State': 'NY'}]

You can specify multiple arguments for operator.itemgetter(). Dictionaries with equal values for the first key are then ordered by the value of the next key.

pprint.pprint(sorted(l_dup, key=operator.itemgetter('State', 'Age')))
# [{'Age': 30, 'Name': 'Charlie', 'Point': 70, 'State': 'CA'},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80, 'State': 'CA'},
#  {'Age': 20, 'Name': 'Bob', 'State': 'NY'}]

Note that if the order of the arguments is different, the result is also different.

pprint.pprint(sorted(l_dup, key=operator.itemgetter('Age', 'State')))
# [{'Age': 20, 'Name': 'Bob', 'State': 'NY'},
#  {'Age': 30, 'Name': 'Charlie', 'Point': 70, 'State': 'CA'},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80, 'State': 'CA'}]

The same can be done with lambda expressions returning multiple values as tuples or lists.

pprint.pprint(sorted(l_dup, key=lambda x: (x['State'], x['Age'])))
# [{'Age': 30, 'Name': 'Charlie', 'Point': 70, 'State': 'CA'},
#  {'Age': 40, 'Name': 'Alice', 'Point': 80, 'State': 'CA'},
#  {'Age': 20, 'Name': 'Bob', 'State': 'NY'}]
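A single reverse argument applies to all keys at once. If the keys need different directions, for example 'State' ascending and 'Age' descending, two common approaches are sketched below: negating a numeric value in the tuple, or sorting in two passes and relying on the stability of the sort. The variable names are illustrative.

```python
import operator

l_dup = [{'Name': 'Alice', 'Age': 40, 'Point': 80, 'State': 'CA'},
         {'Name': 'Bob', 'Age': 20, 'State': 'NY'},
         {'Name': 'Charlie', 'Age': 30, 'Point': 70, 'State': 'CA'}]

# Approach 1: negate the numeric key so that larger ages come first.
negated = sorted(l_dup, key=lambda x: (x['State'], -x['Age']))
print([d['Name'] for d in negated])
# ['Alice', 'Charlie', 'Bob']

# Approach 2: sort by the secondary key first, then by the primary key.
# Equal primary keys keep the order established by the first pass.
by_age_desc = sorted(l_dup, key=operator.itemgetter('Age'), reverse=True)
two_pass = sorted(by_age_desc, key=operator.itemgetter('State'))
print([d['Name'] for d in two_pass])
# ['Alice', 'Charlie', 'Bob']
```

Negation only works for numeric values, while the two-pass approach works for any comparable values.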

max(), min() for a list of dictionaries

As mentioned above, dictionaries (dict) do not support comparisons with < or >, so passing a list of dictionaries to max() or min() raises a TypeError.

l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

# max(l)
# TypeError: '>' not supported between instances of 'dict' and 'dict'

As with sorted() and sort(), you can specify the key argument in max() and min() as well.

print(max(l, key=lambda x: x['Age']))
# {'Name': 'Alice', 'Age': 40, 'Point': 80}

print(min(l, key=lambda x: x['Age']))
# {'Name': 'Bob', 'Age': 20}

The max() and min() functions return the dictionary. If you need a specific value, you must specify its key.

print(max(l, key=lambda x: x['Age'])['Age'])
# 40

Of course, you can also use operator.itemgetter().

print(max(l, key=operator.itemgetter('Age')))
# {'Name': 'Alice', 'Age': 40, 'Point': 80}
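When some dictionaries lack the key, the get() technique used for sorting should carry over to max() and min(). In this sketch, the defaults are chosen so that a dictionary without 'Point' is never selected: -inf for max() and inf for min().

```python
l = [{'Name': 'Alice', 'Age': 40, 'Point': 80},
     {'Name': 'Bob', 'Age': 20},
     {'Name': 'Charlie', 'Age': 30, 'Point': 70}]

# A missing 'Point' is treated as -inf, so it can never be the maximum.
highest = max(l, key=lambda x: x.get('Point', float('-inf')))
print(highest)
# {'Name': 'Alice', 'Age': 40, 'Point': 80}

# A missing 'Point' is treated as inf, so it can never be the minimum.
lowest = min(l, key=lambda x: x.get('Point', float('inf')))
print(lowest)
# {'Name': 'Charlie', 'Age': 30, 'Point': 70}
```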
