Count Elements in a List in Python: collections.Counter
In Python, you can count the total number of elements in a list or tuple with the built-in function len() and the number of occurrences of an element with the count() method.
In addition, the Counter class from the standard library's collections module can be used to count the number of occurrences of each element simultaneously.
Lists are used in the following sample code, but tuples can be processed in the same way.
Count the total number of elements: len()
You can count the total number of elements in a list with the built-in function len().
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
print(len(l))
# 7
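As noted earlier, tuples behave the same way. The following minimal sketch (the names t and nested are illustrative, not from the original samples) also shows that len() counts only top-level elements, so each inner list counts as one element.

```python
t = ('a', 'a', 'b')
print(len(t))
# 3

nested = [[1, 2], [3, 4, 5]]
print(len(nested))
# 2

# To count the inner elements, sum the lengths of the inner lists.
print(sum(len(inner) for inner in nested))
# 5
```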
Count element occurrences: count()
You can count the number of occurrences of a specific element in a list with the count() method.
Passing a non-existent element returns 0.
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
print(l.count('a'))
# 4
print(l.count('b'))
# 1
print(l.count('c'))
# 2
print(l.count('d'))
# 0
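One detail worth keeping in mind is that count() matches elements by equality (==), so values of different types that compare equal are counted together. A small sketch, using an illustrative list named mixed:

```python
mixed = [1, True, 1.0, '1']

# 1 == True and 1 == 1.0 are both True, but 1 == '1' is False.
print(mixed.count(1))
# 3

# Tuples provide the same count() method.
t = ('a', 'b', 'a')
print(t.count('a'))
# 2
```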
collections.Counter, explained next, is useful if you want to count each element's occurrences simultaneously.
How to use collections.Counter
The collections module in the standard library provides the Counter class, a subclass of the dictionary (dict). When you pass a list to collections.Counter(), it creates a Counter object, with elements as keys and their counts as values.
import collections
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
c = collections.Counter(l)
print(c)
# Counter({'a': 4, 'c': 2, 'b': 1})
print(type(c))
# <class 'collections.Counter'>
print(issubclass(type(c), dict))
# True
Indexing a Counter object with an element as the key returns its count. If the element doesn't exist, 0 is returned instead of raising a KeyError.
print(c['a'])
# 4
print(c['b'])
# 1
print(c['c'])
# 2
print(c['d'])
# 0
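The difference from a plain dict can be sketched as follows. Looking up a missing key in a Counter returns 0 without adding the key, whereas a plain dict raises KeyError unless you use get() with a default. The key 'z' and the variable d are illustrative.

```python
import collections

c = collections.Counter(['a', 'a', 'b'])
d = dict(c)  # a plain dict with the same contents

print(c['z'])
# 0
print('z' in c)
# False  (the lookup did not add the key)

try:
    d['z']
except KeyError as e:
    print('KeyError:', e)
# KeyError: 'z'

# With a plain dict, get() with a default gives the same result as Counter.
print(d.get('z', 0))
# 0
```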
Additionally, Counter supports dict methods like keys(), values(), and items().
print(c.keys())
# dict_keys(['a', 'b', 'c'])
print(c.values())
# dict_values([4, 1, 2])
print(c.items())
# dict_items([('a', 4), ('b', 1), ('c', 2)])
These methods return view objects such as dict_keys, which can be used directly in a for loop. If you want to convert one to a list, use list().
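For instance, a loop over items() and a conversion with list() might look like this (a minimal sketch reusing the same sample list):

```python
import collections

c = collections.Counter(['a', 'a', 'a', 'a', 'b', 'c', 'c'])

# items() yields (element, count) pairs in insertion order.
for key, count in c.items():
    print(key, count)
# a 4
# b 1
# c 2

keys = list(c.keys())
print(keys)
# ['a', 'b', 'c']
```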
Get the most common elements: most_common()
The most_common() method of Counter returns a list of (element, count) tuples sorted by count in descending order.
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
c = collections.Counter(l)
print(c)
# Counter({'a': 4, 'c': 2, 'b': 1})
print(c.most_common())
# [('a', 4), ('c', 2), ('b', 1)]
You can access the most frequent element via index [0] and the least frequent via index [-1]. To extract only the element or only the count, add a second index.
print(c.most_common()[0])
# ('a', 4)
print(c.most_common()[-1])
# ('b', 1)
print(c.most_common()[0][0])
# a
print(c.most_common()[0][1])
# 4
To sort in increasing order of count, use a slice with -1 as the step.
print(c.most_common()[::-1])
# [('b', 1), ('c', 2), ('a', 4)]
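Two related variations, shown here as a sketch rather than the only way to do it: sorted() with a key function also produces increasing order of count, and a reversed slice with a stop index returns just the n least frequent elements. The variable n is illustrative.

```python
import collections

c = collections.Counter(['a', 'a', 'a', 'a', 'b', 'c', 'c'])

# sorted() with a key function sorts by count in increasing order.
ascending = sorted(c.items(), key=lambda item: item[1])
print(ascending)
# [('b', 1), ('c', 2), ('a', 4)]

# The n least frequent elements, least frequent first.
n = 2
least = c.most_common()[:-n - 1:-1]
print(least)
# [('b', 1), ('c', 2)]
```

When several elements share the same count, the two approaches can order them differently: the reversed slice also reverses the order of the tied elements, while sorted() keeps them in insertion order.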
Passing an argument n to most_common() returns the n most frequent elements. If n is omitted, all elements are returned.
print(c.most_common(2))
# [('a', 4), ('c', 2)]
To get separate lists of elements and their counts sorted by the number of occurrences, you can use the following approach.
values, counts = zip(*c.most_common())
print(values)
# ('a', 'c', 'b')
print(counts)
# (4, 2, 1)
This approach uses the built-in function zip() with * unpacking to transpose a 2D list (here, a list of tuples), and then unpacks the result into two variables.
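Note that unpacking zip(*...) into two variables raises ValueError when the Counter is empty, because zip() then yields nothing. One way to guard against that is sketched below; split_counts is a hypothetical helper name, not part of collections.

```python
import collections

def split_counts(counter):
    """Return (elements, counts) tuples; two empty tuples for an empty Counter."""
    pairs = counter.most_common()
    if not pairs:
        # zip(*[]) yields nothing, so unpacking it into two names would fail.
        return (), ()
    elements, counts = zip(*pairs)
    return elements, counts

print(split_counts(collections.Counter('aab')))
# (('a', 'b'), (2, 1))
print(split_counts(collections.Counter()))
# ((), ())
```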
Count unique elements
To count unique elements in a list or a tuple, use Counter or set().
A Counter object's length equals the number of unique elements in the original list, which can be obtained using len().
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
c = collections.Counter(l)
print(len(c))
# 3
You can also use set. If you don't need the specific features of a Counter object, using set may be simpler.
As a set doesn't contain duplicate elements, calling set() on a list returns a set object with unique elements, whose size can be obtained using len().
print(set(l))
# {'a', 'c', 'b'}
print(len(set(l)))
# 3
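A set is unordered, so the order in which its elements are printed is arbitrary. If you also need the unique elements in their original order, one option is dict.fromkeys(), since dictionaries preserve insertion order in Python 3.7 and later. This is a sketch of an alternative, and the name unique_in_order is illustrative.

```python
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']

# dict.fromkeys() keeps only the first occurrence of each element, in order.
unique_in_order = list(dict.fromkeys(l))
print(unique_in_order)
# ['a', 'b', 'c']
print(len(unique_in_order))
# 3
```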
Count elements that satisfy the conditions
List comprehensions or generator expressions help count the elements in a list or tuple that satisfy specific conditions.
For example, count the number of negative elements in the following list.
l = list(range(-5, 6))
print(l)
# [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
Using a condition as the expression of a list comprehension returns a list of boolean values (True and False). In Python, bool is a subclass of int, and True is treated as 1 and False as 0. Thus, sum() can count the number of True values, that is, the elements meeting the condition.
print([i < 0 for i in l])
# [True, True, True, True, True, False, False, False, False, False, False]
print(sum([i < 0 for i in l]))
# 5
Replacing [] in a list comprehension with () turns it into a generator expression. If the generator expression is the only argument in a function call, the extra () can be omitted.
print(sum((i < 0 for i in l)))
# 5
print(sum(i < 0 for i in l))
# 5
To count the number of False values, or elements not meeting the condition, use not.
print([not (i < 0) for i in l])
# [False, False, False, False, False, True, True, True, True, True, True]
print(sum(not (i < 0) for i in l))
# 6
Of course, you can rewrite the condition itself instead. For integers, i >= 0 is equivalent to not (i < 0).
print(sum(i >= 0 for i in l))
# 6
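If you count by condition in several places, the pattern can be wrapped in a small function. The sketch below uses a hypothetical helper named count_if. It sums 1 for each matching element rather than summing booleans, so it also works when the predicate returns a truthy value that is not a bool.

```python
def count_if(iterable, predicate):
    """Count the elements for which predicate(element) is truthy."""
    return sum(1 for x in iterable if predicate(x))

l = list(range(-5, 6))

print(count_if(l, lambda x: x < 0))
# 5
print(count_if(l, lambda x: x >= 0))
# 6
```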
Here are some other examples.
Count the number of odd numbers in a list:
print([i % 2 == 1 for i in l])
# [True, False, True, False, True, False, True, False, True, False, True]
print(sum(i % 2 == 1 for i in l))
# 6
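The condition i % 2 == 1 also matches negative odd numbers because, in Python, the result of % takes the sign of the divisor, so -5 % 2 is 1. The following short check illustrates this (the name odds is illustrative):

```python
print(-5 % 2)
# 1

odds = [i for i in range(-5, 6) if i % 2 == 1]
print(odds)
# [-5, -3, -1, 1, 3, 5]
```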
For a list of strings:
l = ['apple', 'orange', 'banana']
print([s.endswith('e') for s in l])
# [True, True, False]
print(sum(s.endswith('e') for s in l))
# 2
collections.Counter lets you use the number of occurrences as a condition.
For example, extract elements with at least two occurrences and count the total number of them. In this example, there are four a and two c, so the total is six.
l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
c = collections.Counter(l)
print(c.items())
# dict_items([('a', 4), ('b', 1), ('c', 2)])
print([i for i in l if c[i] >= 2])
# ['a', 'a', 'a', 'a', 'c', 'c']
print([i[1] for i in c.items() if i[1] >= 2])
# [4, 2]
print(sum(i[1] for i in c.items() if i[1] >= 2))
# 6
Alternatively, extract and count the unique elements with at least two occurrences. In this example, there are two unique elements, a and c.
print([i[0] for i in c.items() if i[1] >= 2])
# ['a', 'c']
print([i[1] >= 2 for i in c.items()])
# [True, False, True]
print(sum(i[1] >= 2 for i in c.items()))
# 2
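The same results can be written with values() and tuple unpacking instead of the i[0] and i[1] indices, which some readers may find easier to follow. This is an equivalent sketch, and the variable names are illustrative.

```python
import collections

l = ['a', 'a', 'a', 'a', 'b', 'c', 'c']
c = collections.Counter(l)

# Total occurrences of elements that appear at least twice.
total = sum(count for count in c.values() if count >= 2)
print(total)
# 6

# Number of unique elements that appear at least twice.
unique = sum(count >= 2 for count in c.values())
print(unique)
# 2

# Unique elements that appear exactly once.
singles = [key for key, count in c.items() if count == 1]
print(singles)
# ['b']
```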
Count word occurrences in a string
As a specific example, consider counting the number of word occurrences in a string.
First, the replace() method removes the unwanted comma (,) and period (.) characters, and then the split() method creates a list of words.
s = 'government of the people, by the people, for the people.'
s_remove = s.replace(',', '').replace('.', '')
print(s_remove)
# government of the people by the people for the people
word_list = s_remove.split()
print(word_list)
# ['government', 'of', 'the', 'people', 'by', 'the', 'people', 'for', 'the', 'people']
Once you have a list, you can count occurrences as demonstrated in the previous examples.
print(word_list.count('people'))
# 3
print(len(set(word_list)))
# 6
c = collections.Counter(word_list)
print(c)
# Counter({'the': 3, 'people': 3, 'government': 1, 'of': 1, 'by': 1, 'for': 1})
print(c.most_common()[0][0])
# the
Note that the above is a very simple process, so for more complex natural language processing, consider using a library like NLTK.
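As a slightly more robust variation that still uses only the standard library, you could lowercase the text and extract words with a regular expression. The sketch below is one possible approach under simple assumptions: the sample string is a modified version of the one above with mixed capitalization, and the pattern [a-z']+ treats only ASCII letters and apostrophes as word characters.

```python
import collections
import re

s = 'Government of the people, by the people, for the People.'

# Lowercase first, then keep runs of ASCII letters and apostrophes as words.
words = re.findall(r"[a-z']+", s.lower())
print(words)
# ['government', 'of', 'the', 'people', 'by', 'the', 'people', 'for', 'the', 'people']

c = collections.Counter(words)
print(c.most_common(2))
# [('the', 3), ('people', 3)]
```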
Count character occurrences in a string
You can also call the count() method on a string, or pass the string to collections.Counter() to count every character.
s = 'supercalifragilisticexpialidocious'
print(s.count('p'))
# 2
c = collections.Counter(s)
print(c)
# Counter({'i': 7, 's': 3, 'c': 3, 'a': 3, 'l': 3, 'u': 2, 'p': 2, 'e': 2, 'r': 2, 'o': 2, 'f': 1, 'g': 1, 't': 1, 'x': 1, 'd': 1})
To retrieve the top five most frequently appearing characters:
print(c.most_common(5))
# [('i', 7), ('s', 3), ('c', 3), ('a', 3), ('l', 3)]
values, counts = zip(*c.most_common(5))
print(values)
# ('i', 's', 'c', 'a', 'l')
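Counter counts every character it receives, including spaces and punctuation, and treats uppercase and lowercase letters as different keys. If you only want letters regardless of case, you can filter and normalize the characters first. In addition, the count() method of a string counts non-overlapping occurrences of a substring. The sample strings below are illustrative.

```python
import collections

s = 'Hello, World!'

# Lowercase the string and keep only alphabetic characters.
c = collections.Counter(ch for ch in s.lower() if ch.isalpha())
print(c.most_common(2))
# [('l', 3), ('o', 2)]

# str.count() counts non-overlapping occurrences of a substring.
print('aaaa'.count('aa'))
# 2
```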