note.nkmk.me

Check if the list contains duplicate elements in Python

Posted: 2020-12-09 / Tags: Python, List

This article describes how to check if a list contains duplicate elements (in other words, whether all elements are unique) in Python for the following cases:

  • The list does not contain unhashable objects
  • The list contains unhashable objects

See the following article for how to remove or extract duplicate elements from the list.

Check if the list contains duplicate elements (no unhashable objects)

Use set() if the list does not contain unhashable objects such as list. Passing a list to set() returns a set, which ignores duplicate values and keeps only unique values as elements.

Get the number of elements of this set and the original list with the built-in function len() and compare.

If the two counts are equal, the original list has no duplicate elements; if they differ, the original list contains duplicate elements.

The following function returns False when there are no duplicate elements and True when there are duplicate elements:

def has_duplicates(seq):
    return len(seq) != len(set(seq))

l = [0, 1, 2]
print(has_duplicates(l))
# False

l = [0, 1, 1, 2]
print(has_duplicates(l))
# True

The sample code above uses list, but the same function can be used with tuple.
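
As a quick check of that claim, the sketch below calls the same function on tuples. The string case is an assumption that goes beyond the original text: set() accepts any iterable of hashable items, so a string is treated as a sequence of characters.

```python
def has_duplicates(seq):
    # A set keeps only unique values, so a size mismatch means duplicates.
    return len(seq) != len(set(seq))

t = (0, 1, 2)
print(has_duplicates(t))
# False

t = (0, 1, 1, 2)
print(has_duplicates(t))
# True

s = 'abca'  # not in the original article: characters are compared one by one
print(has_duplicates(s))
# True
```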

Since a set cannot contain unhashable objects such as list, a TypeError is raised for a list containing lists (a two-dimensional list, or list of lists).

l_2d = [[0, 1], [1, 1], [0, 1], [1, 0]]
# print(has_duplicates(l_2d))
# TypeError: unhashable type: 'list'
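
If the input may or may not contain lists, one option is to catch the error. The following is a minimal sketch, not part of the original article. It also shows a possible workaround, assuming every inner element is a flat list of hashable values: convert each inner list to a tuple, which is hashable, before calling the function.

```python
def has_duplicates(seq):
    return len(seq) != len(set(seq))

l_2d = [[0, 1], [1, 1], [0, 1], [1, 0]]

try:
    print(has_duplicates(l_2d))
except TypeError:
    # set() cannot store lists, so the check cannot run on this input.
    print('contains unhashable elements')
# contains unhashable elements

# Assumed workaround: tuples are hashable, so set() accepts them.
as_tuples = [tuple(x) for x in l_2d]
print(has_duplicates(as_tuples))
# True
```

Here [0, 1] appears twice in the outer list, so the tuple-based check returns True.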

Check if the list contains duplicate elements (with unhashable objects)

For a list that contains lists, the following function checks whether there are duplicate elements:

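def has_duplicates2(seq):
    seen = []
    # list.append() returns None, so `not seen.append(x)` is always True;
    # its side effect records x in seen the first time x appears.
    unique_list = [x for x in seq if x not in seen and not seen.append(x)]
    return len(seq) != len(unique_list)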

l_2d = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(has_duplicates2(l_2d))
# False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 1]]
print(has_duplicates2(l_2d))
# True

This function generates a list containing only unique values using a list comprehension instead of set(), and compares the number of elements. See the following article for details.

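This function also works for lists that do not contain unhashable objects such as lists. For such lists the set()-based version is usually faster, because each `x not in seen` test scans the seen list from the start.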

l = [0, 1, 2]
print(has_duplicates2(l))
# False

l = [0, 1, 1, 2]
print(has_duplicates2(l))
# True
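
The list comprehension in has_duplicates2() can also be written as an explicit loop, which may be easier to read. The sketch below uses a hypothetical name, has_duplicates_loop(), that does not appear in the original article. Unlike has_duplicates2(), it returns as soon as the first duplicate is found instead of building the whole unique list.

```python
def has_duplicates_loop(seq):
    seen = []
    for x in seq:
        # `in` compares with ==, so it works for unhashable items such as lists.
        if x in seen:
            return True
        seen.append(x)
    return False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(has_duplicates_loop(l_2d))
# False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 1]]
print(has_duplicates_loop(l_2d))
# True
```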

The above example checks whether the outer list contains the same inner list more than once. To check whether individual elements are duplicated across the inner lists, flatten the original list to one dimension and then check the result for duplicates.

l_2d = [[0, 1], [2, 3]]
print(sum(l_2d, []))
# [0, 1, 2, 3]

print(has_duplicates(sum(l_2d, [])))
# False

l_2d = [[0, 1], [2, 0]]
print(has_duplicates(sum(l_2d, [])))
# True

In this example, sum() is used to flatten the list, but you can also use itertools.chain.from_iterable(). To flatten a list with three or more dimensions, you need to define a new function. See the following article.
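
The itertools.chain.from_iterable() alternative might look like the sketch below. The flatten() helper for deeper nesting is a hypothetical illustration, not the function from the referenced article; it assumes that only list objects should be expanded.

```python
import itertools

def has_duplicates(seq):
    return len(seq) != len(set(seq))

l_2d = [[0, 1], [2, 0]]
flat = list(itertools.chain.from_iterable(l_2d))
print(flat)
# [0, 1, 2, 0]

print(has_duplicates(flat))
# True

def flatten(nested):
    # Hypothetical helper: recursively yields every non-list item.
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

l_3d = [[[0, 1], [2]], [[3, 0]]]
print(has_duplicates(list(flatten(l_3d))))
# True
```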
