The in operator in Python (for list, string, dictionary, etc.)


In Python, the in and not in operators test membership in lists, tuples, dictionaries, and so on.

The in keyword is also used in for statements and list comprehensions, which are covered in the last section of this article.

How to use the in operator

Basic usage

x in y returns True if x is included in y, and False otherwise.

print(1 in [0, 1, 2])
# True

print(100 in [0, 1, 2])
# False
source: in_basic.py

The in operator can be used not only with list, but also with other iterable objects such as tuple, set, and range.

print(1 in (0, 1, 2))
# True

print(1 in {0, 1, 2})
# True

print(1 in range(3))
# True
source: in_basic.py
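
The in operator is not limited to built-in types. As a minimal sketch (the class name EvenNumbers is hypothetical and not part of the original examples), a user-defined class can support in by defining the __contains__() method:

```python
class EvenNumbers:
    """Hypothetical container representing all even integers."""

    def __contains__(self, item):
        # Called when evaluating `item in EvenNumbers()`.
        return isinstance(item, int) and item % 2 == 0


evens = EvenNumbers()

print(2 in evens)
# True

print(3 in evens)
# False

print(3 not in evens)
# True
```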

Details about dictionaries (dict) and strings (str) are described later.

Test for value equality

The in operator tests for value equality. It returns True if the values are equal, even if their types are different.

print(1.0 == 1)
# True

print(1.0 in [0, 1, 2])
# True

print(True == 1)
# True

print(True in [0, 1, 2])
# True
source: in_basic.py

Note that bool is a subclass of int, so True and False are equivalent to 1 and 0.
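
For built-in containers such as list, the Python language reference describes x in y as equivalent to any(x is e or x == e for e in y). The helper contains() below is a hypothetical name used only to illustrate that rule; it is a sketch, not how CPython actually implements in.

```python
def contains(x, y):
    """Illustrative re-implementation of `x in y` for containers like list."""
    return any(x is e or x == e for e in y)


l = [0, 1, 2]

print(contains(1.0, l), 1.0 in l)
# True True

print(contains(True, l), True in l)
# True True

print(contains(100, l), 100 in l)
# False False

# Identity is checked before equality, so an object that is not equal
# to itself (such as NaN) is still found when the same object is stored.
nan = float('nan')
print(nan == nan)
# False

print(nan in [nan])
# True
```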

With the if statement

Since the in operator returns a boolean value (True or False), it can be used directly in if statements.

l = [0, 1, 2]
i = 0

if i in l:
    print(f'{i} is a member of {l}.')
else:
    print(f'{i} is not a member of {l}.')
# 0 is a member of [0, 1, 2].
source: in_basic.py
l = [0, 1, 2]
i = 100

if i in l:
    print(f'{i} is a member of {l}.')
else:
    print(f'{i} is not a member of {l}.')
# 100 is not a member of [0, 1, 2].
source: in_basic.py

Note that lists, tuples, strings, and other built-in collections evaluate to False if empty and to True otherwise. To check whether such an object is empty, use the object itself as the condition.

l = [0, 1, 2]

if l:
    print(f'{l} is not empty.')
else:
    print(f'{l} is empty.')
# [0, 1, 2] is not empty.
source: in_basic.py
l = []

if l:
    print(f'{l} is not empty.')
else:
    print(f'{l} is empty.')
# [] is empty.
source: in_basic.py


in for the dictionary (dict)

The in operator for dictionaries (dict) checks for the presence of a key.

d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}

print('key1' in d)
# True

print('value1' in d)
# False
source: in_basic.py

Use the values() and items() methods to test for the presence of values or key-value pairs.

print('value1' in d.values())
# True

print(('key1', 'value1') in d.items())
# True

print(('key1', 'value2') in d.items())
# False
source: in_basic.py
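
As a small supplementary sketch using the same dictionary, key in d gives the same result as key in d.keys(), and in can guard a lookup that would otherwise raise KeyError. The key 'key4' is chosen here only for illustration.

```python
d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}

print('key1' in d.keys())
# True

key = 'key4'  # illustrative key that is not in d

if key in d:
    print(d[key])
else:
    print(f'{key} is not a key of d.')
# key4 is not a key of d.
```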


in for the string (str)

The in operator for strings (str) checks for the presence of a substring.

print('a' in 'abc')
# True

print('x' in 'abc')
# False

print('ab' in 'abc')
# True

print('ac' in 'abc')
# False
source: in_basic.py
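
Two details of substring tests are easy to overlook: the comparison is case-sensitive, and the empty string counts as a substring of any string. The following sketch illustrates both; lowercasing both sides is just one possible way to get a case-insensitive test.

```python
s = 'abc'

# The comparison is case-sensitive.
print('A' in s)
# False

# Normalize the case of both sides for a case-insensitive test.
print('A'.lower() in s.lower())
# True

# The empty string is considered a substring of any string.
print('' in s)
# True
```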


not in (negation of in)

x not in y returns the opposite result of x in y.

print(10 in [1, 2, 3])
# False

print(10 not in [1, 2, 3])
# True
source: in_basic.py

The same result is returned by adding not to the entire in operation.

print(not 10 in [1, 2, 3])
# True
source: in_basic.py

However, using not with the entire in operation can lead to ambiguity. To avoid this, it's recommended to use the more explicit not in instead.

print(not (10 in [1, 2, 3]))
# True

print((not 10) in [1, 2, 3])
# False
source: in_basic.py

Since in has higher precedence than not, the expression is interpreted as the former if there are no parentheses.

The latter case is evaluated as follows.

print(not 10)
# False

print(False in [1, 2, 3])
# False
source: in_basic.py
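
One way to confirm how Python groups these expressions is to inspect the parsed syntax tree with the standard ast module. This is only an illustrative check, and the variable names are arbitrary: not 10 in [1, 2, 3] is a not applied to an in comparison, whereas 10 not in [1, 2, 3] is a single comparison using the not in operator.

```python
import ast

# `not 10 in [1, 2, 3]`: a unary `not` wrapping an `in` comparison.
tree1 = ast.parse('not 10 in [1, 2, 3]', mode='eval')
print(type(tree1.body).__name__)
# UnaryOp
print(type(tree1.body.operand).__name__)
# Compare

# `10 not in [1, 2, 3]`: one comparison with the `not in` operator.
tree2 = ast.parse('10 not in [1, 2, 3]', mode='eval')
print(type(tree2.body).__name__)
# Compare
print(type(tree2.body.ops[0]).__name__)
# NotIn
```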

in for multiple elements

To check whether multiple elements are included, using a list of those elements as the left operand, as shown below, does not work. It tests whether the list itself is an element of the target.

print([0, 1] in [0, 1, 2])
# False

print([0, 1] in [[0, 1], [1, 0]])
# True
source: in_basic.py

Instead, use and, or, or sets.

Use and and or

Combine multiple in operations using and and or. It will test whether both or either of the elements are included.

l = [0, 1, 2]
v1 = 0
v2 = 100

print(v1 in l and v2 in l)
# False

print(v1 in l or v2 in l)
# True

print((v1 in l) or (v2 in l))
# True
source: in_basic.py

Since in and not in have higher precedence than and and or, parentheses are unnecessary. However, if readability is an issue, you can enclose the expression in parentheses, as shown in the last example.
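
When the values to check are held in a collection rather than in separate variables, the built-in all() and any() functions express the same and / or logic. This is a supplementary sketch; the list name values is illustrative.

```python
l = [0, 1, 2]
values = [0, 1, 100]  # illustrative values to look for

# True only if every value is in l (like chaining with `and`).
print(all(v in l for v in values))
# False

# True if at least one value is in l (like chaining with `or`).
print(any(v in l for v in values))
# True
```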

Use sets

If you have many elements to check, using sets is more convenient than and and or.

For example, to check whether list A contains all elements of list B, you can test whether list B is a subset of list A.

l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l3 = [0, 1, 5]
l4 = [5, 6, 7]

print(set(l2) <= set(l1))
# True

print(set(l3) <= set(l1))
# False
source: in_basic.py

To check if list A does not contain any elements of list B, you can test whether list A and list B are disjoint.

print(set(l1).isdisjoint(set(l4)))
# True
source: in_basic.py

If list A and list B are not disjoint, it means that list A contains at least one element of list B.

print(not set(l1).isdisjoint(set(l3)))
# True
source: in_basic.py
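
As a supplementary sketch, the issubset() and isdisjoint() methods accept any iterable as their argument, so only one side needs to be converted to a set. Note also that set elements must be hashable, so the set-based approach does not work for lists that contain unhashable objects such as other lists.

```python
l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l4 = [5, 6, 7]

# The methods accept any iterable, so only one side needs to be a set.
print(set(l2).issubset(l1))
# True

print(set(l1).isdisjoint(l4))
# True

# Converting a list of lists to a set fails because lists are unhashable.
try:
    set([[0, 1], [1, 0]])
except TypeError as e:
    print(type(e).__name__)
# TypeError
```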

Sets can also be used to extract common elements from multiple lists.

Time complexity of in

The execution speed of the in operator depends on the type of the target object.

This section shows the results of measuring the execution times of in for lists, sets, and dictionaries. The following examples use the Jupyter Notebook magic command %%timeit. Note that these will not work if run as Python scripts.

Consider two lists: one with 10 elements and another with 10,000 elements.

n_small = 10
n_large = 10000

l_small = list(range(n_small))
l_large = list(range(n_large))
source: in_timeit.py

The %%timeit results in the following sections were measured on CPython 3.7.4. Results may vary depending on the environment.
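
Outside Jupyter Notebook, a comparable measurement can be made with the standard timeit module. The following is a rough sketch rather than the setup used for the measurements in this article: the repetition count is arbitrary, and no output is shown because the timings depend entirely on the environment.

```python
import timeit

n_large = 10000
l_large = list(range(n_large))
s_large = set(l_large)

number = 1000  # arbitrary repetition count for this sketch

# timeit.timeit() returns the total seconds for `number` executions.
t_list = timeit.timeit('-1 in l_large',
                       globals={'l_large': l_large}, number=number)
t_set = timeit.timeit('-1 in s_large',
                      globals={'s_large': s_large}, number=number)

print(f'list: {t_list / number * 1e9:.1f} ns per loop')
print(f'set:  {t_set / number * 1e9:.1f} ns per loop')
```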

Slow for lists: O(n)

The average time complexity of the in operator for lists is O(n). It becomes slower as the number of elements increases.

%%timeit
-1 in l_small
# 178 ns ± 4.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%%timeit
-1 in l_large
# 128 µs ± 11.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
source: in_timeit.py

Execution time varies greatly depending on the position of the target value. It takes the longest when the value is at the end or does not exist.

%%timeit
0 in l_large
# 33.4 ns ± 0.397 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
5000 in l_large
# 66.1 µs ± 4.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
9999 in l_large
# 127 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
source: in_timeit.py

Fast for sets: O(1)

The average time complexity of the in operator for sets is O(1). It does not depend on the number of elements.

s_small = set(l_small)
s_large = set(l_large)

%%timeit
-1 in s_small
# 40.4 ns ± 0.572 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
-1 in s_large
# 39.4 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
source: in_timeit.py

Execution time is largely independent of the value being searched for.

%%timeit
0 in s_large
# 39.7 ns ± 1.27 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
5000 in s_large
# 53.1 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%%timeit
9999 in s_large
# 52.4 ns ± 0.403 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
source: in_timeit.py

If you need to perform the in operation repeatedly on a list with many elements, it's more efficient to convert the list to a set beforehand.

%%timeit
for i in range(n_large):
    i in l_large
# 643 ms ± 29.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
s_large_ = set(l_large)
for i in range(n_large):
    i in s_large_
# 746 µs ± 6.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
source: in_timeit.py

Note that it takes time to convert a list to a set, so it may be faster to keep it as a list if the number of in operations is small.
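
In practice, this usually means building the set once, outside the loop, and reusing it for every lookup. A minimal sketch, with an illustrative list of values to look up:

```python
l_large = list(range(10000))
queries = [-1, 0, 5000, 9999, 10000]  # illustrative values to look up

# Build the set once, then reuse it for every membership test.
s_large = set(l_large)
found = [q for q in queries if q in s_large]

print(found)
# [0, 5000, 9999]
```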

For dictionaries

Take the following dictionary as an example.

d = dict(zip(l_large, l_large))
print(len(d))
# 10000

print(d[0])
# 0

print(d[9999])
# 9999
source: in_timeit.py

As mentioned above, the in operator for dictionaries tests keys.

Like the elements of a set, dictionary keys are unique, and the execution time is about the same as for sets.

%%timeit
for i in range(n_large):
    i in d
# 756 µs ± 24.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
source: in_timeit.py

On the other hand, dictionary values can contain duplicates, like the elements of a list. The execution time of in for values() is about the same as for lists.

dv = d.values()

%%timeit
for i in range(n_large):
    i in dv
# 990 ms ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
source: in_timeit.py

Key-value pairs are unique. The execution time of in for items() is about the same as for sets, plus a small overhead.

di = d.items()

%%timeit
for i in range(n_large):
    (i, i) in di
# 1.18 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
source: in_timeit.py
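
If values need to be tested repeatedly, one option is to convert them to a set beforehand, as with lists. The sketch below assumes that every value is hashable; the name values_set is illustrative. Note that the set is a snapshot and, unlike the view returned by values(), does not follow later changes to the dictionary.

```python
d = dict(zip(range(10000), range(10000)))

# One-time conversion; this assumes every value is hashable.
values_set = set(d.values())

print(9999 in values_set)
# True

print(-1 in values_set)
# False

# The set is a snapshot: it does not reflect later changes to d.
d[10000] = 10000
print(10000 in values_set)
# False

print(10000 in d.values())
# True
```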

in with for statements and list comprehensions

The in keyword is also used in for statements and list comprehensions.

l = [0, 1, 2]

for i in l:
    print(i)
# 0
# 1
# 2
source: in_basic.py
print([i * 10 for i in l])
# [0, 10, 20]
source: in_basic.py


Note that the in operator can also be used as a condition in list comprehensions, which can be confusing because in then appears twice with different meanings.

l = ['oneXXXaaa', 'twoXXXbbb', 'three999aaa', '000111222']

l_in = [s for s in l if 'XXX' in s]
print(l_in)
# ['oneXXXaaa', 'twoXXXbbb']

The first in is part of the list comprehension syntax, and the second in is the membership operator.
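
Writing the same logic as an ordinary for loop may make the two roles easier to tell apart. This is an equivalent sketch of the comprehension with the same data:

```python
l = ['oneXXXaaa', 'twoXXXbbb', 'three999aaa', '000111222']

l_in = []
for s in l:            # `in` as part of the for statement
    if 'XXX' in s:     # `in` as the membership operator
        l_in.append(s)

print(l_in)
# ['oneXXXaaa', 'twoXXXbbb']
```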
