The in operator in Python (for list, string, dictionary, etc.)
In Python, the in and not in operators test membership in lists, tuples, dictionaries, and so on.
The in keyword is also used in for statements and list comprehensions, which are covered at the end of this article.
How to use the in operator
Basic usage
x in y returns True if x is included in y, and False otherwise.
print(1 in [0, 1, 2])
# True
print(100 in [0, 1, 2])
# False
The in operator can be used not only with list, but also with other iterable objects such as tuple, set, and range.
print(1 in (0, 1, 2))
# True
print(1 in {0, 1, 2})
# True
print(1 in range(3))
# True
Details about dictionaries (dict) and strings (str) are described later.
Test for value equality
The in operator tests for value equality. It returns True if the values are equal, even if their types are different.
print(1.0 == 1)
# True
print(1.0 in [0, 1, 2])
# True
print(True == 1)
# True
print(True in [0, 1, 2])
# True
Note that bool is a subclass of int, so True and False are equivalent to 1 and 0.
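As a supplementary sketch: for built-in containers such as lists, an element also counts as a match when it is the very same object, even if == would return False. NaN, which is not equal to itself, makes this visible. The contains() helper below is a hypothetical name that only approximates the behavior for illustration.

```python
nan = float('nan')

# NaN is not equal to itself.
print(nan == nan)
# False

# The same object is still found, because identity is checked as well.
print(nan in [nan])
# True

# A different NaN object is neither identical nor equal.
print(float('nan') in [nan])
# False

# Roughly what the membership test does for built-in sequences (illustrative helper).
def contains(x, iterable):
    return any(x is e or x == e for e in iterable)

print(contains(1.0, [0, 1, 2]))
# True
```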
With the if statement
Since the in operator returns a boolean value (True or False), it can be used directly in if statements.
l = [0, 1, 2]
i = 0
if i in l:
    print(f'{i} is a member of {l}.')
else:
    print(f'{i} is not a member of {l}.')
# 0 is a member of [0, 1, 2].
l = [0, 1, 2]
i = 100
if i in l:
    print(f'{i} is a member of {l}.')
else:
    print(f'{i} is not a member of {l}.')
# 100 is not a member of [0, 1, 2].
Note that lists, tuples, strings, and other built-in collections evaluate to False if empty and to True otherwise. To check whether an object is empty, use the object itself as the condition.
l = [0, 1, 2]
if l:
    print(f'{l} is not empty.')
else:
    print(f'{l} is empty.')
# [0, 1, 2] is not empty.
l = []
if l:
    print(f'{l} is not empty.')
else:
    print(f'{l} is empty.')
# [] is empty.
in for the dictionary (dict)
The in operator for dictionaries (dict) checks for the presence of a key.
d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
print('key1' in d)
# True
print('value1' in d)
# False
Use the values() and items() methods to test for the presence of values or key-value pairs.
print('value1' in d.values())
# True
print(('key1', 'value1') in d.items())
# True
print(('key1', 'value2') in d.items())
# False
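As a small sketch of how the key test is typically used: testing the dictionary itself gives the same result as testing its keys() view, and the test can guard a lookup that would otherwise raise KeyError. The dictionary is redefined here so the block runs on its own; the names key and result are illustrative.

```python
d = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}

# Testing the dict itself is the same as testing its keys.
print(('key1' in d) == ('key1' in d.keys()))
# True

# Guard a lookup with in to avoid KeyError for a missing key.
key = 'key4'
if key in d:
    result = d[key]
else:
    result = None
print(result)
# None

# The get() method expresses the same fallback in one call.
print(d.get(key))
# None
```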
in for the string (str)
The in operator for strings (str) checks for the presence of a substring.
print('a' in 'abc')
# True
print('x' in 'abc')
# False
print('ab' in 'abc')
# True
print('ac' in 'abc')
# False
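A few details that follow from the substring behavior, shown as a brief sketch: the empty string is a substring of every string, the test is case-sensitive, and the left operand must itself be a string. Lowercasing with lower() is just one simple way to ignore case and is an assumption that suits plain ASCII text.

```python
s = 'Hello, World'

# The empty string is contained in every string.
print('' in s)
# True

# The test is case-sensitive.
print('world' in s)
# False

# One simple way to ignore case: compare lowercased text.
print('world' in s.lower())
# True

# A non-string left operand raises TypeError.
try:
    1 in 'abc'
except TypeError as e:
    print(type(e).__name__)
# TypeError
```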
not in (negation of in)
x not in y returns the opposite result of x in y.
print(10 in [1, 2, 3])
# False
print(10 not in [1, 2, 3])
# True
The same result is obtained by applying not to the entire in operation.
print(not 10 in [1, 2, 3])
# True
However, applying not to the entire in operation can be read in two ways, as the following two expressions show. To avoid this ambiguity, the more explicit not in is recommended.
print(not (10 in [1, 2, 3]))
# True
print((not 10) in [1, 2, 3])
# False
Since in has higher precedence than not, the expression is interpreted as the former if there are no parentheses.
The latter case is evaluated as follows.
print(not 10)
# False
print(False in [1, 2, 3])
# False
in for multiple elements
If you need to check for the inclusion of multiple elements, using a list of those elements, as shown below, will not work. It will test whether the list itself is included or not.
print([0, 1] in [0, 1, 2])
# False
print([0, 1] in [[0, 1], [1, 0]])
# True
Instead, use and, or, or sets.
Use and and or
Combine multiple in operations using and and or to test whether both or either of the elements are included.
l = [0, 1, 2]
v1 = 0
v2 = 100
print(v1 in l and v2 in l)
# False
print(v1 in l or v2 in l)
# True
print((v1 in l) or (v2 in l))
# True
Since in and not in have higher precedence than and and or, parentheses are unnecessary. However, if readability is an issue, you can enclose the expression in parentheses, as shown in the last example.
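When the values to check are held in a list rather than in separate variables, the same and/or logic can be written with the built-in all() and any() functions and a generator expression. This is a minimal sketch; the list name values is chosen for illustration.

```python
l = [0, 1, 2]
values = [0, 1, 100]

# Equivalent to chaining the tests with and: every value must be in l.
print(all(v in l for v in values))
# False

# Equivalent to chaining the tests with or: at least one value must be in l.
print(any(v in l for v in values))
# True

# List the values that are missing from l.
print([v for v in values if v not in l])
# [100]
```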
Use sets
If you have many elements to check, using sets is more convenient than and and or.
For example, to check whether list A contains all elements of list B, you can test whether list B is a subset of list A.
l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l3 = [0, 1, 5]
l4 = [5, 6, 7]
print(set(l2) <= set(l1))
# True
print(set(l3) <= set(l1))
# False
To check if list A does not contain any elements of list B, you can test whether list A and list B are disjoint.
print(set(l1).isdisjoint(set(l4)))
# True
If list A and list B are not disjoint, it means that list A contains at least one element of list B.
print(not set(l1).isdisjoint(set(l3)))
# True
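The set-based checks can be wrapped in small helpers, sketched below. The function names contains_all and contains_any are hypothetical. Because issuperset() and isdisjoint() accept any iterable as their argument, only the first list needs to be converted. Note that this approach assumes all elements are hashable; a list of lists, for example, cannot be converted to a set.

```python
def contains_all(a, b):
    """Return True if every element of b is in a."""
    return set(a).issuperset(b)

def contains_any(a, b):
    """Return True if at least one element of b is in a."""
    return not set(a).isdisjoint(b)

l1 = [0, 1, 2, 3, 4]
l2 = [0, 1, 2]
l3 = [0, 1, 5]
l4 = [5, 6, 7]

print(contains_all(l1, l2))
# True
print(contains_all(l1, l3))
# False
print(contains_any(l1, l3))
# True
print(contains_any(l1, l4))
# False
```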
Sets can also be used to extract common elements from multiple lists.
Time complexity of in
The execution speed of the in operator depends on the type of the target object.
This section shows the results of measuring the execution times of in for lists, sets, and dictionaries. The following examples use the Jupyter Notebook magic command %%timeit. Note that these will not work if run as Python scripts.
Consider two lists: one with 10 elements and another with 10,000 elements.
n_small = 10
n_large = 10000
l_small = list(range(n_small))
l_large = list(range(n_large))
The sample code below is executed in CPython 3.7.4. Results may vary depending on the environment.
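For readers who want to run comparable measurements as a plain Python script, the standard library's timeit module can be used instead of %%timeit. The following is a rough sketch, not the method used for the figures in this article; the loop count of 1000 is an arbitrary choice, and the printed times will differ by machine and Python version.

```python
import timeit

n_large = 10000
l_large = list(range(n_large))
s_large = set(l_large)

# Number of repetitions per measurement (arbitrary choice for this sketch).
number = 1000

# Pass the objects explicitly so the timed statements can see them.
t_list = timeit.timeit('-1 in l_large', globals={'l_large': l_large}, number=number)
t_set = timeit.timeit('-1 in s_large', globals={'s_large': s_large}, number=number)

# timeit.timeit() returns the total time in seconds for all repetitions.
print(f'list: {t_list / number * 1e9:.0f} ns per loop')
print(f'set:  {t_set / number * 1e9:.0f} ns per loop')
```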
Slow for lists: O(n)
The average time complexity of the in operator for lists is O(n). It becomes slower as the number of elements increases.
%%timeit
-1 in l_small
# 178 ns ± 4.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%%timeit
-1 in l_large
# 128 µs ± 11.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Execution time varies greatly depending on the position of the target value. It takes the longest when the value is at the end or does not exist.
%%timeit
0 in l_large
# 33.4 ns ± 0.397 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
5000 in l_large
# 66.1 µs ± 4.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
9999 in l_large
# 127 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Fast for sets: O(1)
The average time complexity of the in operator for sets is O(1). It does not depend on the number of elements.
s_small = set(l_small)
s_large = set(l_large)
%%timeit
-1 in s_small
# 40.4 ns ± 0.572 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
-1 in s_large
# 39.4 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Execution time is nearly independent of the value being searched for.
%%timeit
0 in s_large
# 39.7 ns ± 1.27 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
5000 in s_large
# 53.1 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
9999 in s_large
# 52.4 ns ± 0.403 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
If you need to perform the in operation repeatedly on a list with many elements, it's more efficient to convert the list to a set beforehand.
%%timeit
for i in range(n_large):
    i in l_large
# 643 ms ± 29.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
s_large_ = set(l_large)
for i in range(n_large):
    i in s_large_
# 746 µs ± 6.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Note that it takes time to convert a list to a set, so it may be faster to keep it as a list if the number of in operations is small.
For dictionaries
Take the following dictionary as an example.
d = dict(zip(l_large, l_large))
print(len(d))
# 10000
print(d[0])
# 0
print(d[9999])
# 9999
As mentioned above, the in operation for a dictionary tests its keys.
Dictionary keys are unique and looked up by hash, just like set elements, so the execution time is about the same as for sets.
%%timeit
for i in range(n_large):
    i in d
# 756 µs ± 24.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
On the other hand, dictionary values may contain duplicates, like a list. The execution time of in for values() is about the same as for lists.
dv = d.values()
%%timeit
for i in range(n_large):
    i in dv
# 990 ms ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Key-value pairs are unique. The execution time of in for items() is about the same as for sets, plus a small overhead.
di = d.items()
%%timeit
for i in range(n_large):
    (i, i) in di
# 1.18 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
in with for statements and list comprehensions
The in keyword is also used in for statements and list comprehensions.
l = [0, 1, 2]
for i in l:
    print(i)
# 0
# 1
# 2
print([i * 10 for i in l])
# [0, 10, 20]
Note that the in operator can also appear in the condition of a list comprehension, which can be confusing because in then occurs twice in the same expression.
l = ['oneXXXaaa', 'twoXXXbbb', 'three999aaa', '000111222']
l_in = [s for s in l if 'XXX' in s]
print(l_in)
# ['oneXXXaaa', 'twoXXXbbb']
The first in is part of the list comprehension syntax, and the second in is the in operator.