Search for a String in Python (Check If a Substring Is Included/Get a Substring Position)

Modified: 2023-05-07 | Tags: Python, String, Regex

This article explains how to search a string in Python: how to check whether it contains a specific substring and how to get the position of that substring. For more flexible searches, you can use regular expressions with the re module in the standard library.

Contents

Check if a string contains a given substring: in
Get the position (index) of a given substring: find(), rfind()
Case-insensitive search
Check and get a position with regex: re.search()
Get all results with regex: re.findall(), re.finditer()
Search multiple strings with regex
Use special characters and sequences
Case-insensitive search with regex: re.IGNORECASE

See the following article on how to count specific characters or substrings in a string.

Count characters and strings in Python

If you want to search the contents of a text file, read the file as a string.

Read, write, and create files in Python (with and open())

Check if a string contains a given substring: `in`

Use the in operator to check if a string contains a given substring.

The in operator is case-sensitive, and the same applies to the string methods described below. To check for multiple substrings, combine in expressions with and (every substring must be present) or or (at least one must be present).

Boolean operators in Python (and, or, not)

s = 'I am Sam'

print('Sam' in s)
# True

print('sam' in s)
# False

print('I' in s and 'Sam' in s)
# True

source: str_in_find_rfind.py
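When the substrings to check come from a list, chaining and and or by hand gets unwieldy. One possible approach, sketched below, is to combine the in operator with the built-in all() and any() functions. The keywords list and the variable names are made up for illustration.

```python
s = 'I am Sam'
keywords = ['I', 'Sam', 'Bob']  # hypothetical list of substrings to look for

# True only if every keyword occurs in s
contains_all = all(k in s for k in keywords)

# True if at least one keyword occurs in s
contains_any = any(k in s for k in keywords)

print(contains_all)
# False

print(contains_any)
# True
```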

For more complex operations, consider using regular expressions, as described in the following sections.

Note that the in operator can also be used for lists, tuples, and dictionaries. See the following article for details.

The in operator in Python (for list, string, dictionary, etc.)

Get the position (index) of a given substring: `find()`, `rfind()`

You can get the position of a given substring in the string with the find() method of str.

Built-in Types - str.find() — Python 3.11.3 documentation

If the substring specified as the first argument is found, the method returns its starting position (the position of the first character); if not found, -1 is returned.

s = 'I am Sam'

print(s.find('Sam'))
# 5

print(s.find('XXX'))
# -1

source: str_in_find_rfind.py

In Python, the index of the first character in a string is 0.

I am Sam
01234567

If there are multiple occurrences of the substring, the position of the first occurrence (the leftmost substring) is returned.

To find all occurrences, you can call find() repeatedly while moving the start argument forward; however, the regex approach described below is usually more convenient.

print(s.find('am'))
# 2

source: str_in_find_rfind.py

By specifying the second argument start and the third argument end, the search is limited to the range of the slice [start:end]. The returned position is still an index into the whole string, not into the slice.

How to slice a list, string, tuple in Python

print(s.find('am', 3))
# 6

print(s.find('am', 3, 5))
# -1

source: str_in_find_rfind.py
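As a rough sketch of how the start argument can be used to collect every occurrence, the following helper calls find() in a loop. The function name find_all is hypothetical and not part of the standard library. Advancing by one character after each hit means overlapping occurrences are reported too.

```python
def find_all(text, sub):
    """Return the start positions of every occurrence of sub in text."""
    if not sub:
        # An empty substring matches everywhere; return nothing instead
        return []
    positions = []
    start = 0
    while True:
        pos = text.find(sub, start)
        if pos == -1:
            break
        positions.append(pos)
        start = pos + 1  # move one character forward so overlaps are found
    return positions

print(find_all('I am Sam', 'am'))
# [2, 6]

print(find_all('aaaa', 'aa'))
# [0, 1, 2]
```

If overlapping occurrences should be skipped, advance by len(sub) instead of 1.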

The rfind() method searches the string starting from the right side.

Built-in Types - str.rfind() — Python 3.11.3 documentation

If the substring occurs more than once, the position of the rightmost occurrence is returned. As with find(), you can also specify the start and end arguments for rfind().

print(s.rfind('am'))
# 6

print(s.rfind('XXX'))
# -1

print(s.rfind('am', 2))
# 6

print(s.rfind('am', 2, 5))
# 2

source: str_in_find_rfind.py

The index() and rindex() methods work like find() and rfind(), except for what happens when the substring does not exist: find() and rfind() return -1, whereas index() and rindex() raise ValueError.

print(s.index('am'))
# 2

# print(s.index('XXX'))
# ValueError: substring not found

print(s.rindex('am'))
# 6

# print(s.rindex('XXX'))
# ValueError: substring not found

source: str_in_find_rfind.py
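The difference between the two families matters when the result is used in a condition. The sketch below shows a common pitfall (testing the return value of find() directly) and two ways to handle a missing substring. The variable names are illustrative only.

```python
s = 'I am Sam'

# Pitfall: -1 (not found) is truthy, and 0 (found at the start) is falsy
print(bool(s.find('XXX')))
# True

print(bool(s.find('I')))
# False

# Compare with -1 explicitly (or simply use the in operator)
found_at_start = s.find('I') != -1
print(found_at_start)
# True

# With index(), handle the missing case with try/except
try:
    pos = s.index('XXX')
except ValueError:
    pos = None

print(pos)
# None
```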

Case-insensitive search

Note that the in operator and the string methods mentioned so far are case-sensitive.

For case-insensitive searches, you can convert both the search string and target string to uppercase or lowercase. Use the upper() method to convert a string to uppercase, and the lower() method to convert it to lowercase.

Uppercase and lowercase strings in Python (conversion and checking)

s = 'I am Sam'

print(s.upper())
# I AM SAM

print(s.lower())
# i am sam

print('sam' in s)
# False

print('sam' in s.lower())
# True

print(s.find('sam'))
# -1

print(s.lower().find('sam'))
# 5

source: str_in_find_rfind.py
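For text that may contain non-ASCII characters, lower() is not always enough. The str.casefold() method applies a more aggressive normalization intended for caseless comparison. The example below is a minimal sketch with a made-up German sentence.

```python
s = 'Die Straße ist lang'
query = 'STRASSE'

# lower() keeps 'ß' as it is, so the substring is not found
lower_hit = query.lower() in s.lower()
print(lower_hit)
# False

# casefold() converts 'ß' to 'ss', so the substring is found
casefold_hit = query.casefold() in s.casefold()
print(casefold_hit)
# True
```

Note that casefold() can change the length of the string (here 'ß' becomes 'ss'), so positions obtained from find() on the case-folded string do not necessarily correspond to positions in the original string.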

Check and get a position with regex: `re.search()`

Use regular expressions with the re module of the standard library.

Regular expressions with the re module in Python

Use re.search() to check whether a string contains a match for a regex pattern.

The first argument is a regex pattern, and the second is a target string. Although special characters and sequences can be used in the regex pattern, the following example demonstrates the simplest pattern by using the string as it is.

If the pattern matches, a match object is returned; otherwise, None is returned.

import re

s = 'I am Sam'

print(re.search('Sam', s))
# <re.Match object; span=(5, 8), match='Sam'>

print(re.search('XXX', s))
# None

source: str_search_regex.py

You can get various information with the methods of the match object.

How to use regex match objects in Python

group() returns the matched string, start() returns the start position, end() returns the end position, and span() returns a tuple of (start position, end position).

m = re.search('Sam', s)

print(m.group())
# Sam

print(m.start())
# 5

print(m.end())
# 8

print(m.span())
# (5, 8)

source: str_search_regex.py
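Because re.search() returns None when nothing matches, calling group() or start() on the result without checking raises AttributeError. A minimal sketch of guarding against that is shown below. The helper name describe is hypothetical.

```python
import re

s = 'I am Sam'

def describe(pattern, text):
    """Return a short message about where pattern first matches in text."""
    m = re.search(pattern, text)
    if m is None:
        # No match: m has no group(), start(), end(), or span()
        return f'{pattern!r} not found'
    return f'{pattern!r} found at {m.span()}'

print(describe('Sam', s))
# 'Sam' found at (5, 8)

print(describe('XXX', s))
# 'XXX' not found
```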

Get all results with regex: `re.findall()`, `re.finditer()`

re.search() returns only the first match object, even if there are multiple matching occurrences in the string.

s = 'I am Sam'

print(re.search('am', s))
# <re.Match object; span=(2, 4), match='am'>

source: str_search_regex.py

re.findall() returns all matching parts as a list of strings.

print(re.findall('am', s))
# ['am', 'am']

source: str_search_regex.py

To get the positions of all matching parts, use re.finditer() along with list comprehensions.

List comprehensions in Python

print([m.span() for m in re.finditer('am', s)])
# [(2, 4), (6, 8)]

source: str_search_regex.py

In the above example, span() is used so that a list of tuples, (start position, end position), is returned. If you want to get a list of only start or end positions, use start() or end().

Note that re.finditer() returns an iterator yielding match objects over all matches.
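As a short illustration of the points above, the following sketch collects only the start or end positions, and also loops over the iterator directly instead of building a list.

```python
import re

s = 'I am Sam'

# Only the start positions
starts = [m.start() for m in re.finditer('am', s)]
print(starts)
# [2, 6]

# Only the end positions
ends = [m.end() for m in re.finditer('am', s)]
print(ends)
# [4, 8]

# The iterator can also be consumed in a for loop, one match object at a time
for m in re.finditer('am', s):
    print(m.group(), m.start(), m.end())
# am 2 4
# am 6 8
```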

Search multiple strings with regex

Even if you do not have much experience with regular expressions, it is helpful to know the | symbol.

If the regex pattern is A|B, it matches either A or B. A and B can be plain strings or can contain special characters and sequences, and you can chain three or more alternatives, as in A|B|C.

You can search for multiple strings as follows.

s = 'I am Sam Adams'

print(re.findall('Sam|Adams', s))
# ['Sam', 'Adams']

print([m.span() for m in re.finditer('Sam|Adams', s)])
# [(5, 8), (9, 14)]

source: str_search_regex.py
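If the strings to search for are stored in a list and may contain characters that have a special meaning in regex (such as + or .), one way to build the pattern is to escape each string with re.escape() and join the results with |. The words list and the sample sentence below are assumptions made for this sketch.

```python
import re

words = ['Sam', 'Adams', 'C++']  # hypothetical list of literal strings

# Escape each word so that characters like '+' are matched literally
pattern = '|'.join(re.escape(w) for w in words)
print(pattern)
# Sam|Adams|C\+\+

s = 'Sam Adams writes C++'

print(re.findall(pattern, s))
# ['Sam', 'Adams', 'C++']

print([m.span() for m in re.finditer(pattern, s)])
# [(0, 3), (4, 9), (17, 20)]
```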

Use special characters and sequences

Using special characters and sequences in regex patterns allows for more complex searches.

s = 'I am Sam Adams'

print(re.findall('am', s))
# ['am', 'am', 'am']

print(re.findall('[a-zA-Z]+am[a-z]*', s))
# ['Sam', 'Adams']

source: str_search_regex.py
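In the pattern above, [a-zA-Z]+ matches one or more letters, am matches the literal substring, and [a-z]* matches zero or more lowercase letters, so the standalone word am is excluded because nothing precedes it. As a variation, the sketch below uses \w (a word character) and \b (a word boundary) to get every whole word that contains am. Writing the pattern as a raw string (r'...') keeps the backslashes from being interpreted by Python.

```python
import re

s = 'I am Sam Adams'

# \b: word boundary, \w*: zero or more word characters
words_with_am = re.findall(r'\b\w*am\w*\b', s)
print(words_with_am)
# ['am', 'Sam', 'Adams']
```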

See the following article for basic examples of utilizing regex patterns, such as wildcard-like patterns.

Extract a substring from a string in Python (position, regex)

Case-insensitive search with regex: `re.IGNORECASE`

You can specify re.IGNORECASE as the flags argument of functions such as re.search() and re.findall() to perform a case-insensitive search.

s = 'I am Sam'

print(re.search('sam', s))
# None

print(re.search('sam', s, flags=re.IGNORECASE))
# <re.Match object; span=(5, 8), match='Sam'>

source: str_search_regex.py
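The same flag works with re.findall() and re.finditer(), and it can also be passed to re.compile() when a pattern is reused. The sample string below is made up for this sketch.

```python
import re

s = 'I am Sam. SAM I am.'

# All matches, regardless of case
matches = re.findall('sam', s, flags=re.IGNORECASE)
print(matches)
# ['Sam', 'SAM']

# Compile once with the flag and reuse the pattern object
pattern = re.compile('sam', re.IGNORECASE)
spans = [m.span() for m in pattern.finditer(s)]
print(spans)
# [(5, 8), (10, 13)]
```

re.I is a shorter alias for re.IGNORECASE.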
