Random Sampling from a List in Python: random.choice, sample, choices
In Python, you can randomly sample elements from a list using the choice(), sample(), and choices() functions from the random module. These functions also work with strings and tuples.
choice() returns a single random element, while sample() and choices() return a list of randomly selected elements. sample() performs random sampling without replacement, while choices() allows random sampling with replacement.
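As a quick side-by-side sketch of the three functions (the list name colors and the seed value are chosen only for illustration), the difference in return types can be checked like this:

```python
import random

random.seed(42)  # arbitrary seed, used only to make the run repeatable

colors = ['red', 'green', 'blue']  # illustrative data, not from the article

one = random.choice(colors)                # a single element, not a list
no_repeat = random.sample(colors, 2)       # list of 2 elements, no position reused
with_repeat = random.choices(colors, k=5)  # list of 5 elements, repeats allowed

print(one)
print(no_repeat)
print(with_repeat)
```

Note that choices() can return more elements than the list holds, while sample() cannot.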
For information on selecting elements from a list based on specific conditions, refer to the following article.
If you want to shuffle an entire list or create a list of random numbers, see the following articles.
- Shuffle a List, String, Tuple in Python: random.shuffle, sample
- Generate Random Numbers (int and float) in Python
Pick a Random Element: random.choice()
random.choice() returns a random element from a list.
import random
l = [0, 1, 2, 3, 4]
print(random.choice(l))
# 1
Tuples and strings are handled the same way as lists. When a string is provided, one character is returned.
print(random.choice(('xxx', 'yyy', 'zzz')))
# yyy
print(random.choice('abcde'))
# b
An error is raised if the list, tuple, or string is empty.
# print(random.choice([]))
# IndexError: Cannot choose from an empty sequence
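If the sequence may be empty, one possible way to avoid the exception is to catch IndexError and fall back to a default. The helper name choice_or_default below is hypothetical and not part of the random module; this is only a minimal sketch.

```python
import random


def choice_or_default(seq, default=None):
    """Return a random element of seq, or default if seq is empty.

    Hypothetical helper for illustration; not part of the random module.
    """
    try:
        return random.choice(seq)
    except IndexError:
        # random.choice() raises IndexError for an empty sequence.
        return default


print(choice_or_default([]))
# None
print(choice_or_default([], 'n/a'))
# n/a
print(choice_or_default([10, 20, 30]))  # one of 10, 20, 30
```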
Random Sample without Replacement: random.sample()
random.sample() randomly samples multiple elements from a list without replacement. The first argument is the list and the second is the number of elements to retrieve.
import random
l = [0, 1, 2, 3, 4]
print(random.sample(l, 3))
# [3, 1, 0]
print(type(random.sample(l, 3)))
# <class 'list'>
If the second argument is 1, the result is a list containing one element. If it is 0, an empty list is returned. If the value exceeds the number of elements in the list or is negative, an error is raised.
print(random.sample(l, 1))
# [1]
print(random.sample(l, 0))
# []
# print(random.sample(l, 10))
# ValueError: Sample larger than population or is negative
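When the requested size is not known in advance, one option is to clamp it to the valid range before calling sample(). The helper name sample_up_to is hypothetical, and silently returning fewer elements is an assumption about what the caller wants; raising the error may be the better choice in other situations.

```python
import random


def sample_up_to(seq, k):
    """Sample k elements without replacement, or as many as seq allows.

    Hypothetical helper for illustration; not part of the random module.
    """
    # Clamp k into the range 0..len(seq) so sample() never raises ValueError.
    k = max(0, min(k, len(seq)))
    return random.sample(seq, k)


l = [0, 1, 2, 3, 4]
print(len(sample_up_to(l, 3)))
# 3
print(len(sample_up_to(l, 10)))
# 5
```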
Even if you pass a tuple or a string as the first argument, the result is still a list.
print(random.sample(('xxx', 'yyy', 'zzz'), 2))
# ['zzz', 'xxx']
print(random.sample('abcde', 2))
# ['c', 'd']
Use tuple() or join() to convert a list into a tuple or a string, respectively.
print(tuple(random.sample(('xxx', 'yyy', 'zzz'), 2)))
# ('zzz', 'yyy')
print(''.join(random.sample('abcde', 2)))
# be
Note that sample() selects by position, not by value. If the original list or tuple contains duplicate elements, the same value may appear more than once in the result.
l_dup = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
print(random.sample(l_dup, 3))
# [2, 0, 0]
If you want to avoid duplicate values, use set() to extract the unique elements, convert the result back to a list, and then call sample().
print(set(l_dup))
# {0, 1, 2, 3}
print(random.sample(list(set(l_dup)), 3))
# [0, 2, 1]
Starting from Python 3.11, passing a set directly to sample() raises a TypeError. To use a set, convert it to a list first.
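As one way to do the conversion, sorted() returns a list and also fixes the element order. This can matter for sets of strings, whose iteration order may differ between interpreter runs because of hash randomization, so list(s) alone may not give repeatable results even with a fixed seed. The set contents below are illustrative, and the sketch assumes the elements are mutually comparable.

```python
import random

s = {'apple', 'banana', 'cherry', 'date'}  # illustrative data

# sorted() converts the set to a list with a stable, well-defined order.
population = sorted(s)

random.seed(0)
first = random.sample(population, 2)
random.seed(0)
second = random.sample(population, 2)

print(first == second)
# True
```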
Random Sample with Replacement: random.choices()
random.choices() randomly samples multiple elements from a list with replacement.
You can specify the number of elements to sample using the k argument. Since elements are chosen with replacement, k can be larger than the number of elements in the original list.
Since k is a keyword-only argument, you must pass it by name, as in k=3.
import random
l = [0, 1, 2, 3, 4]
print(random.choices(l, k=3))
# [2, 1, 0]
print(random.choices(l, k=10))
# [3, 4, 1, 4, 4, 2, 0, 4, 2, 0]
If k is omitted, it defaults to 1, and a list containing one element is returned.
print(random.choices(l))
# [1]
You can specify a relative weight for each element with the weights argument. The probability of selecting an element is its weight divided by the sum of all weights. The weights can be either integers or floats. If a weight is set to 0, the corresponding element is never selected.
print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1]))
# [0, 2, 3]
print(random.choices(l, k=3, weights=[1, 1, 0, 0, 0]))
# [0, 1, 1]
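To see how the weights translate into selection frequencies, the following sketch draws many samples and compares the observed share of each element with weight / sum(weights). The sample size of 10,000 and the seed are arbitrary choices for illustration.

```python
import random
from collections import Counter

l = [0, 1, 2, 3, 4]
weights = [1, 1, 1, 10, 1]

random.seed(0)  # arbitrary seed, only to make the run repeatable
draws = random.choices(l, weights=weights, k=10_000)
counts = Counter(draws)

total = sum(weights)
for value, w in zip(l, weights):
    expected = w / total                    # theoretical probability
    observed = counts[value] / len(draws)   # empirical frequency
    print(value, round(expected, 3), round(observed, 3))

# Elements with weight 0 never appear, no matter how many draws are made.
zero_draws = random.choices(l, weights=[1, 1, 0, 0, 0], k=1_000)
print(set(zero_draws) <= {0, 1})
# True
```

The observed frequencies only approximate the expected ones, and the gap shrinks as k grows.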
Cumulative weights can be specified with the cum_weights argument. The cum_weights in the following code is equivalent to weights=[1, 1, 1, 10, 1] in the first example above.
print(random.choices(l, k=3, cum_weights=[1, 2, 3, 13, 14]))
# [3, 2, 3]
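Cumulative weights are the running totals of the relative weights, so they can be derived with itertools.accumulate() rather than written by hand. The sketch below also checks that, with the same seed, both forms select the same elements; this relies on choices() converting relative weights to cumulative weights internally, as described in the standard library documentation.

```python
import random
from itertools import accumulate

l = [0, 1, 2, 3, 4]
weights = [1, 1, 1, 10, 1]

# Running totals of the relative weights.
cum_weights = list(accumulate(weights))
print(cum_weights)
# [1, 2, 3, 13, 14]

random.seed(0)
by_weights = random.choices(l, k=5, weights=weights)
random.seed(0)
by_cum_weights = random.choices(l, k=5, cum_weights=cum_weights)

print(by_weights == by_cum_weights)
# True
```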
By default, both weights and cum_weights are set to None, so each element is selected with the same probability.
An error is raised if the length (number of elements) of weights or cum_weights doesn't match that of the original list.
# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1, 1, 1]))
# ValueError: The number of weights does not match the population
Also, an error is raised if you specify weights and cum_weights simultaneously.
# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1], cum_weights=[1, 2, 3, 13, 14]))
# TypeError: Cannot specify both weights and cumulative weights
Like the previous functions, random.choices() works not only with lists but also with tuples and strings.
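A short sketch of that behavior, reusing the tuple and string from the earlier sections: the result is a list in both cases, and join() turns the sampled characters back into a string. Outputs are omitted because they vary from run to run.

```python
import random

picked_from_tuple = random.choices(('xxx', 'yyy', 'zzz'), k=4)
picked_from_string = random.choices('abcde', k=4)

print(picked_from_tuple)            # a list of 4 strings, repeats possible
print(picked_from_string)           # a list of 4 single characters
print(''.join(picked_from_string))  # the same characters as one string
```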
Fix the Random Seed: random.seed()
You can fix the random seed and initialize the random number generator with random.seed().
Setting the same seed ensures that the same elements are selected in the same order every time.
random.seed(0)
print(random.choice(l))
# 3
random.seed(0)
print(random.choice(l))
# 3
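random.seed() affects the module-wide generator shared by all code in the process. If that is a concern, one alternative is a separate random.Random instance with its own seed, which provides the same choice(), sample(), and choices() methods. The names rng_a and rng_b below are illustrative.

```python
import random

l = [0, 1, 2, 3, 4]

# Two independent generators created with the same seed.
rng_a = random.Random(0)
rng_b = random.Random(0)

# Identical seeds and identical call sequences give identical results.
print(rng_a.sample(l, 3) == rng_b.sample(l, 3))
# True
print(rng_a.choices(l, k=3) == rng_b.choices(l, k=3))
# True
```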