Random Sampling from a List in Python: random.choice, sample, choices

In Python, you can randomly sample elements from a list using the choice(), sample(), and choices() functions from the random module. These functions also work with strings and tuples.

choice() returns a single random element, while sample() and choices() return a list of randomly selected elements. sample() performs random sampling without replacement, while choices() allows random sampling with replacement.

Pick a Random Element: random.choice()

random.choice() returns a random element from a list.

import random

l = [0, 1, 2, 3, 4]

print(random.choice(l))
# 1

Tuples and strings are handled the same way as lists. When a string is provided, a single character is returned.

print(random.choice(('xxx', 'yyy', 'zzz')))
# yyy

print(random.choice('abcde'))
# b

An IndexError is raised if the list, tuple, or string is empty.

# print(random.choice([]))
# IndexError: Cannot choose from an empty sequence
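
If the sequence may be empty, one option is to check it before calling random.choice(). The following is a minimal sketch of that idea; choice_or_default is a hypothetical helper name used only for illustration and is not part of the random module.

```python
import random


def choice_or_default(seq, default=None):
    """Return a random element of seq, or default if seq is empty.

    Hypothetical helper for illustration; not part of the random module.
    """
    # An empty list, tuple, or string is falsy, so this avoids the IndexError.
    if not seq:
        return default
    return random.choice(seq)


print(choice_or_default([]))
# None

print(choice_or_default([], default=-1))
# -1
```

Catching IndexError with try/except would work as well; the explicit check simply keeps the fallback value visible at the call site.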

Random Sample without Replacement: random.sample()

random.sample() randomly samples multiple elements from a list without replacement. The first argument is the list, and the second is the number of elements to retrieve.

import random

l = [0, 1, 2, 3, 4]

print(random.sample(l, 3))
# [3, 1, 0]

print(type(random.sample(l, 3)))
# <class 'list'>

If the second argument is 1, the result is a list containing one element. If it is 0, an empty list is returned. If it exceeds the number of elements in the list, a ValueError is raised.

print(random.sample(l, 1))
# [1]

print(random.sample(l, 0))
# []

# print(random.sample(l, 10))
# ValueError: Sample larger than population or is negative
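
When the requested size comes from user input or another computation, it can be capped at the length of the list before sampling. This is only a sketch of one possible approach; safe_k is an illustrative variable name, and silently capping the size may not suit every use case.

```python
import random

l = [0, 1, 2, 3, 4]
k = 10

# Cap the requested size at the population size to avoid the ValueError.
# max(0, ...) also guards against a negative k.
safe_k = max(0, min(k, len(l)))

result = random.sample(l, safe_k)
print(len(result))
# 5

# Sampling every element without replacement yields a shuffled copy.
print(sorted(result))
# [0, 1, 2, 3, 4]
```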

Even if you pass a tuple or a string as the first argument, the result is still a list.

print(random.sample(('xxx', 'yyy', 'zzz'), 2))
# ['zzz', 'xxx']

print(random.sample('abcde', 2))
# ['c', 'd']

Use tuple() or join() to convert a list into a tuple or a string, respectively.

print(tuple(random.sample(('xxx', 'yyy', 'zzz'), 2)))
# ('zzz', 'yyy')

print(''.join(random.sample('abcde', 2)))
# be

Note that sample() selects by position, not by value. If the original list or tuple contains duplicate elements, the same value may appear more than once in the result.

l_dup = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

print(random.sample(l_dup, 3))
# [2, 0, 0]

To avoid duplicate values, first reduce the list or tuple to its unique elements with set(), then convert the result back to a list and pass it to sample().

print(set(l_dup))
# {0, 1, 2, 3}

print(random.sample(list(set(l_dup)), 3))
# [0, 2, 1]

Starting from Python 3.11, passing a set directly to sample() raises a TypeError. To use a set, convert it to a list first.
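
The conversion can be sketched as follows. Using sorted() instead of list() is a choice made for this example: both return a list, but sorted() gives a fixed order, which may help reproducibility when a seed is set, since the iteration order of a set of strings can differ between runs.

```python
import random

s = {0, 1, 2, 3}

# sorted() turns the set into a list with a deterministic order;
# list(s) also works if the order does not matter.
result = random.sample(sorted(s), 2)

print(len(result))
# 2

# Every sampled value comes from the original set.
print(set(result) <= s)
# True
```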

Random Sample with Replacement: random.choices()

random.choices() randomly samples multiple elements from a list with replacement.

You can specify the number of elements to sample using the k argument. Since elements are chosen with replacement, k can be larger than the number of elements in the original list.

k is a keyword-only argument, so it must be passed by name, as in k=3.

import random

l = [0, 1, 2, 3, 4]

print(random.choices(l, k=3))
# [2, 1, 0]

print(random.choices(l, k=10))
# [3, 4, 1, 4, 4, 2, 0, 4, 2, 0]

If k is omitted, it defaults to 1, and a list containing one element is returned.

print(random.choices(l))
# [1]

You can specify a relative weight for each element with the weights argument; the probability of selecting an element is proportional to its weight. The weights can be either integers or floats. If a weight is set to 0, the corresponding element is never selected.

print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1]))
# [0, 2, 3]

print(random.choices(l, k=3, weights=[1, 1, 0, 0, 0]))
# [0, 1, 1]
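
The effect of the weights is easier to see over many draws. The sketch below counts the results with collections.Counter; the sample size of 10000 and the use of a separate random.Random(0) generator are choices made for this illustration only.

```python
import random
from collections import Counter

l = [0, 1, 2, 3, 4]

# A dedicated generator with a fixed seed keeps the sketch repeatable.
rng = random.Random(0)

draws = rng.choices(l, k=10000, weights=[1, 1, 0, 0, 0])
counts = Counter(draws)

# Elements with weight 0 never appear.
print(sorted(counts))
# [0, 1]

# The two remaining elements have equal weights, so each accounts for
# roughly half of the draws, and together they account for all of them.
print(counts[0] + counts[1])
# 10000
```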

Cumulative weights can be specified with the cum_weights argument instead. The cum_weights in the following code is equivalent to weights=[1, 1, 1, 10, 1] in the first example above.

print(random.choices(l, k=3, cum_weights=[1, 2, 3, 13, 14]))
# [3, 2, 3]
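
Cumulative weights are the running totals of the relative weights, so they can be derived with itertools.accumulate(). The following sketch checks the equivalence by resetting the seed before each call; the variable names are illustrative.

```python
import itertools
import random

l = [0, 1, 2, 3, 4]
weights = [1, 1, 1, 10, 1]

# Running totals of the relative weights give the cumulative weights.
cum_weights = list(itertools.accumulate(weights))
print(cum_weights)
# [1, 2, 3, 13, 14]

# With the same seed, both forms produce the same selection.
random.seed(0)
by_weights = random.choices(l, k=3, weights=weights)

random.seed(0)
by_cum_weights = random.choices(l, k=3, cum_weights=cum_weights)

print(by_weights == by_cum_weights)
# True
```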

By default, both weights and cum_weights are set to None, so each element is selected with the same probability.

A ValueError is raised if the length (number of elements) of weights or cum_weights doesn't match that of the original list.

# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1, 1, 1]))
# ValueError: The number of weights does not match the population

A TypeError is also raised if you specify weights and cum_weights simultaneously.

# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1], cum_weights=[1, 2, 3, 13, 14]))
# TypeError: Cannot specify both weights and cumulative weights

Like the previous functions, random.choices() works not only with lists but also with tuples and strings.
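
As a brief illustration, the sketch below passes a tuple and a string to random.choices(). As with sample(), the result is a list in both cases. The actual elements vary from run to run, so only the types and lengths are printed.

```python
import random

t = ('xxx', 'yyy', 'zzz')
s = 'abcde'

picked_from_tuple = random.choices(t, k=2)
print(type(picked_from_tuple))
# <class 'list'>

# k may exceed the length of the string because sampling is with replacement.
picked_from_string = random.choices(s, k=8)
print(type(picked_from_string))
# <class 'list'>

# Join the characters if a string is needed.
print(len(''.join(picked_from_string)))
# 8
```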

Fix the Random Seed: random.seed()

You can fix the random seed and initialize the random number generator with random.seed().

Setting the same seed ensures that the same elements are selected in the same order every time.

import random

l = [0, 1, 2, 3, 4]

random.seed(0)
print(random.choice(l))
# 3

random.seed(0)
print(random.choice(l))
# 3
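
random.seed() affects the generator shared by all module-level functions. If reproducibility is needed only in one part of a program, a separate random.Random instance is an alternative; the sketch below assumes that use case, and the names rng_a and rng_b are illustrative.

```python
import random

l = [0, 1, 2, 3, 4]

# Each Random instance has its own state, independent of the module-level
# functions such as random.choice().
rng_a = random.Random(0)
rng_b = random.Random(0)

seq_a = [rng_a.choice(l) for _ in range(5)]

# Using the shared module-level generator does not disturb rng_b.
random.random()

seq_b = [rng_b.choice(l) for _ in range(5)]

print(seq_a == seq_b)
# True
```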
