note.nkmk.me

Random sampling from a list in Python (random.choice, sample, choices)

Posted: 2020-02-05 / Tags: Python, List

To get random elements from sequence objects such as lists (list), tuples (tuple), and strings (str) in Python, use choice(), sample(), or choices() of the random module.

choice() returns one random element, and sample() and choices() return a list of multiple random elements. sample() is used for random sampling without replacement, and choices() is used for random sampling with replacement.

This post covers the following topics.

  • Pick a random element: random.choice()
  • Random sampling without replacement: random.sample()
  • Random sampling with replacement: random.choices()
  • Set a seed

Shuffling a list is covered in a separate post.

Pick a random element: random.choice()

random.choice() returns one random element from the list.

import random

l = [0, 1, 2, 3, 4]

print(random.choice(l))
# 0

Tuples and strings are processed similarly. In the case of a string, one character is returned.

print(random.choice(('xxx', 'yyy', 'zzz')))
# yyy

print(random.choice('abcde'))
# b

An error is raised if the list, tuple, or string is empty.

# print(random.choice([]))
# IndexError: Cannot choose from an empty sequence
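
If the sequence may be empty, one option is to check it before calling random.choice(). The following is a minimal sketch; pick_or_default() is a hypothetical helper name used only for illustration and is not part of the random module.

```python
import random

def pick_or_default(seq, default=None):
    # Hypothetical helper: an empty list, tuple, or string is falsy,
    # so random.choice() is only called when there is something to pick.
    return random.choice(seq) if seq else default

print(pick_or_default([]))
# None

print(pick_or_default('', 'n/a'))
# n/a

print(pick_or_default([0, 1, 2]) in [0, 1, 2])
# True
```

Catching IndexError with try/except would work as well; the truthiness check is used here only because it is shorter.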

Random sampling without replacement: random.sample()

random.sample() returns multiple random elements from the list without replacement.

Pass the list as the first argument and the number of elements you want to get as the second argument. A list is returned.

import random

l = [0, 1, 2, 3, 4]

print(random.sample(l, 3))
# [1, 3, 2]

print(type(random.sample(l, 3)))
# <class 'list'>

If the second argument is set to 1, a list with one element is returned. If set to 0, an empty list is returned. If set to a value that exceeds the number of elements in the list, an error is raised.

print(random.sample(l, 1))
# [0]

print(random.sample(l, 0))
# []

# print(random.sample(l, 10))
# ValueError: Sample larger than population or is negative
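
When the requested size may exceed the length of the list, the error can be avoided by capping the size with min(). This is only a sketch of one possible approach; the variable name k is chosen here for illustration.

```python
import random

l = [0, 1, 2, 3, 4]
k = 10  # requested size, larger than len(l)

# Cap the size at the number of elements so that no ValueError is raised.
result = random.sample(l, min(k, len(l)))

print(len(result))
# 5

# All elements are returned (in random order), so sorting restores the original.
print(sorted(result))
# [0, 1, 2, 3, 4]
```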

Even if the first argument is a tuple or a string, a list is returned.

print(random.sample(('xxx', 'yyy', 'zzz'), 2))
# ['xxx', 'yyy']

print(random.sample('abcde', 2))
# ['a', 'e']

To convert the result to a tuple or a string, use tuple() or join().

print(tuple(random.sample(('xxx', 'yyy', 'zzz'), 2)))
# ('yyy', 'xxx')

print(''.join(random.sample('abcde', 2)))
# de

Random sampling with replacement: random.choices()

random.choices() returns multiple random elements from the list with replacement.

choices() was added in Python 3.6 and cannot be used in earlier versions.

Specify the number of elements you want to get with the argument k. Since elements are chosen with replacement, k can be larger than the number of elements in the original list.

Since k is a keyword-only argument, it is necessary to use keywords such as k=3.

import random

l = [0, 1, 2, 3, 4]

print(random.choices(l, k=3))
# [2, 1, 0]

print(random.choices(l, k=10))
# [3, 4, 1, 4, 4, 2, 0, 4, 2, 0]

k is set to 1 by default. If omitted, a list with one element is returned.

print(random.choices(l))
# [1]

You can specify the weight (relative probability) of each element with the weights argument. The weights are relative to one another, so they do not need to sum to 1. Each value in the weights list can be either int or float. If a weight is set to 0, the corresponding element is not selected.

print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1]))
# [0, 2, 3]

print(random.choices(l, k=3, weights=[1, 1, 0, 0, 0]))
# [0, 1, 1]
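
To check the effect of zero weights, the results of many draws can be counted. The following sketch uses collections.Counter from the standard library; the number of draws (10000) is an arbitrary value chosen for illustration.

```python
import random
from collections import Counter

l = [0, 1, 2, 3, 4]

# Draw many elements and count how often each one appears.
counts = Counter(random.choices(l, k=10000, weights=[1, 1, 0, 0, 0]))

# Only the elements with a non-zero weight appear in the result.
print(sorted(counts))
# [0, 1]

print(sum(counts.values()))
# 10000
```

The individual counts for 0 and 1 differ from run to run, but both should be close to 5000 because their weights are equal.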

Cumulative weights can be specified in the argument cum_weights. cum_weights in the following sample code is equivalent to the first weights in the above code.

print(random.choices(l, k=3, cum_weights=[1, 2, 3, 13, 14]))
# [3, 2, 3]
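
Cumulative weights are the running totals of the weights. As a small sketch, they can be computed from a list of weights with itertools.accumulate() from the standard library.

```python
import itertools

weights = [1, 1, 1, 10, 1]

# Each value is the sum of all weights up to and including that position.
cum_weights = list(itertools.accumulate(weights))

print(cum_weights)
# [1, 2, 3, 13, 14]
```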

weights and cum_weights are both set to None by default, in which case each element is selected with the same probability.

If the length (number of elements) of weights or cum_weights is different from that of the original list, an error is raised.

# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1, 1, 1]))
# ValueError: The number of weights does not match the population

An error is also raised if you specify weights and cum_weights at the same time.

# print(random.choices(l, k=3, weights=[1, 1, 1, 10, 1], cum_weights=[1, 2, 3, 13, 14]))
# TypeError: Cannot specify both weights and cumulative weights

In the sample code so far, a list was specified to the first argument, but the same applies to tuples and strings.
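
As a brief illustration with a tuple and a string (the variable names are chosen here for illustration), the result is a list in both cases, and join() can be used to get a string. The selected elements differ on every run, so no output is shown for them.

```python
import random

from_tuple = random.choices(('xxx', 'yyy', 'zzz'), k=4)
from_str = random.choices('abcde', k=4)

# A list is returned regardless of the type of the first argument.
print(type(from_tuple))
# <class 'list'>

print(type(from_str))
# <class 'list'>

# Join the selected characters into a string of length 8.
s = ''.join(random.choices('abcde', k=8))
print(len(s))
# 8
```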

Set a seed

By passing an arbitrary integer to random.seed(), you can set the seed of the random number generator. With the same seed, the same elements are selected each time the code is run.

random.seed(0)
print(random.choice(l))
# 3
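
The effect of the seed can be confirmed by resetting it to the same value and repeating the same call. This is a minimal sketch; the seed value 0 and the sample size 3 are arbitrary.

```python
import random

l = [0, 1, 2, 3, 4]

random.seed(0)
first = random.sample(l, 3)

# Resetting the same seed restarts the same sequence of random numbers.
random.seed(0)
second = random.sample(l, 3)

print(first == second)
# True
```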