Convert between pandas DataFrame/Series and Python list

Modified: 2024-01-24 | Tags: Python, pandas, List

This article explains how to convert between pandas DataFrame/Series and Python built-in lists.

Contents

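Convert lists to DataFrame and Series
- Convert lists to DataFrame and Series using pd.DataFrame() and pd.Series()
- For lists containing labels
Convert DataFrame and Series to lists
- Convert Series to a list using tolist() or to_list()
- Convert DataFrame to a list using values and tolist()
- Convert Series and DataFrame to lists including index and columns
- Convert index and columns to lists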

Although the term "convert" is used for simplicity, the process actually involves creating a new object of a different type, while the original object remains unchanged.
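
As a minimal sketch of this point (the variable names `src` and `s_from_list` are chosen here for illustration and do not appear in the original sample file), modifying the result of a conversion leaves the original object untouched in both directions:

```python
import pandas as pd

# Series -> list: the list is a new object
s = pd.Series([0, 10, 20])
l = s.tolist()
l[0] = 100

print(l)
# [100, 10, 20]

print(s.tolist())
# [0, 10, 20]

# list -> Series: the Series holds its own copy of the values
src = [0, 10, 20]
s_from_list = pd.Series(src)
src[0] = -1

print(s_from_list.tolist())
# [0, 10, 20]
```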

Conversions between DataFrame/Series and NumPy arrays (ndarray), as well as between DataFrame and Series, are covered in separate articles.

The pandas version used in this article is as follows. Note that functionality may vary between versions.

import pandas as pd

print(pd.__version__)
# 2.1.4

source: pandas_list.py

Convert lists to `DataFrame` and `Series`

Convert lists to `DataFrame` and `Series` using `pd.DataFrame()` and `pd.Series()`

Passing a list as the first argument to the pd.Series() or pd.DataFrame() constructor creates a Series or DataFrame from that list.

l_1d = [0, 10, 20]

print(pd.Series(l_1d))
# 0     0
# 1    10
# 2    20
# dtype: int64

l_2d = [[0, 10, 20], [30, 40, 50]]

print(pd.DataFrame(l_2d))
#     0   1   2
# 0   0  10  20
# 1  30  40  50

Passing a one-dimensional list directly to pd.DataFrame() creates a single-column DataFrame. Wrapping it in another list, as in [l_1d], creates a single-row DataFrame.

print(pd.DataFrame(l_1d))
#     0
# 0   0
# 1  10
# 2  20

print(pd.DataFrame([l_1d]))
#    0   1   2
# 0  0  10  20

To swap rows and columns, transpose the two-dimensional list (list of lists) with zip(*l_2d) before passing it to pd.DataFrame().

Transpose 2D list in Python (swap rows and columns)

print(pd.DataFrame(zip(*l_2d)))
#     0   1
# 0   0  30
# 1  10  40
# 2  20  50
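
As an alternative sketch, the transposition can also be done after the DataFrame is created, using the `.T` attribute. The names `df_zip` and `df_t` are illustrative only. For this all-integer example both routes should give the same values.

```python
import pandas as pd

l_2d = [[0, 10, 20], [30, 40, 50]]

df_zip = pd.DataFrame(zip(*l_2d))  # transpose the list first
df_t = pd.DataFrame(l_2d).T  # or transpose the DataFrame afterwards

print(df_t)
#     0   1
# 0   0  30
# 1  10  40
# 2  20  50

print(df_zip.values.tolist() == df_t.values.tolist())
# True
```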

Specify row and column names: `index`, `columns`

Row names can be specified with the index argument, and column names with the columns argument.

print(pd.Series(l_1d, index=['X', 'Y', 'Z']))
# X     0
# Y    10
# Z    20
# dtype: int64

print(pd.DataFrame(l_2d, index=['X', 'Y'], columns=['A', 'B', 'C']))
#     A   B   C
# X   0  10  20
# Y  30  40  50

It is also possible to set or change the index and columns after creating a Series or a DataFrame.

pandas: Rename column/index names of DataFrame
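
One simple way to do this, sketched here, is to assign new labels to the `index` and `columns` attributes of an existing DataFrame. The number of labels must match the number of rows or columns.

```python
import pandas as pd

df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])

# Assign new labels to the index and columns attributes
df.index = ['X', 'Y']
df.columns = ['A', 'B', 'C']

print(df)
#     A   B   C
# X   0  10  20
# Y  30  40  50
```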

Specify data type: `dtype`

The data type (dtype) of each column in a DataFrame, as well as that of a Series, is automatically determined based on the values in the list.

For example, if a column contains a mix of integers (int) and floating-point numbers (float), the data type of the column becomes float, and if it contains a mix of numbers and strings, the data type becomes object.

l_2d_multi = [[0, 0.0, 'abc', 123, 'abc'], [10, 0.1, 'xyz', 1.23, 100]]

print(pd.DataFrame(l_2d_multi))
#     0    1    2       3    4
# 0   0  0.0  abc  123.00  abc
# 1  10  0.1  xyz    1.23  100

print(pd.DataFrame(l_2d_multi).dtypes)
# 0      int64
# 1    float64
# 2     object
# 3    float64
# 4     object
# dtype: object

It is also possible to specify the data type using the dtype argument of pd.DataFrame() or pd.Series().

print(pd.DataFrame(l_2d, dtype=float))
#       0     1     2
# 0   0.0  10.0  20.0
# 1  30.0  40.0  50.0

For more details on data types (dtype) in pandas, refer to the following article.

pandas: How to use astype() to cast dtype of DataFrame
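
The same `dtype` argument works for pd.Series(), and the type can also be changed after creation with astype(). The following is a small sketch. The names `s_float` and `s_str` are illustrative.

```python
import pandas as pd

l_1d = [0, 10, 20]

# Specify the data type when creating the Series
s_float = pd.Series(l_1d, dtype=float)
print(s_float)
# 0     0.0
# 1    10.0
# 2    20.0
# dtype: float64

# Or cast after creation with astype()
s_str = pd.Series(l_1d).astype(str)
print(s_str.tolist())
# ['0', '10', '20']
```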

For lists containing labels

To create a Series from a list of label-value pairs, first decompose the list into labels and values, and then pass these to pd.Series().

Transpose 2D list in Python (swap rows and columns)

l_1d_index = [['X', 0], ['Y', 1], ['Z', 2]]

index, values = zip(*l_1d_index)
print(index)
# ('X', 'Y', 'Z')

print(values)
# (0, 1, 2)

print(pd.Series(values, index=index))
# X    0
# Y    1
# Z    2
# dtype: int64
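
If every label is unique and hashable, a shorter route is to turn the pairs into a dictionary first, since pd.Series() uses dictionary keys as the index. This is a sketch under that assumption. Duplicate labels would be collapsed by dict(), so the zip() approach is safer when labels can repeat.

```python
import pandas as pd

l_1d_index = [['X', 0], ['Y', 1], ['Z', 2]]

# dict() accepts an iterable of two-element pairs
s = pd.Series(dict(l_1d_index))
print(s)
# X    0
# Y    1
# Z    2
# dtype: int64
```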

To create a DataFrame from a list that includes labels and multiple values, first load the entire list into the DataFrame, and then set the index using the set_index() method.

pandas: Assign existing column to the DataFrame index with set_index()

l_2d_index = [['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

df_index = pd.DataFrame(l_2d_index, columns=['idx', 'A', 'B'])
print(df_index)
#   idx  A    B
# 0   X  0  0.0
# 1   Y  1  0.1
# 2   Z  2  0.2

print(df_index.set_index('idx'))
#      A    B
# idx        
# X    0  0.0
# Y    1  0.1
# Z    2  0.2
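
Another option, sketched below, is to split each row into its label and its values with list comprehensions and pass them to the `index` argument and the first argument separately. The names `labels` and `data` are illustrative. Unlike the set_index() approach, the resulting index has no name.

```python
import pandas as pd

l_2d_index = [['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

labels = [row[0] for row in l_2d_index]  # first element of each row
data = [row[1:] for row in l_2d_index]  # remaining elements

df = pd.DataFrame(data, index=labels, columns=['A', 'B'])
print(df)
#    A    B
# X  0  0.0
# Y  1  0.1
# Z  2  0.2
```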

If the original list also includes column names, use the first row for the columns argument and the rest of the rows (obtained by slicing) as the first argument.

How to slice a list, string, tuple in Python

l_2d_index_columns = [['idx', 'A', 'B'], ['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

df_index_columns = pd.DataFrame(l_2d_index_columns[1:], columns=l_2d_index_columns[0])
print(df_index_columns)
#   idx  A    B
# 0   X  0  0.0
# 1   Y  1  0.1
# 2   Z  2  0.2

print(df_index_columns.set_index('idx'))
#      A    B
# idx        
# X    0  0.0
# Y    1  0.1
# Z    2  0.2
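
If this pattern comes up repeatedly, the two steps can be wrapped in a small function. `frame_from_rows` is a hypothetical helper written for this illustration, not part of pandas. It assumes the first row is the header and the first column holds the row labels.

```python
import pandas as pd


def frame_from_rows(rows):
    """Hypothetical helper: first row is the header, first column the labels."""
    header, *body = rows
    df = pd.DataFrame(body, columns=header)
    # Use the first header entry as the name of the index column
    return df.set_index(header[0])


l_2d_index_columns = [['idx', 'A', 'B'], ['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

df = frame_from_rows(l_2d_index_columns)

print(df.index.tolist())
# ['X', 'Y', 'Z']

print(df.columns.tolist())
# ['A', 'B']
```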

Convert `DataFrame` and `Series` to lists

Convert `Series` to a list using `tolist()` or `to_list()`

A Series can be converted to a list using the tolist() or to_list() method. The two methods return the same result.

s = pd.Series([0, 10, 20])
print(s)
# 0     0
# 1    10
# 2    20
# dtype: int64

print(s.tolist())
# [0, 10, 20]

print(s.to_list())
# [0, 10, 20]
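
One practical reason to use tolist(), sketched below, is that the elements of the resulting list are Python built-in scalars rather than NumPy scalars, so the list can be passed to code that expects plain Python types. The json.dumps() call is only an illustration of such code.

```python
import json

import numpy as np
import pandas as pd

s = pd.Series([0, 10, 20])

# Elements of the list are built-in int objects
print(type(s.tolist()[0]))
# <class 'int'>

# Elements of the underlying NumPy array are NumPy integers
print(isinstance(s.to_numpy()[0], np.integer))
# True

# A list of built-in ints can be serialized directly
print(json.dumps(s.tolist()))
# [0, 10, 20]
```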

Convert `DataFrame` to a list using `values` and `tolist()`

As of pandas version 2.1.4, DataFrame does not have the tolist() or to_list() methods. To convert a DataFrame to a list, first convert it into a NumPy array (ndarray) using the values attribute, and then use the tolist() method of ndarray.

df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])
print(df)
#     0   1   2
# 0   0  10  20
# 1  30  40  50

print(df.values.tolist())
# [[0, 10, 20], [30, 40, 50]]
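
The pandas documentation recommends the to_numpy() method over the values attribute, and it can be used in the same way. The sketch below also shows how to get one list per column instead of one list per row by transposing first.

```python
import pandas as pd

df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])

# Same result as df.values.tolist()
print(df.to_numpy().tolist())
# [[0, 10, 20], [30, 40, 50]]

# One list per column instead of one list per row
print(df.T.to_numpy().tolist())
# [[0, 30], [10, 40], [20, 50]]
```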

Convert `Series` and `DataFrame` to lists including `index` and `columns`

To keep the index as part of the list, use the reset_index() method to reset the index and turn it into a data column.

pandas: Reset index of DataFrame, Series with reset_index()

s_index = pd.Series([0, 1, 2], index=['X', 'Y', 'Z'])
print(s_index)
# X    0
# Y    1
# Z    2
# dtype: int64

print(s_index.reset_index())
#   index  0
# 0     X  0
# 1     Y  1
# 2     Z  2

print(s_index.reset_index().values.tolist())
# [['X', 0], ['Y', 1], ['Z', 2]]

df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])
print(df_index)
#    X  Y  Z
# A  0  1  2
# B  3  4  5

print(df_index.reset_index())
#   index  X  Y  Z
# 0     A  0  1  2
# 1     B  3  4  5

print(df_index.reset_index().values.tolist())
# [['A', 0, 1, 2], ['B', 3, 4, 5]]
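
As an alternative sketch that does not create an intermediate DataFrame, the labels and the values can be paired with zip() in a list comprehension. The names `pairs` and `rows` are illustrative.

```python
import pandas as pd

s_index = pd.Series([0, 1, 2], index=['X', 'Y', 'Z'])
df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])

# Series: pair each label with its value
pairs = [[label, value] for label, value in zip(s_index.index, s_index.tolist())]
print(pairs)
# [['X', 0], ['Y', 1], ['Z', 2]]

# DataFrame: prepend each row label to its row of values
rows = [[label, *values] for label, values in zip(df_index.index, df_index.values.tolist())]
print(rows)
# [['A', 0, 1, 2], ['B', 3, 4, 5]]
```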

As of version 2.1.4, DataFrame has no counterpart to reset_index() that moves the column names into the data. To include both index and columns in the list, apply reset_index(), transpose with .T, apply reset_index() again, and transpose back with .T. A more efficient method may exist.

pandas: Transpose DataFrame (swap rows and columns)

print(df_index.reset_index().T.reset_index().T.values.tolist())
# [['index', 'X', 'Y', 'Z'], ['A', 0, 1, 2], ['B', 3, 4, 5]]
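
One possible simpler approach, sketched here, is to build the header row from the column names and concatenate it with the data rows as ordinary Python lists. The name `df_reset` is illustrative.

```python
import pandas as pd

df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])

df_reset = df_index.reset_index()

# Header row (column names) followed by the data rows
l_with_labels = [df_reset.columns.tolist()] + df_reset.values.tolist()
print(l_with_labels)
# [['index', 'X', 'Y', 'Z'], ['A', 0, 1, 2], ['B', 3, 4, 5]]
```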

Convert `index` and `columns` to lists

The index attribute of Series, as well as the index and columns attributes of DataFrame, are all of type Index. They can be converted to lists using the tolist() or to_list() methods.

s_index = pd.Series([0, 1, 2], index=['X', 'Y', 'Z'])
print(s_index)
# X    0
# Y    1
# Z    2
# dtype: int64

print(s_index.index)
# Index(['X', 'Y', 'Z'], dtype='object')

print(s_index.index.tolist())
# ['X', 'Y', 'Z']

df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])
print(df_index)
#    X  Y  Z
# A  0  1  2
# B  3  4  5

print(df_index.index)
# Index(['A', 'B'], dtype='object')

print(df_index.index.tolist())
# ['A', 'B']

print(df_index.columns)
# Index(['X', 'Y', 'Z'], dtype='object')

print(df_index.columns.tolist())
# ['X', 'Y', 'Z']

Note that an Index can be iterated over directly in a for loop, and its elements can be retrieved by position with []. Slicing is also supported, but elements of an Index cannot be modified in place. Conversion to a list is therefore unnecessary if you only need to read elements.

for i in df_index.columns:
    print(i, type(i))
# X <class 'str'>
# Y <class 'str'>
# Z <class 'str'>

print(df_index.columns[0])
# X

print(df_index.columns[:2])
# Index(['X', 'Y'], dtype='object')

# df_index.columns[0] = 'x'
# TypeError: Index does not support mutable operations
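
When an element does need to change, one workaround, sketched below, is to convert the Index to a list, modify the list, and assign it back to the `columns` attribute. The name `cols` is illustrative.

```python
import pandas as pd

df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])

cols = df_index.columns.tolist()  # a mutable copy of the column names
cols[0] = 'x'
df_index.columns = cols  # replace the whole Index at once

print(df_index.columns.tolist())
# ['x', 'Y', 'Z']
```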
