pandas: Reset index of DataFrame/Series with reset_index()

In pandas, the reset_index() method resets the index of a DataFrame or Series to the default sequential integers starting from 0.

The pandas version used in this article is as follows. Note that functionality may vary between versions.

import pandas as pd

print(pd.__version__)
# 2.1.4

How to use reset_index()

Consider the following DataFrame.

df = pd.read_csv('data/src/sample_pandas_normal.csv').sort_values('state')
print(df)
#       name  age state  point
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70

The rows are sorted with sort_values() so that the index is no longer in sequential order.

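The example uses a DataFrame, but reset_index() is also available on Series. The drop, inplace, and level arguments work the same way for both. Note that without drop=True, calling reset_index() on a Series returns a DataFrame, because the original index becomes a column.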
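
As a minimal sketch of reset_index() on a Series, the following uses a small hand-built Series (hypothetical data that mirrors the point column of the sample, not the article's CSV). It also shows the name argument, which only the Series version accepts, for labeling the value column.

```python
import pandas as pd

# Hypothetical Series mirroring the 'point' column of the sample data.
s = pd.Series([92, 70, 88], index=['Bob', 'Charlie', 'Ellen'], name='point')

# Without drop=True, the old index becomes a column, so the result is a DataFrame.
df_from_s = s.reset_index()
print(type(df_from_s).__name__)
# DataFrame
print(df_from_s.columns.tolist())
# ['index', 'point']

# name= sets the label of the column that holds the Series values.
df_named = s.reset_index(name='score')
print(df_named.columns.tolist())
# ['index', 'score']

# With drop=True, the result is still a Series with a new sequential index.
s_reset = s.reset_index(drop=True)
print(type(s_reset).__name__)
# Series
print(s_reset.index.tolist())
# [0, 1, 2]
```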

Basic usage

By default, reset_index() adds the original index as a new data column and creates a new sequential index. If the original index has no name, the new column is named index.

print(df.reset_index())
#    index     name  age state  point
# 0      1      Bob   42    CA     92
# 1      2  Charlie   18    CA     70
# 2      4    Ellen   24    CA     88
# 3      0    Alice   24    NY     64
# 4      5    Frank   30    NY     57
# 5      3     Dave   68    TX     70
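
The following sketch checks the default behavior on a small hand-built DataFrame (hypothetical data, not the article's CSV). It also uses the names argument to choose a different label for the column that receives the old index; this assumes pandas 1.5 or later, where that argument is available, and the label original_row is only an illustration.

```python
import pandas as pd

# Hypothetical data; sorting leaves the index labels out of order.
df = pd.DataFrame(
    {'name': ['Alice', 'Bob', 'Charlie'], 'state': ['NY', 'CA', 'CA']}
).sort_values('state')

df_reset = df.reset_index()

# The unnamed index is stored in a column called 'index'.
print(df_reset.columns.tolist())
# ['index', 'name', 'state']

# The new index is the default RangeIndex.
print(df_reset.index)
# RangeIndex(start=0, stop=3, step=1)

# names= picks a different label for the column holding the old index.
df_renamed = df.reset_index(names='original_row')
print(df_renamed.columns.tolist())
# ['original_row', 'name', 'state']
```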

Remove the original index: drop

Setting drop=True discards the original index instead of adding it as a data column.

print(df.reset_index(drop=True))
#       name  age state  point
# 0      Bob   42    CA     92
# 1  Charlie   18    CA     70
# 2    Ellen   24    CA     88
# 3    Alice   24    NY     64
# 4    Frank   30    NY     57
# 5     Dave   68    TX     70
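
A typical use of drop=True is renumbering rows after filtering, where the surviving labels have gaps. The sketch below assumes a small hand-built DataFrame (hypothetical data) and shows that label-based access with loc lines up with row positions again after the reset.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {'name': ['Alice', 'Bob', 'Charlie', 'Dave'], 'age': [24, 42, 18, 68]}
)

# Filtering keeps the original labels, so the index has a gap.
adults = df[df['age'] >= 24]
print(adults.index.tolist())
# [0, 1, 3]

# drop=True renumbers the rows without keeping the old labels as a column.
adults = adults.reset_index(drop=True)
print(adults.index.tolist())
# [0, 1, 2]
print(adults.columns.tolist())
# ['name', 'age']

# The label 2 now refers to the third remaining row.
print(adults.loc[2, 'name'])
# Dave
```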

Change original object: inplace

By default, reset_index() returns a new object without changing the original. Setting inplace=True modifies the original object; in that case, the method returns None.

df.reset_index(inplace=True, drop=True)
print(df)
#       name  age state  point
# 0      Bob   42    CA     92
# 1  Charlie   18    CA     70
# 2    Ellen   24    CA     88
# 3    Alice   24    NY     64
# 4    Frank   30    NY     57
# 5     Dave   68    TX     70
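
Because the method returns None when inplace=True, assigning its result back to a variable would lose the data. The sketch below, on hypothetical data, contrasts the in-place call with the equivalent reassignment.

```python
import pandas as pd

# Hypothetical data with a non-sequential index.
df = pd.DataFrame({'point': [92, 64]}, index=[1, 0])

# With inplace=True the object is modified and the return value is None.
result = df.reset_index(drop=True, inplace=True)
print(result)
# None
print(df.index.tolist())
# [0, 1]

# The same effect without inplace: reassign the returned object.
df2 = pd.DataFrame({'point': [92, 64]}, index=[1, 0])
df2 = df2.reset_index(drop=True)
print(df2.index.tolist())
# [0, 1]
```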

Reset the index to another column

Consider the following DataFrame, where the name column is used as the index instead of sequential numbers.

df_name = pd.read_csv('data/src/sample_pandas_normal.csv', index_col=0)
print(df_name)
#          age state  point
# name                     
# Alice     24    NY     64
# Bob       42    CA     92
# Charlie   18    CA     70
# Dave      68    TX     70
# Ellen     24    CA     88
# Frank     30    NY     57

Using reset_index() sets sequential numbers as the index and adds the original index as a data column. Because the index is named name, the new column takes that name.

print(df_name.reset_index())
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57

To change the index to a different column, use set_index().

Calling set_index() directly on this DataFrame discards the current index, so the name values are lost.

print(df_name.set_index('state'))
#        age  point
# state            
# NY      24     64
# CA      42     92
# CA      18     70
# TX      68     70
# CA      24     88
# NY      30     57

To keep the original index as a data column, use reset_index() followed by set_index().

print(df_name.reset_index().set_index('state'))
#           name  age  point
# state                     
# NY       Alice   24     64
# CA         Bob   42     92
# CA     Charlie   18     70
# TX        Dave   68     70
# CA       Ellen   24     88
# NY       Frank   30     57

For the sake of illustration, this example sets a column with duplicate values as the index, but it is generally easier to select data when index values are unique.
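
To illustrate why unique index values are easier to work with, the sketch below rebuilds a three-row version of the data by hand (hypothetical data, not the article's CSV) and compares selection by a duplicated label with selection by a unique one.

```python
import pandas as pd

# Hypothetical subset of the sample data, with 'name' as the index.
df_name = pd.DataFrame(
    {'age': [24, 42, 18], 'state': ['NY', 'CA', 'CA']},
    index=pd.Index(['Alice', 'Bob', 'Charlie'], name='name'),
)

# Keep 'name' as a column, then use 'state' as the index.
df_state = df_name.reset_index().set_index('state')

# 'state' contains duplicates, so one label can match several rows.
print(df_state.index.is_unique)
# False
print(df_state.loc['CA', 'name'].tolist())
# ['Bob', 'Charlie']

# With a unique index, a label identifies exactly one row.
print(df_name.index.is_unique)
# True
print(df_name.loc['Bob', 'age'])
# 42
```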

reset_index() with MultiIndex

In pandas, you can set a MultiIndex, which enables hierarchical, multi-level indexing.

df_mi = pd.read_csv('data/src/sample_pandas_normal.csv', index_col=(2, 0))
print(df_mi)
#                age  point
# state name               
# NY    Alice     24     64
# CA    Bob       42     92
#       Charlie   18     70
# TX    Dave      68     70
# CA    Ellen     24     88
# NY    Frank     30     57

By default, reset_index() resets all levels of a MultiIndex.

print(df_mi.reset_index())
#   state     name  age  point
# 0    NY    Alice   24     64
# 1    CA      Bob   42     92
# 2    CA  Charlie   18     70
# 3    TX     Dave   68     70
# 4    CA    Ellen   24     88
# 5    NY    Frank   30     57

You can specify which level to reset with the level argument, either by name or by position. The remaining levels stay as the index.

print(df_mi.reset_index(level='state'))
#         state  age  point
# name                     
# Alice      NY   24     64
# Bob        CA   42     92
# Charlie    CA   18     70
# Dave       TX   68     70
# Ellen      CA   24     88
# Frank      NY   30     57
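
The level argument also accepts an integer position or a list of levels, and it can be combined with drop=True. The following sketch assumes a small hand-built MultiIndex DataFrame (hypothetical data) rather than the article's CSV.

```python
import pandas as pd

# Hypothetical data with a two-level index (state, name).
df_mi = pd.DataFrame(
    {
        'state': ['NY', 'CA', 'CA'],
        'name': ['Alice', 'Bob', 'Charlie'],
        'age': [24, 42, 18],
    }
).set_index(['state', 'name'])

# A level can be given by position: 0 is the outermost level ('state').
by_position = df_mi.reset_index(level=0)
print(by_position.index.name, by_position.columns.tolist())
# name ['state', 'age']

# A list resets several levels at once.
by_list = df_mi.reset_index(level=['state', 'name'])
print(by_list.columns.tolist())
# ['state', 'name', 'age']

# drop=True discards only the specified level instead of keeping it as a column.
dropped = df_mi.reset_index(level='state', drop=True)
print(dropped.index.name, dropped.columns.tolist())
# name ['age']
```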

Additionally, reset_index() has the col_level and col_fill arguments, which control where the index labels are inserted when the columns are a MultiIndex. For details, refer to the official documentation.
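
As a rough sketch of those multi-level column arguments, the example below builds a DataFrame with two-level columns (hypothetical data; the labels score, math, english, and id are made up for illustration). col_level selects the column level that receives the index name, and col_fill supplies the label for the other level.

```python
import pandas as pd

# Hypothetical DataFrame with two-level columns and a named index.
columns = pd.MultiIndex.from_tuples([('score', 'math'), ('score', 'english')])
df_mc = pd.DataFrame(
    [[80, 70], [60, 90]],
    index=pd.Index(['Alice', 'Bob'], name='name'),
    columns=columns,
)

# Default: the index name goes into the top level (col_level=0),
# and the other level is filled with an empty string (col_fill='').
default_cols = df_mc.reset_index().columns.tolist()
print(default_cols)
# [('name', ''), ('score', 'math'), ('score', 'english')]

# col_level=1 places the index name in the second level;
# col_fill names the top level above it.
custom_cols = df_mc.reset_index(col_level=1, col_fill='id').columns.tolist()
print(custom_cols)
# [('id', 'name'), ('score', 'math'), ('score', 'english')]
```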
