note.nkmk.me

pandas: Rename columns / index names (labels) of DataFrame

Posted: 2019-07-12 / Modified: 2021-04-06 / Tags: Python, pandas

You can rename (change) column / index names (labels) of pandas.DataFrame by using rename(), add_prefix(), add_suffix(), set_axis() or updating the columns / index attributes.

The same methods can be used to rename the label (index) of pandas.Series.

This article covers the following topics with sample code.

  • Rename column / index name (label): rename()
    • Change multiple names (labels)
    • Update the original object: inplace
    • Rename with functions or lambda expressions
  • Add prefix / suffix to columns: add_prefix(), add_suffix()
  • Rename all names (labels)
    • set_axis()
    • Update the columns / index attributes of pandas.DataFrame
  • For pandas.Series
    • rename()
    • add_prefix(), add_suffix()
    • set_axis()
    • Update the index attributes of pandas.Series

The set_index() method, which sets an existing column as the index, is also available.

As an example, create pandas.DataFrame as follows:

import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

Rename column / index name (label): rename()

You can use the rename() method of pandas.DataFrame to change column / index names individually.

Pass a dict of the form {original name: new name} to the columns / index argument of rename().

columns is for column names and index is for index names. If you want to change only one of them, specify just columns or just index.

A new DataFrame is returned; the original DataFrame is not changed.

df_new = df.rename(columns={'A': 'Col_1'}, index={'ONE': 'Row_1'})
print(df_new)
#        Col_1   B   C
# Row_1     11  12  13
# TWO       21  22  23
# THREE     31  32  33

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
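
As a small sketch of specifying only one of the two arguments, the example below renames a single index label and leaves the columns alone. It also shows the alternative form in which the dict is passed as the first argument together with axis. The variable names df_index_only and df_cols_only are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# Rename only an index label; the column names are left as they are.
df_index_only = df.rename(index={'ONE': 'Row_1'})
print(df_index_only.index.tolist())
# ['Row_1', 'TWO', 'THREE']
print(df_index_only.columns.tolist())
# ['A', 'B', 'C']

# Alternative form: pass the dict first and choose the axis to apply it to.
df_cols_only = df.rename({'A': 'Col_1'}, axis='columns')
print(df_cols_only.columns.tolist())
# ['Col_1', 'B', 'C']
```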

Change multiple names (labels)

You can change multiple names at once by adding more items to the dict.

print(df.rename(columns={'A': 'Col_1', 'C': 'Col_3'}))
#        Col_1   B  Col_3
# ONE       11  12     13
# TWO       21  22     23
# THREE     31  32     33
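
One point worth checking when the dict grows is what happens to keys that do not match any existing label. The sketch below assumes a made-up key 'Z' that is not a column of df: by default it is skipped without any message, while errors='raise' reports it as a KeyError.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# 'Z' is not a column name, so that entry is silently skipped by default.
df_ignored = df.rename(columns={'A': 'Col_1', 'Z': 'Col_26'})
print(df_ignored.columns.tolist())
# ['Col_1', 'B', 'C']

# With errors='raise', the unmatched key raises KeyError instead.
try:
    df.rename(columns={'A': 'Col_1', 'Z': 'Col_26'}, errors='raise')
    raised = False
except KeyError:
    raised = True
print(raised)
# True
```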

Update the original object: inplace

By default the original DataFrame is not changed, and a new DataFrame is returned.

Setting the parameter inplace to True changes the original DataFrame. In this case, no new DataFrame is returned, and the return value is None.

df_copy = df.copy()
df_copy.rename(columns={'A': 'Col_1'}, index={'ONE': 'Row_1'}, inplace=True)
print(df_copy)
#        Col_1   B   C
# Row_1     11  12  13
# TWO       21  22  23
# THREE     31  32  33
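
Because the return value is None when inplace=True, assigning the result of such a call to a variable is a common mistake. A minimal sketch (the name result is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

df_copy = df.copy()

# With inplace=True the object itself is modified and None is returned.
result = df_copy.rename(columns={'A': 'Col_1'}, inplace=True)
print(result)
# None
print(df_copy.columns.tolist())
# ['Col_1', 'B', 'C']
```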

Rename with functions or lambda expressions

Functions (callable objects) can also be specified for the index and columns parameters of the rename() method.

Applying a function to convert upper and lower case:

print(df.rename(columns=str.lower, index=str.title))
#         a   b   c
# One    11  12  13
# Two    21  22  23
# Three  31  32  33

It is also possible to apply lambda expressions.

print(df.rename(columns=lambda s: s*3, index=lambda s: s + '!!'))
#          AAA  BBB  CCC
# ONE!!     11   12   13
# TWO!!     21   22   23
# THREE!!   31   32   33

Add prefix / suffix to columns: add_prefix(), add_suffix()

The add_prefix() and add_suffix() methods add a prefix or suffix to column names.

The string specified as the argument is added to the beginning or the end of each column name.

print(df.add_prefix('X_'))
#        X_A  X_B  X_C
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

print(df.add_suffix('_X'))
#        A_X  B_X  C_X
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

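By default, add_prefix() and add_suffix() rename only the columns. If you want to add prefixes or suffixes to the index, pass a lambda expression to the index argument of the rename() method as described above. (In pandas 2.0 and later, add_prefix() and add_suffix() also accept an axis parameter.)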
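
The rename() approach for the index can be sketched as follows. The prefix 'X_' and the variable name df_index_prefixed are chosen here to mirror the examples above.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# Prepend 'X_' to every index label; the columns are not touched.
df_index_prefixed = df.rename(index=lambda label: 'X_' + label)
print(df_index_prefixed)
#           A   B   C
# X_ONE    11  12  13
# X_TWO    21  22  23
# X_THREE  31  32  33
```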

Also, add_prefix() and add_suffix() do not have an inplace parameter. If you want to update the original object, overwrite it, as in df = df.add_prefix().
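
A short sketch of the overwrite pattern, using a copy so that the sample df stays unchanged:

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

df_copy = df.copy()

# Assign the returned DataFrame back to the same variable.
df_copy = df_copy.add_prefix('X_')
print(df_copy.columns.tolist())
# ['X_A', 'X_B', 'X_C']
```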


Rename all names (labels)

To change all names, use the set_axis() method or update columns / index attributes.

set_axis()

You can change all column / index names with the set_axis() method of pandas.DataFrame.

Specify new column / index names as the first parameter labels in a list-like object such as list or tuple.

Setting the parameter axis to 0 or 'index' updates the index, and setting it to 1 or 'columns' updates the columns. If omitted, the index is updated.

print(df.set_axis(['Row_1', 'Row_2', 'Row_3'], axis=0))
#         A   B   C
# Row_1  11  12  13
# Row_2  21  22  23
# Row_3  31  32  33

print(df.set_axis(['Row_1', 'Row_2', 'Row_3'], axis='index'))
#         A   B   C
# Row_1  11  12  13
# Row_2  21  22  23
# Row_3  31  32  33

print(df.set_axis(['Col_1', 'Col_2', 'Col_3'], axis=1))
#        Col_1  Col_2  Col_3
# ONE       11     12     13
# TWO       21     22     23
# THREE     31     32     33

print(df.set_axis(['Col_1', 'Col_2', 'Col_3'], axis='columns'))
#        Col_1  Col_2  Col_3
# ONE       11     12     13
# TWO       21     22     23
# THREE     31     32     33

print(df.set_axis(['Row_1', 'Row_2', 'Row_3']))
#         A   B   C
# Row_1  11  12  13
# Row_2  21  22  23
# Row_3  31  32  33

Note that an error is raised if the size (number of elements) of the list specified in the first parameter does not match the number of rows or columns being renamed.

# print(df.set_axis(['Row_1', 'Row_2', 'Row_3', 'Row_4']))
# ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
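
If the new labels come from external input, the mismatch can be caught rather than left to stop the program. This is only a sketch; the variable message is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# Four labels for three rows: set_axis() raises ValueError.
try:
    df.set_axis(['Row_1', 'Row_2', 'Row_3', 'Row_4'], axis=0)
    message = None
except ValueError as e:
    message = str(e)

print(message)
# Length mismatch: Expected axis has 3 elements, new values have 4 elements
```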

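By default the original DataFrame is not changed, and a new DataFrame is returned. Setting the parameter inplace to True changes the original DataFrame, but only in pandas versions before 2.0: the parameter was deprecated in pandas 1.5 and later removed from set_axis().

df_copy = df.copy()
# pandas < 2.0 only: inplace was removed from set_axis() in pandas 2.0
df_copy.set_axis(['Row_1', 'Row_2', 'Row_3'], inplace=True)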
print(df_copy)
#         A   B   C
# Row_1  11  12  13
# Row_2  21  22  23
# Row_3  31  32  33
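
Where inplace is unavailable, assigning the result back to the same variable gives the same effect. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

df_copy = df.copy()

# Reassign instead of relying on inplace=True.
df_copy = df_copy.set_axis(['Row_1', 'Row_2', 'Row_3'], axis=0)
print(df_copy.index.tolist())
# ['Row_1', 'Row_2', 'Row_3']
```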

Update the columns / index attributes of pandas.DataFrame

You can also directly update the columns and index attributes of pandas.DataFrame.

Lists and tuples can be assigned to the columns and index attributes.

df.index = ['Row_1', 'Row_2', 'Row_3']
df.columns = ['Col_1', 'Col_2', 'Col_3']
print(df)
#        Col_1  Col_2  Col_3
# Row_1     11     12     13
# Row_2     21     22     23
# Row_3     31     32     33

Note that an error is raised if the size (number of elements) of the list does not match the number of rows or columns being renamed.

# df.index = ['Row_1', 'Row_2', 'Row_3', 'Row_4']
# ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
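
Since the attributes accept any list or tuple of the right length, the new labels can also be built from the existing ones. The sketch below assigns a tuple to columns and a list comprehension over the current labels to index; it starts again from the original sample data.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# A tuple works as well as a list.
df.columns = ('Col_1', 'Col_2', 'Col_3')

# Build the new index labels from the existing ones.
df.index = [label.title() for label in df.index]

print(df.columns.tolist())
# ['Col_1', 'Col_2', 'Col_3']
print(df.index.tolist())
# ['One', 'Two', 'Three']
```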

For pandas.Series

You can change the labels (index) of pandas.Series in the same way as in the pandas.DataFrame examples above.

As an example, create pandas.Series as follows:

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])
print(s)
# ONE      1
# TWO      2
# THREE    3
# dtype: int64

pandas.Series.rename()

print(s.rename({'ONE': 'a', 'THREE': 'c'}))
# a      1
# TWO    2
# c      3
# dtype: int64

print(s.rename(str.lower))
# one      1
# two      2
# three    3
# dtype: int64
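
One difference from pandas.DataFrame is worth keeping in mind: when a scalar such as a plain string is passed to rename() of pandas.Series, it sets the name attribute of the Series and leaves the index labels unchanged. A brief sketch, with 'count' as an arbitrary example name:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])

# A scalar argument changes Series.name, not the index labels.
s_named = s.rename('count')
print(s_named.name)
# count
print(s_named.index.tolist())
# ['ONE', 'TWO', 'THREE']
```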

pandas.Series.add_prefix(), pandas.Series.add_suffix()

print(s.add_prefix('X_'))
# X_ONE      1
# X_TWO      2
# X_THREE    3
# dtype: int64

print(s.add_suffix('_X'))
# ONE_X      1
# TWO_X      2
# THREE_X    3
# dtype: int64

pandas.Series.set_axis()

print(s.set_axis(['a', 'b', 'c']))
# a    1
# b    2
# c    3
# dtype: int64

Update the index attributes of pandas.Series

s.index = ['a', 'b', 'c']
print(s)
# a    1
# b    2
# c    3
# dtype: int64