note.nkmk.me

pandas: Rename columns / index names (labels) of DataFrame

Posted: 2019-07-12 / Modified: 2020-09-24 / Tags: Python, pandas

You can rename (change) the column / index names (labels) of pandas.DataFrame by using rename(), add_prefix(), or add_suffix(), or by updating the columns / index attributes.

The same methods can be used to rename the label (index) of pandas.Series.

This article covers the following topics with sample code.

  • Rename column / index: rename()
    • Change multiple names (labels)
    • Change the original object: inplace
    • Rename with functions or lambda expressions
  • Add prefix / suffix to columns: add_prefix(), add_suffix()
  • Update the columns / index attributes of pandas.DataFrame
    • Replace all column / index names (labels)
  • For pandas.Series
    • rename()
    • add_prefix(), add_suffix()
    • Update the index attributes of pandas.Series

The set_index() method, which sets an existing column as the index, is also available.

As an example, create pandas.DataFrame as follows:

import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

Rename column / index: rename()

You can use the rename() method of pandas.DataFrame to change column / index names individually.

Pass a dict of the form {original name: new name} to the columns / index arguments of rename().

columns is for column names and index is for index names. If you want to change only one of them, specify just that argument.

A new DataFrame is returned, and the original DataFrame is not changed.

df_new = df.rename(columns={'A': 'a'}, index={'ONE': 'one'})
print(df_new)
#         a   B   C
# one    11  12  13
# TWO    21  22  23
# THREE  31  32  33

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
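
As a small sketch of an alternative calling style: rename() also accepts a single mapping together with an axis argument instead of the columns / index keywords. The variable names df_cols and df_rows below are illustrative only and do not come from the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# Pass the mapping positionally and choose the axis explicitly.
df_cols = df.rename({'A': 'a'}, axis='columns')
df_rows = df.rename({'ONE': 'one'}, axis='index')

print(df_cols.columns.tolist())
# ['a', 'B', 'C']

print(df_rows.index.tolist())
# ['one', 'TWO', 'THREE']
```

Either style returns a new DataFrame, so which one to use is mostly a matter of readability.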

Change multiple names (labels)

Multiple column / index names can be changed at once by adding more entries to the dict.

print(df.rename(columns={'A': 'a', 'C': 'c'}))
#         a   B   c
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
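
One thing worth checking when the dict grows is what happens to keys that do not match any existing label. The sketch below assumes a reasonably recent pandas version in which rename() has the errors parameter; the label 'Z' and the variable names df_ignored and raised are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# 'Z' is not an existing column; by default that entry is silently ignored.
df_ignored = df.rename(columns={'A': 'a', 'Z': 'z'})
print(df_ignored.columns.tolist())
# ['a', 'B']

# With errors='raise', a missing label results in a KeyError instead.
try:
    df.rename(columns={'Z': 'z'}, errors='raise')
    raised = False
except KeyError:
    raised = True

print(raised)
# True
```

Using errors='raise' can help catch typos in the original names, at the cost of having to handle the exception.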

Change the original object: inplace

By default the original DataFrame is not changed, and a new DataFrame is returned.

Setting the parameter inplace to True changes the original DataFrame. In this case, no new DataFrame is returned, and the return value is None.

df_org = df.copy()
df_org.rename(columns={'A': 'a'}, index={'ONE': 'one'}, inplace=True)
print(df_org)
#         a   B   C
# one    11  12  13
# TWO    21  22  23
# THREE  31  32  33
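
For comparison, here is a minimal sketch of the two ways to end up with a renamed object: reassigning the returned DataFrame, or using inplace=True. The names df_a and df_b are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# Option 1: keep the default behavior and reassign the returned DataFrame.
df_a = df.copy()
df_a = df_a.rename(columns={'A': 'a'})

# Option 2: modify the object itself with inplace=True (do not reassign).
df_b = df.copy()
df_b.rename(columns={'A': 'a'}, inplace=True)

print(df_a.equals(df_b))
# True
```

Note that combining the two, as in df = df.rename(..., inplace=True), would replace df with the return value of the in-place call rather than with a DataFrame.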

Rename with functions or lambda expressions

Functions (callable objects) can also be passed to the index and columns parameters of the rename() method. The function is applied to each label.

Applying a function to convert upper and lower case:

print(df.rename(columns=str.lower, index=str.title))
#         a   b   c
# One    11  12  13
# Two    21  22  23
# Three  31  32  33

It is also possible to apply lambda expressions.

print(df.rename(columns=lambda s: s*3, index=lambda s: s + '!!'))
#          AAA  BBB  CCC
# ONE!!     11   12   13
# TWO!!     21   22   23
# THREE!!   31   32   33
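
A named function works the same way as a lambda and can be easier to reuse. The following is a sketch only: to_snake is a hypothetical helper, and the messy column names are invented for the example.

```python
import pandas as pd


def to_snake(label):
    """Hypothetical helper: trim whitespace, lower-case, replace spaces with underscores."""
    return label.strip().lower().replace(' ', '_')


df_raw = pd.DataFrame({' Unit Price': [1.5, 2.0],
                       'Item Name ': ['pen', 'ink']})

# The function is called once per column label.
df_clean = df_raw.rename(columns=to_snake)

print(df_clean.columns.tolist())
# ['unit_price', 'item_name']
```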

Add prefix / suffix to columns: add_prefix(), add_suffix()

The add_prefix() and add_suffix() methods add a prefix or a suffix to column names.

The string specified as the argument is added to the beginning or the end of each column name.

print(df.add_prefix('X_'))
#        X_A  X_B  X_C
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

print(df.add_suffix('_X'))
#        A_X  B_X  C_X
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

For a DataFrame, add_prefix() and add_suffix() rename only the columns by default. If you want to add prefixes or suffixes to the index, pass a lambda expression to the index argument of the rename() method as described above.

Also, add_prefix() and add_suffix() do not have an inplace parameter. If you want to update the original object, overwrite it, as in df = df.add_prefix().
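
The two points above can be sketched together as follows: the index gets its prefix through rename() with a lambda, and the result of add_suffix() is assigned back to the same variable. The prefix and suffix strings are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# Prefix the index labels via rename(), overwriting the variable.
df = df.rename(index=lambda s: 'X_' + s)

# Suffix the column names; there is no inplace option, so overwrite again.
df = df.add_suffix('_X')

print(df.index.tolist())
# ['X_ONE', 'X_TWO', 'X_THREE']

print(df.columns.tolist())
# ['A_X', 'B_X']
```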

Update the columns / index attributes of pandas.DataFrame

Replace all column / index names (labels)

If you want to change all column and index names, it is easier to update the columns and index attributes of pandas.DataFrame than to use the rename() method.

Lists and tuples can be assigned to the columns and index attributes. Note that this modifies the original DataFrame.

df.index = [1, 2, 3]
df.columns = ['a', 'b', 'c']

print(df)
#     a   b   c
# 1  11  12  13
# 2  21  22  23
# 3  31  32  33

Note that an error is raised if the length of the list (number of elements) does not match the number of rows or columns.

# df.index = [1, 2, 3, 4]
# ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
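
One way to avoid the length mismatch is to build the new labels from the existing ones, for example with a list comprehension. The sketch below also shows catching the ValueError; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# Deriving the new labels from the old ones keeps the length identical.
df.columns = [c.lower() for c in df.columns]
print(df.columns.tolist())
# ['a', 'b']

# Assigning a list of the wrong length raises ValueError.
try:
    df.index = [1, 2, 3, 4]
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
# True

# The failed assignment leaves the existing index in place.
print(df.index.tolist())
# ['ONE', 'TWO', 'THREE']
```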

For pandas.Series

You can change the label names (index) of pandas.Series in the same way as in the previous examples for pandas.DataFrame.

As an example, create pandas.Series as follows:

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])
print(s)
# ONE      1
# TWO      2
# THREE    3
# dtype: int64

pandas.Series.rename()

print(s.rename({'ONE': 'a', 'THREE': 'c'}))
# a      1
# TWO    2
# c      3
# dtype: int64

print(s.rename(str.lower))
# one      1
# two      2
# three    3
# dtype: int64
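
One difference from DataFrame is worth keeping in mind: when a scalar such as a string is passed to Series.rename(), it sets the name attribute of the Series instead of changing the index labels. A minimal sketch, where the name 'total' is arbitrary:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])

# A scalar argument changes Series.name, not the index labels.
s_named = s.rename('total')

print(s_named.name)
# total

print(s_named.index.tolist())
# ['ONE', 'TWO', 'THREE']
```

To change index labels, pass a dict or a function as in the examples above.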

pandas.Series.add_prefix(), pandas.Series.add_suffix()

print(s.add_prefix('X_'))
# X_ONE      1
# X_TWO      2
# X_THREE    3
# dtype: int64

print(s.add_suffix('_X'))
# ONE_X      1
# TWO_X      2
# THREE_X    3
# dtype: int64

Update the index attributes of pandas.Series

s.index = ['a', 'b', 'c']
print(s)
# a    1
# b    2
# c    3
# dtype: int64