note.nkmk.me

pandas: Rename index / columns names (labels) of DataFrame

Posted: 2019-07-12 / Modified: 2020-09-24 / Tags: Python, pandas

You can change the index / column names (labels) of pandas.DataFrame with rename(), add_prefix(), and add_suffix(), or by updating the index / columns attributes.

The same methods can be used to rename the labels of pandas.Series.

This post describes the following contents with sample code.

  • pandas.DataFrame.rename()
    • Change multiple names (labels)
    • Change the original object: inplace
    • Rename with functions or lambda expressions
  • pandas.DataFrame.add_prefix(), pandas.DataFrame.add_suffix()
  • Update the index / columns attributes of pandas.DataFrame
    • Replace all index / columns names (labels)
  • For pandas.Series

The set_index() method, which sets an existing column as the index, is also available; it is covered in a separate post.

As an example, create pandas.DataFrame as follows:

import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

pandas.DataFrame.rename()

You can use the rename() method of pandas.DataFrame to change any row / column name individually.

Pass a dict of the form {original name: new name} to the index and / or columns parameter of rename().

index is for row names and columns is for column names. If you only want to change one of them, specify only index or only columns.

A new DataFrame is returned; the original DataFrame is not changed.

df_new = df.rename(columns={'A': 'a'}, index={'ONE': 'one'})
print(df_new)
#         a   B   C
# one    11  12  13
# TWO    21  22  23
# THREE  31  32  33

print(df)
#         A   B   C
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
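
As a small supplementary sketch (not part of the original sample code), the following shows what happens when the dict contains a name that does not exist. By default such keys are ignored; the errors parameter of rename() can be set to 'raise' to get a KeyError instead. The label 'Z' and the variable names df_ignored and raised are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

# 'Z' is not a column name, so it is skipped; only 'A' is renamed.
df_ignored = df.rename(columns={'A': 'a', 'Z': 'z'})
print(df_ignored.columns.tolist())
# ['a', 'B', 'C']

# With errors='raise', a name that is not found raises KeyError.
try:
    df.rename(columns={'Z': 'z'}, errors='raise')
    raised = False
except KeyError:
    raised = True

print(raised)
# True
```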

Change multiple names (labels)

Multiple index / column names can be changed at once by adding elements to the dict.

print(df.rename(columns={'A': 'a', 'C': 'c'}))
#         a   B   c
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
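
When the old and new names are already held in two lists, the dict can be built with zip(). This is only a sketch of one possible approach; the lists old_names and new_names are hypothetical and assumed to have the same length.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31],
                   'B': [12, 22, 32],
                   'C': [13, 23, 33]},
                  index=['ONE', 'TWO', 'THREE'])

old_names = ['A', 'C']  # hypothetical: names to replace
new_names = ['a', 'c']  # hypothetical: replacements, in the same order

# Pair each old name with its new name: {'A': 'a', 'C': 'c'}
mapping = dict(zip(old_names, new_names))

df_new = df.rename(columns=mapping)
print(df_new.columns.tolist())
# ['a', 'B', 'c']
```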

Change the original object: inplace

By default the original DataFrame is not changed, and a new DataFrame is returned.

Setting the parameter inplace to True changes the original DataFrame. In this case, no new DataFrame is returned, and the return value is None.

df_org = df.copy()
df_org.rename(columns={'A': 'a'}, index={'ONE': 'one'}, inplace=True)
print(df_org)
#         a   B   C
# one    11  12  13
# TWO    21  22  23
# THREE  31  32  33
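
The return value can be checked directly. The short sketch below stores it in a variable (result, a name chosen for this illustration) to show that it is None, which is why writing df = df.rename(..., inplace=True) would lose the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31], 'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

df_org = df.copy()

# With inplace=True, df_org itself is modified and nothing is returned.
result = df_org.rename(columns={'A': 'a'}, inplace=True)

print(result)
# None

print(df_org.columns.tolist())
# ['a', 'B']
```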

Rename with functions or lambda expressions

Functions (callable objects) can also be specified for the index and columns parameters of the rename() method. The function is applied to each name.

Applying a function to convert upper and lower case:

print(df.rename(columns=str.lower, index=str.title))
#         a   b   c
# One    11  12  13
# Two    21  22  23
# Three  31  32  33

It is also possible to apply lambda expressions.

print(df.rename(columns=lambda s: s*3, index=lambda s: s + '!!'))
#          AAA  BBB  CCC
# ONE!!     11   12   13
# TWO!!     21   22   23
# THREE!!   31   32   33

pandas.DataFrame.add_prefix(), pandas.DataFrame.add_suffix()

The add_prefix() and add_suffix() methods add prefixes and suffixes to column names.

The string specified as the argument is added to the beginning or the end of each column name.

print(df.add_prefix('X_'))
#        X_A  X_B  X_C
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

print(df.add_suffix('_X'))
#        A_X  B_X  C_X
# ONE     11   12   13
# TWO     21   22   23
# THREE   31   32   33

By default, add_prefix() and add_suffix() only process columns. If you want to add prefixes or suffixes to the index, pass a lambda expression to the index parameter of rename() as described above. (In pandas 2.0 and later, add_prefix() and add_suffix() also accept an axis argument.)
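
A minimal sketch of the rename() approach for the index, reusing the 'X_' and '_X' strings from the examples above (the variable names df_prefixed and df_suffixed are chosen for this illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31], 'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# Add a prefix to every index label; column names stay as they are.
df_prefixed = df.rename(index=lambda s: 'X_' + s)
print(df_prefixed.index.tolist())
# ['X_ONE', 'X_TWO', 'X_THREE']

# Add a suffix to every index label.
df_suffixed = df.rename(index=lambda s: s + '_X')
print(df_suffixed.index.tolist())
# ['ONE_X', 'TWO_X', 'THREE_X']
```

Note that this only works as written when the index labels are strings; for other label types the lambda would need to convert them first, for example with str().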

Also, add_prefix() and add_suffix() do not have an inplace parameter. If you want to update the original object, overwrite it with the returned value, like df = df.add_prefix().
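
The difference between calling the method and overwriting the variable can be sketched as follows:

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31], 'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

# Calling the method without assigning the result leaves df as it was.
df.add_prefix('X_')
print(df.columns.tolist())
# ['A', 'B']

# Assigning the returned DataFrame back to the same name updates it.
df = df.add_prefix('X_')
print(df.columns.tolist())
# ['X_A', 'X_B']
```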

Update the index / columns attributes of pandas.DataFrame

Replace all index / columns names (labels)

If you want to replace all row and column names with new ones, it is easier to update the index and columns attributes of pandas.DataFrame than to use the rename() method.

Lists and tuples can be assigned to the index and columns attributes.

df.index = [1, 2, 3]
df.columns = ['a', 'b', 'c']

print(df)
#     a   b   c
# 1  11  12  13
# 2  21  22  23
# 3  31  32  33

Note that an error will occur if the size of the list (number of elements) does not match the number of rows / columns.

# df.index = [1, 2, 3, 4]
# ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
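
If the new labels come from elsewhere and their length is not guaranteed, the error can be caught. The following is only a sketch; new_index and failed are names invented for this illustration. After the failed assignment, the original index is still in place.

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 21, 31], 'B': [12, 22, 32]},
                  index=['ONE', 'TWO', 'THREE'])

new_index = [1, 2, 3, 4]  # one label too many for three rows

try:
    df.index = new_index
    failed = False
except ValueError as e:
    failed = True
    print(e)  # prints the length mismatch message

print(failed)
# True

# The assignment did not take effect, so the old labels remain.
print(df.index.tolist())
# ['ONE', 'TWO', 'THREE']
```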

For pandas.Series

You can change the labels (index) of pandas.Series in the same way as in the previous examples for pandas.DataFrame.

As an example, create pandas.Series as follows:

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])
print(s)
# ONE      1
# TWO      2
# THREE    3
# dtype: int64

pandas.Series.rename()

print(s.rename({'ONE': 'a', 'THREE': 'c'}))
# a      1
# TWO    2
# c      3
# dtype: int64

print(s.rename(str.lower))
# one      1
# two      2
# three    3
# dtype: int64
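
One point to watch with pandas.Series.rename(): a dict or a function changes the index labels as above, but a scalar such as a plain string sets the name attribute of the Series instead. A short sketch, with 'value' as an arbitrary example name:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['ONE', 'TWO', 'THREE'])

# A scalar argument sets Series.name; the index labels are untouched.
s_named = s.rename('value')

print(s_named.name)
# value

print(s_named.index.tolist())
# ['ONE', 'TWO', 'THREE']
```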

pandas.Series.add_prefix(), pandas.Series.add_suffix()

print(s.add_prefix('X_'))
# X_ONE      1
# X_TWO      2
# X_THREE    3
# dtype: int64

print(s.add_suffix('_X'))
# ONE_X      1
# TWO_X      2
# THREE_X    3
# dtype: int64

Update the index attribute of pandas.Series

s.index = ['a', 'b', 'c']
print(s)
# a    1
# b    2
# c    3
# dtype: int64