pandas: Get first/last n rows of DataFrame with head() and tail()

Modified: | Tags: Python, pandas

In pandas, the head() and tail() methods are used to get the first and last n rows of a DataFrame, as well as the first and last n elements of a Series.

Another method useful for examining data in a large DataFrame or Series is sample(), which randomly samples rows or columns.
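As a quick illustration of sample(), the following minimal sketch draws random rows and a random column. The DataFrame, the variable names, and the random_state value are chosen only for this example. Which rows are returned depends on the seed, so no output is shown.

```python
import pandas as pd

# Small example DataFrame (made up for this sketch)
df_sample = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)})

# Draw 3 rows at random; random_state makes the draw reproducible
picked_rows = df_sample.sample(n=3, random_state=0)
print(picked_rows)

# axis=1 samples columns instead of rows
picked_col = df_sample.sample(n=1, axis=1, random_state=0)
print(picked_col.columns.tolist())
```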

The pandas version used in this article is as follows. Note that functionality may vary between versions. The following DataFrame with 10 rows is used as an example.

import pandas as pd

print(pd.__version__)
# 2.1.4

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])
print(df)
#       col_0  col_1
# row_0     A      9
# row_1     B      8
# row_2     C      7
# row_3     D      6
# row_4     E      5
# row_5     F      4
# row_6     G      3
# row_7     H      2
# row_8     I      1
# row_9     J      0

The following examples use DataFrame, but Series also supports the head() and tail() methods in the same manner.
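For reference, here is a minimal sketch of head() and tail() on a Series. The Series s is made up for this illustration and mirrors the col_1 column of the example DataFrame.

```python
import pandas as pd

# Example Series (an assumption for this sketch)
s = pd.Series(range(9, -1, -1), index=[f'row_{i}' for i in range(10)])

# First 3 elements
print(s.head(3))
# row_0    9
# row_1    8
# row_2    7
# dtype: int64

# Last 2 elements
print(s.tail(2))
# row_8    1
# row_9    0
# dtype: int64
```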

Get the first n rows: head()

The head() method returns the first n rows.

By default, the first 5 rows are returned.

print(df.head())
#       col_0  col_1
# row_0     A      9
# row_1     B      8
# row_2     C      7
# row_3     D      6
# row_4     E      5

You can specify the number of rows as the first argument, n.

print(df.head(3))
#       col_0  col_1
# row_0     A      9
# row_1     B      8
# row_2     C      7
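Two edge cases of n may be worth knowing. The sketch below, which rebuilds the same example DataFrame so that it runs on its own, shows that a negative n returns all rows except the last |n|, and that an n larger than the number of rows simply returns every row.

```python
import pandas as pd

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])

# Negative n: all rows except the last 8, i.e. the first 2 rows
print(df.head(-8))
#       col_0  col_1
# row_0     A      9
# row_1     B      8

# n larger than the number of rows: all 10 rows are returned, no error
print(len(df.head(100)))
# 10
```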

Get the last n rows: tail()

The tail() method returns the last n rows.

By default, the last 5 rows are returned.

print(df.tail())
#       col_0  col_1
# row_5     F      4
# row_6     G      3
# row_7     H      2
# row_8     I      1
# row_9     J      0

You can specify the number of rows as the first argument, n.

print(df.tail(3))
#       col_0  col_1
# row_7     H      2
# row_8     I      1
# row_9     J      0
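tail() treats n symmetrically. As a minimal sketch on the same example data (rebuilt here so the block is self-contained), a negative n returns all rows except the first |n|, and n=0 returns an empty DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])

# Negative n: all rows except the first 8, i.e. the last 2 rows
print(df.tail(-8))
#       col_0  col_1
# row_8     I      1
# row_9     J      0

# n=0: no rows, but the columns are kept
print(len(df.tail(0)))
# 0
print(df.tail(0).columns.tolist())
# ['col_0', 'col_1']
```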

Get rows by specifying row numbers: slice

You can get rows at any position by specifying row numbers with a slice. As with ordinary Python slices, positions start at 0 and the stop position is not included.

print(df[3:6])
#       col_0  col_1
# row_3     D      6
# row_4     E      5
# row_5     F      4

It is also possible to perform similar operations to head() and tail() using slices.

print(df[:5])
#       col_0  col_1
# row_0     A      9
# row_1     B      8
# row_2     C      7
# row_3     D      6
# row_4     E      5

print(df[-5:])
#       col_0  col_1
# row_5     F      4
# row_6     G      3
# row_7     H      2
# row_8     I      1
# row_9     J      0
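To check that these slices really match head() and tail(), the following sketch compares the results with equals(). It also shows iloc, which makes the position-based selection explicit, and a slice with a step. The DataFrame is rebuilt here only so that the block runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])

# The slices give the same result as head() and tail()
print(df[:5].equals(df.head()))
# True
print(df[-5:].equals(df.tail()))
# True

# iloc selects the same rows by position
print(df.iloc[3:6].equals(df[3:6]))
# True

# A step can be given as well, here every third row
print(df[::3].index.tolist())
# ['row_0', 'row_3', 'row_6', 'row_9']
```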

Get the first/last row and its values

Passing 1 to head() or tail() returns the first or last row, respectively. Note, however, that even a single row is returned as a DataFrame, not as a Series.

print(df.head(1))
#       col_0  col_1
# row_0     A      9

print(type(df.head(1)))
# <class 'pandas.core.frame.DataFrame'>

Use iloc to get a single row as a Series: iloc[0] for the first row and iloc[-1] for the last row. To retrieve a specific value, use iloc[0]['column_name'] or iloc[-1]['column_name'].

print(df.iloc[0])
# col_0    A
# col_1    9
# Name: row_0, dtype: object

print(type(df.iloc[0]))
# <class 'pandas.core.series.Series'>

print(df.iloc[0]['col_0'])
# A

print(df.iloc[-1])
# col_0    J
# col_1    0
# Name: row_9, dtype: object

print(type(df.iloc[-1]))
# <class 'pandas.core.series.Series'>

print(df.iloc[-1]['col_0'])
# J
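As an alternative sketch, a single value can also be read with one indexing call by giving both the row and the column position. Using columns.get_loc() to look up the column position is just one possible approach. The block also shows that iloc with a one-element list keeps the row as a DataFrame, like head(1).

```python
import pandas as pd

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])

# A list with one position keeps the result as a DataFrame
print(df.iloc[[0]].equals(df.head(1)))
# True

# Row position and column position in a single call
print(df.iat[0, 0])
# A

# Look up the column position by name, then index by position
print(df.iloc[-1, df.columns.get_loc('col_0')])
# J
```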

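Note that when assigning values using the above approach, a SettingWithCopyWarning may occur. This is chained indexing: df.iloc[0] may return a copy of the row, so the assignment may not update the original DataFrame.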

df.iloc[0]['col_0'] = 'AAA'
# /var/folders/rf/b7l8_vgj5mdgvghn_326rn_c0000gn/T/ipykernel_48384/183824280.py:1: SettingWithCopyWarning: 
# A value is trying to be set on a copy of a slice from a DataFrame

To avoid the SettingWithCopyWarning, get the first/last row name from the index attribute and pass it to at together with the column name, so that the row and column are selected in a single indexing operation. loc can also be used, but at is faster for retrieving and assigning a single value.

df.at[df.index[0], 'col_0'] = 'AAA'
df.at[df.index[-1], 'col_0'] = 'JJJ'

print(df)
#       col_0  col_1
# row_0   AAA      9
# row_1     B      8
# row_2     C      7
# row_3     D      6
# row_4     E      5
# row_5     F      4
# row_6     G      3
# row_7     H      2
# row_8     I      1
# row_9   JJJ      0
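For comparison, the same assignments can be sketched with loc (by label) and iloc (by position). Both select the row and column in one indexing call, so no chained assignment is involved. The DataFrame is rebuilt here so that the block is self-contained.

```python
import pandas as pd

df = pd.DataFrame({'col_0': list('ABCDEFGHIJ'), 'col_1': range(9, -1, -1)},
                  index=[f'row_{i}' for i in range(10)])

# loc with the first index label and the column name
df.loc[df.index[0], 'col_0'] = 'AAA'

# iloc with the last row position and the column position
df.iloc[-1, df.columns.get_loc('col_0')] = 'JJJ'

print(df['col_0'].tolist())
# ['AAA', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'JJJ']
```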

For more details on at, iat, loc, and iloc, refer to the following article.
