In this tutorial, we will explore two important data selection
tools in Python’s Pandas library: .loc and
.iloc. Understanding when and how to use each of them is
crucial for data manipulation and analysis in Python.

What are .loc and .iloc?

Pandas is an open-source library that provides data structures and
data analysis tools for Python. The .loc and
.iloc indexers are used to access a group of rows and
columns in a Pandas DataFrame.

.loc is label-based, which means that you pass the label (name) of
the row or column you want to select. When you slice with a range of
labels, the result includes the last element of the range, unlike
.iloc.

.iloc is integer position-based, so you pass integer positions to
select specific rows or columns. Unlike .loc, a slice does not include
the last element of the range.

Let’s create a DataFrame and use this for our examples.
# Import Pandas library
import pandas as pd
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
df
## Name Age City
## 0 Alice 25 New York
## 1 Bob 30 Los Angeles
## 2 Charlie 35 Chicago
## 3 David 40 Houston
## 4 Eve 45 Phoenix
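
The inclusive versus exclusive slicing behaviour described earlier is easy to miss, so here is a minimal sketch of it. It rebuilds the same sample DataFrame so that it runs on its own; the variable names loc_slice and iloc_slice are only illustrative. Because the DataFrame uses the default integer index, the labels and the positions happen to be the same numbers, which makes the difference in the endpoint easy to see.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# .loc slices by label and includes both endpoints: labels 1, 2 and 3
loc_slice = df.loc[1:3]

# .iloc slices by position and excludes the stop: positions 1 and 2
iloc_slice = df.iloc[1:3]

print(loc_slice['Name'].tolist())   # ['Bob', 'Charlie', 'David']
print(iloc_slice['Name'].tolist())  # ['Bob', 'Charlie']
```

The same slice expression therefore returns three rows with .loc and two rows with .iloc.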

Using .loc

.loc is used for label-based indexing, that is, selecting data
by row and column labels. It also accepts a boolean condition, which is
what we use here to select the data of the person named ‘Charlie’.
# Selecting the data of 'Charlie' using .loc
charlie_data = df.loc[df['Name'] == 'Charlie']
charlie_data
## Name Age City
## 2 Charlie 35 Chicago
We can also select specific columns along with the conditions.
# Select the Age and City of 'Charlie' using .loc
charlie_info = df.loc[df['Name'] == 'Charlie', ['Age', 'City']]
charlie_info
## Age City
## 2 35 Chicago
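
The examples above pass a boolean condition to .loc. To show selection by an actual label, the sketch below assumes we make the Name column the row index with set_index; that step and the variable names (by_name, charlie, charlie_city, bob_to_david) are illustrative additions, not part of the original examples.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Assumption for illustration: use the Name column as the row labels
by_name = df.set_index('Name')

# A single row label returns that row as a Series
charlie = by_name.loc['Charlie']

# A row label plus a column label returns a single value
charlie_city = by_name.loc['Charlie', 'City']

# Label slices include both endpoints: Bob, Charlie and David
bob_to_david = by_name.loc['Bob':'David', 'Age']

print(charlie_city)           # Chicago
print(bob_to_david.tolist())  # [30, 35, 40]
```

With meaningful row labels, .loc reads close to a lookup by name, and no boolean mask is needed.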

Using .iloc

.iloc is used for integer-based indexing. It selects
elements by their position, counting from 0. Let’s select the
data of the person at position 3 (the fourth person in
the DataFrame).
# Selecting the data of the person at index 3 using .iloc
person_at_index_3 = df.iloc[3]
person_at_index_3
## Name David
## Age 40
## City Houston
## Name: 3, dtype: object
We can also select specific columns by passing integers.
# Select the Name and City of the person at index 3 using .iloc
person_info = df.iloc[3, [0, 2]]
person_info
## Name David
## City Houston
## Name: 3, dtype: object
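
With the default index, labels and positions coincide, so .loc[3] and .iloc[3] return the same row. The sketch below shows one way they can diverge, assuming we sort the sample DataFrame by Age in descending order (the sort and the variable names are illustrative additions). It also shows negative positions and position slices with .iloc.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# After sorting, the original integer labels travel with their rows
sorted_df = df.sort_values('Age', ascending=False)

first_by_position = sorted_df.iloc[0]  # first row in the sorted order: Eve
row_with_label_0 = sorted_df.loc[0]    # row whose label is 0: Alice

# Negative positions count from the end
last_row = df.iloc[-1]                 # Eve

# Position slices exclude the stop: rows 0-1, columns Name and Age
top_left = df.iloc[:2, 0:2]

print(first_by_position['Name'])  # Eve
print(row_with_label_0['Name'])   # Alice
```

Once rows are reordered or filtered, .iloc follows the current order of the rows, while .loc follows the labels attached to them.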

- Use .loc for label-based indexing.
- Use .iloc for integer-based indexing.
- When slicing, .loc includes the last element in the range, while
  .iloc does not.