Introduction

In this tutorial, we will explore two important data selection tools in Python’s Pandas library: the .loc and .iloc indexers. Knowing when and how to use each one is crucial for data manipulation and analysis in Python.

What are .loc and .iloc?

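Pandas is an open-source library that provides data structures and data analysis tools for Python. .loc and .iloc are indexers (used with square brackets rather than called like methods) that select a group of rows and columns from a Pandas DataFrame: .loc selects by label or boolean mask, and .iloc selects by integer position.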

Let’s create a DataFrame to use in our examples.

# Import Pandas library
import pandas as pd

# Create a sample DataFrame
data = {
  'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
  'Age': [25, 30, 35, 40, 45],
  'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
df
##       Name  Age         City
## 0    Alice   25     New York
## 1      Bob   30  Los Angeles
## 2  Charlie   35      Chicago
## 3    David   40      Houston
## 4      Eve   45      Phoenix

Using .loc

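.loc is used for label-based indexing, that is, selecting data by row and column labels. It also accepts a boolean mask, in which case it keeps the rows where the mask is True. Let’s use such a mask to select the data of the person named ‘Charlie’.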

# Selecting the data of 'Charlie' using .loc
charlie_data = df.loc[df['Name'] == 'Charlie']
charlie_data
##       Name  Age     City
## 2  Charlie   35  Chicago

We can also select specific columns along with the row condition by passing a list of column labels as the second argument.

# Select the Age and City of 'Charlie' using .loc
charlie_info = df.loc[df['Name'] == 'Charlie', ['Age', 'City']]
charlie_info
##    Age     City
## 2   35  Chicago
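
The examples above pass a boolean condition to .loc. Since .loc is label-based, it also accepts row and column labels directly. The following is a minimal sketch, assuming the same sample DataFrame with its default integer row labels 0 to 4; the variable names charlie_age and subset are only illustrative. Note that a label slice includes both endpoints.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# One row label and one column label return a single value
charlie_age = df.loc[2, 'Age']

# Label slices on rows and columns; both endpoints are included,
# so this keeps row labels 1, 2, 3 and columns Name through Age
subset = df.loc[1:3, 'Name':'Age']
```

Here the row labels happen to be integers because the DataFrame uses the default index, but .loc still treats them as labels rather than positions.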

Using .iloc

.iloc is used for integer-based indexing: it selects elements by their position, counting from 0. Let’s select the data of the person at position 3, which is the fourth row of the DataFrame.

# Selecting the data of the person at index 3 using .iloc
person_at_index_3 = df.iloc[3]
person_at_index_3
## Name      David
## Age          40
## City    Houston
## Name: 3, dtype: object
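
.iloc also accepts slices and negative positions, following the usual Python conventions. The sketch below assumes the same sample DataFrame; first_three and last_row are illustrative names. Unlike a label slice with .loc, a position slice excludes its end point.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Positions 0, 1 and 2; the end position 3 is excluded
first_three = df.iloc[0:3]

# Negative positions count from the end, so -1 is the last row
last_row = df.iloc[-1]
```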

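We can also select specific columns by passing their integer positions as the second argument. Here, positions 0 and 2 correspond to the Name and City columns.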

# Select the Name and City of the person at index 3 using .iloc
person_info = df.iloc[3, [0, 2]]
person_info
## Name      David
## City    Houston
## Name: 3, dtype: object
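
With the default index, row labels and row positions are the same numbers, which can hide the difference between the two indexers. As a rough illustration, the sketch below assumes we make the Name column the index (by_name is an illustrative variable name), so that labels and positions no longer coincide.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Use the names as row labels instead of 0..4
by_name = df.set_index('Name')

# .loc looks the row up by its label
by_label = by_name.loc['David']

# .iloc still looks the row up by its position (fourth row)
by_position = by_name.iloc[3]

# Both refer to the same row here
same_row = by_label.equals(by_position)

# by_name.loc[3] would raise a KeyError, because 3 is no longer a row label
```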

Summary