Introduction

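One-hot encoding converts a categorical variable into a set of binary indicator columns, one per category, with a 1 marking the category each row belongs to. Many machine learning algorithms only accept numeric input, so this representation lets them use categorical features without implying an order among the categories.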
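
To make the idea concrete, here is a minimal sketch in plain Python, before any library is involved. The helper `one_hot` is a hypothetical name introduced only for illustration; it is not part of pandas, and it orders the categories alphabetically as an assumption.

```python
# Hypothetical helper for illustration only; not a pandas function.
def one_hot(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

categories, rows = one_hot(['Male', 'Male', 'Female'])
print(categories)  # ['Female', 'Male']
print(rows)        # [[0, 1], [0, 1], [1, 0]]
```

Each row contains exactly one 1, in the position of the category that row belongs to.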

Creating an Example DataFrame

Let’s create an example DataFrame with some categorical features.

import pandas as pd

# Creating example data
data = {'Name': ['John', 'Mike', 'Sara'],
        'Gender': ['Male', 'Male', 'Female'],
        'Country': ['USA', 'Canada', 'Australia']}

# Create DataFrame
df = pd.DataFrame(data)

# Display the DataFrame
print(df)
##    Name  Gender    Country
## 0  John    Male        USA
## 1  Mike    Male     Canada
## 2  Sara  Female  Australia

One-Hot Encoding

We will use the pandas function pd.get_dummies to one-hot encode the ‘Gender’ and ‘Country’ columns; the ‘Name’ column is left as it is.

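# Perform one-hot encoding; dtype=int keeps the indicators as 0/1
# (pandas 2.0 and later would otherwise return True/False)
encoded_df = pd.get_dummies(df, columns=['Gender', 'Country'], dtype=int)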

# Display the encoded DataFrame
print(encoded_df)
##    Name  Gender_Female  ...  Country_Canada  Country_USA
## 0  John              0  ...               0            1
## 1  Mike              0  ...               1            0
## 2  Sara              1  ...               0            0
## 
## [3 rows x 6 columns]
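
The printed table above is truncated by the display width, so some columns are hidden. The following sketch shows two ways to inspect all of them. It rebuilds the same example data so it can run on its own. The `dtype=int` argument and the display width of 200 are assumptions added here, not part of the original example.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained
df = pd.DataFrame({'Name': ['John', 'Mike', 'Sara'],
                   'Gender': ['Male', 'Male', 'Female'],
                   'Country': ['USA', 'Canada', 'Australia']})

# dtype=int is an assumption: it requests 0/1 indicators on any pandas version
encoded_df = pd.get_dummies(df, columns=['Gender', 'Country'], dtype=int)

# Option 1: list every column name instead of relying on the truncated display
print(encoded_df.columns.tolist())

# Option 2: temporarily lift the column limit for a single print
with pd.option_context('display.max_columns', None, 'display.width', 200):
    print(encoded_df)
```

The list should contain six names, matching the `[3 rows x 6 columns]` summary: the untouched `Name` column plus one indicator column per distinct value of `Gender` and `Country`.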

Conclusion

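We have one-hot encoded the categorical ‘Gender’ and ‘Country’ columns using pandas in Python. The resulting indicator columns are numeric and can be passed to machine learning models; the ‘Name’ column is still text and would need to be dropped or handled separately before training.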
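
As a rough sketch of that last step, the snippet below turns the encoded DataFrame into a purely numeric feature matrix. Dropping ‘Name’ is an assumption made for illustration (an identifier column is rarely a useful feature), and the variable name `X` is just a common convention, not something the example above defines.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained
df = pd.DataFrame({'Name': ['John', 'Mike', 'Sara'],
                   'Gender': ['Male', 'Male', 'Female'],
                   'Country': ['USA', 'Canada', 'Australia']})
encoded_df = pd.get_dummies(df, columns=['Gender', 'Country'], dtype=int)

# Assumption: 'Name' is an identifier, so it is dropped rather than encoded
X = encoded_df.drop(columns=['Name']).to_numpy()

print(X.shape)  # (3, 5): three rows, five indicator columns
```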