Pandas Data Manipulation: A Comprehensive Guide
Labby
Posted on August 15, 2024
Introduction
This lab will guide you on how to read, write, and manipulate data using Pandas, a powerful data analysis and manipulation library for Python. We will use a dataset from the Titanic shipwreck for this exercise.
Importing Necessary Libraries
First, we need to import the necessary libraries for our task. For this lab, we will only need pandas.
# Importing pandas library
import pandas as pd
Reading Data From CSV
The next step is to read the data from a CSV file. We will use the read_csv function from pandas to do this.
# Reading data from CSV file
titanic = pd.read_csv("data/titanic.csv")
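If data/titanic.csv is not at hand, the same call can be tried on a small in-memory CSV. The following is a minimal sketch: the rows and column names are hypothetical stand-ins, not the real Titanic data, and io.StringIO is used only so the example needs no file on disk.

```python
import io

import pandas as pd

# Hypothetical sample rows (not the real Titanic data), inlined so the
# example runs without data/titanic.csv.
csv_text = """PassengerId,Name,Age,Fare
1,Passenger A,22,7.25
2,Passenger B,,71.28
3,Passenger C,26,7.92
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# The empty Age field on the second row is read as a missing value (NaN).
print(sample.shape)
print(sample["Age"].isna().sum())
```

The empty field is parsed as a missing value rather than raising an error, which is worth keeping in mind when inspecting the real dataset.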
Checking the Data
After reading the data, it's always a good idea to check what it looks like. We will display the first few rows of the DataFrame.
# Displaying the first few rows of the DataFrame
titanic.head()
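By default, head returns the first five rows. It also accepts a row count, and tail does the same from the end of the DataFrame. A small sketch on a hypothetical ten-row frame (a stand-in, not the Titanic data):

```python
import pandas as pd

# Hypothetical ten-row frame standing in for the Titanic data.
demo = pd.DataFrame(
    {
        "PassengerId": range(1, 11),
        "Pclass": [3, 1, 3, 1, 3, 3, 1, 3, 3, 2],
    }
)

first_three = demo.head(3)  # first 3 rows instead of the default 5
last_two = demo.tail(2)     # last 2 rows

print(first_three)
print(last_two)
```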
Checking the Data Types
We can check the data types of each column using the dtypes attribute of the DataFrame.
# Checking the data types of each column
titanic.dtypes
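To see how pandas assigns types, here is a minimal sketch on a hypothetical three-row frame (the column names and values are illustrative assumptions). Whole numbers become an integer dtype, a numeric column containing a missing value becomes a float dtype, and text columns typically appear as object or a string dtype, depending on the pandas version.

```python
import pandas as pd

# Hypothetical frame with one integer, one float, and one text column.
types_demo = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3],
        "Age": [22.0, None, 26.0],  # None is stored as NaN, so the column is float
        "Name": ["Passenger A", "Passenger B", "Passenger C"],
    }
)

print(types_demo.dtypes)

# A single column's type is available through its own dtype attribute.
id_dtype = types_demo["PassengerId"].dtype
age_dtype = types_demo["Age"].dtype
```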
Writing Data to Excel
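You can also write the data to an Excel file using the to_excel method. Let's save our DataFrame to an Excel file. Passing index=False keeps the row index out of the spreadsheet, and sheet_name sets the name of the worksheet.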
# Saving DataFrame to an Excel file
titanic.to_excel("titanic.xlsx", sheet_name="passengers", index=False)
Reading Data From Excel
Reading data from an Excel file is as easy as reading data from a CSV file. We will use the read_excel function from pandas.
# Reading data from an Excel file
titanic = pd.read_excel("titanic.xlsx", sheet_name="passengers")
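The write and read steps can be checked together with a small round trip. This is a sketch under a few assumptions: the rows are hypothetical, an in-memory buffer replaces titanic.xlsx, and the optional openpyxl package is used as the Excel engine (the example skips the round trip if it is not installed).

```python
import importlib.util
import io

import pandas as pd

# Hypothetical rows standing in for the Titanic data.
excel_demo = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3],
        "Name": ["Passenger A", "Passenger B", "Passenger C"],
    }
)

round_tripped = None

# .xlsx support relies on an optional engine; openpyxl is assumed here.
if importlib.util.find_spec("openpyxl") is not None:
    buffer = io.BytesIO()  # in-memory stand-in for a file on disk
    excel_demo.to_excel(
        buffer, sheet_name="passengers", index=False, engine="openpyxl"
    )
    buffer.seek(0)  # rewind before reading the workbook back
    round_tripped = pd.read_excel(
        buffer, sheet_name="passengers", engine="openpyxl"
    )
    print(round_tripped.columns.tolist())
else:
    print("openpyxl is not installed; skipping the Excel round trip")
```

Because the file was written with index=False, the columns read back are the same ones that were written, with no extra index column.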
Checking DataFrame Information
The info method provides a technical summary of a DataFrame. This can be useful to check the data types, number of non-null values, and memory usage.
# Checking DataFrame information
titanic.info()
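Note that info prints its summary instead of returning it. If the text is needed as a string, for example to log it, the buf parameter can redirect the output. A minimal sketch on a hypothetical frame with one missing value:

```python
import io

import pandas as pd

# Hypothetical frame; the Age column has one missing value.
info_demo = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3],
        "Age": [22.0, None, 26.0],
    }
)

# info() writes to stdout by default; buf redirects it to a text buffer.
buffer = io.StringIO()
info_demo.info(buf=buffer)
summary_text = buffer.getvalue()

print(summary_text)
```

In the printed summary, the non-null count for Age is lower than the number of rows, which is a quick way to spot columns with missing data.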
Summary
In this lab, we learned how to read and write CSV and Excel files using pandas, and how to inspect a DataFrame with head, dtypes, and info. Pandas provides a wide range of functionality for handling and manipulating data, making it a powerful tool for data analysis.