Data loading with Pandas: Loading Excel , CSV , SQL, and any data file
Alisha Rana
Posted on August 22, 2022
Whether you want to begin with Data Analysis, fetch useful information, or predict something from data, the first step is always the data loading we will be using a pandas library.
We will use a Python tool called pandas to import data from either an Excel table or a SQL database.
Before getting into loading data, you must have pandas installed into your platform on which you are loading data.
I will be using Jupyter Notebook , you can easily get it in Anaconda
To install pandas run the following command in Jupyter Notebook cell:
!pip install pandas
Or else you can install in Python Environment as well, but that's not the focus of today.
This is first class we are touching the code , so open up Jupyter Notebook if you want to code along
I have some CSV and Excel file, I will go along with
Initially, you must import the installed library pandas.
import pandas
Writing this would be enough, but because we will be using pandas a lot usually we will give it a shorthand to some alias
import pandas as pd
pd is most common that people use, we execute the cell now we have Pandas in Python.
To import or read Data
You can enter pd.read in your Notebook and hit tab you can see different ways that you can load data with you will fine various way to load data ,in this we'll have a look at the most common ones
Import Excel Files
pd.read_excel("data/crypto.xlsx")
In the parenthesis you will be giving the location where your file is stored,
Now that loading has completed, you can see that you have data in a pandas dataframe
We didn't save it in a variable.
However, you can save data in a variable as well.
data=pd.read_excel("data/crypto.xlsx")
Import CSV Files
CSV files are slightly different because they contain raw data.
pd.read_csv("data/crypto.csv")
Loading Data From SQL
A great way to store data and make it available to data scientists is through SQL databases.
Most businesses avoid using Excel files since they can be duplicated.
In addition to pandas we have to import SQLAlchemy
SQLAlchemy is a package that helps Python programmes communicate with databases.
import sqlalchemy as sql
Below this will create the connection,its called an Engine, If you have PostgreSQL database, this should be the location of your database
connect=sql.create_engine("postgresql://scott:tiger@localhost/test")
Here we go read SQL Table
data = pd.read_sql_table("sales", connect)
Loading any Data Files
Pandas works great on structured data, but sometimes data comes in weird formats. This is the general way to work with data files in Python.
with open("data/crypto.csv", mode='r') as cryptocurr:
data = cryptocurr.read()
If you only want to read the data and not alter it, you'll indicate that. mode='r'
Then we will give file a name to open, here i am giving file name as cryptocurr
Now we have a block where our file is open, within this block create a variable and will use read function after that run the cell and call the variable to get execute.
data
Hurraaah we did it!!!!!
Loading data into pandas is extremely easy.
Try it out with your own data, if you have an excel file lying around on your computer, make sure you have data in your computer nothing gets out so you can just pd.read and get in your data and play around with.
Posted on August 22, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.