Journey In Learning Pandas
Bro Karim
Posted on November 8, 2024
This is not a course or tutorial. These are just my own notes that I wrote while learning python3.
## Course I use (it's free)
First of all, I am not using any course to learn, only this useful website: datawars.io
## Install Jupyter
Before I started learning there, I already knew there were some tools I needed to install on my device, like python3 and Jupyter Notebook. Since I am using a MacBook, I followed this tutorial:
https://youtu.be/pkjtbnsX7Yw?si=kq85ZuhM1wLYle8S
Key points
- A virtual environment is a self-contained directory that contains its own Python installation, along with its own set of installed packages. This is useful for managing dependencies and keeping your projects isolated from one another.
- Running the command
source bin/activate
is the next step; it activates your virtual environment.
- So whenever you want to start Jupyter Notebook, you must go to the folder and run
source bin/activate
then
jupyter notebook
At least, these are the steps on a MacBook.
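A quick way to confirm that the environment is really active is to ask Python itself. This is only a minimal sketch, assuming the environment was created with Python 3's built-in venv module; the function name in_virtual_env is made up for illustration.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Assumption: the environment was created with Python 3's venv module.
    Inside such an environment sys.prefix points at the environment
    folder, while sys.base_prefix still points at the base installation.
    """
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_env())
print("interpreter in use:", sys.executable)
```

Running this in a notebook cell or in the terminal after source bin/activate should show an interpreter path inside the environment folder.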
Key points
- Jupyter is cell-based
- It's free and open source software. Anybody can inspect the source code, modify it and use it for free
- It's language agnostic. You can use Python, R, Julia or pretty much any other language
- It's web based, which means it can run on any browser
- It's a mature project with a huge community.
## Second tool: VisiData
VisiData is an interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, into a lightweight utility which can handle millions of rows with ease.
It helps you explore, clean, edit, and restructure tabular data directly from the terminal. It supports a wide range of data formats, including CSV files, Excel spreadsheets, SQL databases, and more. With VisiData, you can interactively select, filter, and group rows, rearrange and transform columns, and even create ad-hoc data pipelines. Its intuitive interface and extensive keyboard shortcuts make it an efficient tool for data analysis and manipulation.
Installation & more: https://www.visidata.org/docs/
To run VisiData, just type:
vd [options] [input ...]
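To have something small to open, a sample file can be generated with Python first. This is just a sketch: the rows and the file name visidata_sample.csv are invented for illustration and do not come from any real dataset.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows, made up only to have something to look at.
rows = [
    {"name": "Ana", "city": "Jakarta", "score": 90},
    {"name": "Budi", "city": "Bandung", "score": 82},
    {"name": "Citra", "city": "Bandung", "score": 75},
]

# Write the rows into the system temp folder as a CSV file.
sample_path = Path(tempfile.gettempdir()) / "visidata_sample.csv"
with sample_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "city", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Print the command to paste into the terminal.
print(f"vd {sample_path}")
```

Pasting the printed command into the terminal opens the file in VisiData, following the vd [input ...] form shown above.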
## Intro to Pandas
Pandas is our Swiss Army knife when it comes to Data Analysis/Science in Python. We use it to:
- Load/dump read/write data: to and from different formats (CSV, XML, HTML, Excel, JSON, even from the Internet)
- Analyze data: perform statistical analysis, query the data, find inconsistencies, etc
- Data cleaning: finding missing values, duplicate data, invalid or broken values, etc
- Visualizations: with support from matplotlib, we can quickly visualize data
- Data Wrangling/Munging: a not-so-scientific term that involves data handling: merging multiple data sources, creating derived representations, grouping data, etc.
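A few of the points above can be tried in a handful of lines. This is a small sketch, not a full workflow: the CSV text is invented for illustration, and it only touches loading, cleaning, a simple statistic, and grouping.

```python
import io

import pandas as pd

# Hypothetical sample data, made up for illustration only.
csv_text = """name,city,score
Ana,Jakarta,90
Budi,Bandung,
Ana,Jakarta,90
Citra,Bandung,75
"""

# Load: read CSV text into a DataFrame (a file path works the same way).
df = pd.read_csv(io.StringIO(csv_text))

# Clean: count missing values in one column, then drop duplicate rows.
missing_scores = int(df["score"].isna().sum())
df = df.drop_duplicates()

# Analyze: a simple statistic; missing values are skipped by default.
mean_score = df["score"].mean()

# Wrangle: group the rows by city and count them.
rows_per_city = df.groupby("city").size()

print("missing scores:", missing_scores)
print("mean score:", mean_score)
print(rows_per_city)
```

Here pd.read_csv gets an in-memory buffer only so that the example is self-contained; with real data it would normally get a file path or URL.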
Another resource that helped me a lot in learning pandas is this repo: https://github.com/TirendazAcademy/PANDAS-TUTORIAL/tree/main