The Ultimate Guide to Getting Started in Data Science
Nelson chege
Posted on April 2, 2022
Believing that artificial intelligence and machine learning are the next big step in the evolution of computer technology, I found no better time to start learning data science than now.
Thanks to @LuxAcademy and @DataFestAfrica, I have now started my journey to becoming a data scientist.
Having completed the first week of the Data Science and Machine Learning Bootcamp Marathon, here are the topics I have been able to learn during that time:
Introduction to Python
Having used Python for some time, I found this a good refresher on some basic concepts that I had forgotten. It was also a reminder that the things I used to consider hard now feel like a breeze. (To anyone who thinks it's hard: just give it time.)
Anaconda and Jupyter Notebook installation
For a data scientist, having Anaconda installed on your machine helps a lot. Because Anaconda ships with Jupyter Notebook, it was easy to get everything running quickly.
Relational databases
Relational databases have been on the market for over 40 years. They store data in tables that are related to each other. Designing and implementing one may take a little more work than a non-relational database, but how easily you can query the data afterwards makes that work worthwhile.
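To make the idea of related tables concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are made up for illustration and are not from the bootcamp material.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two related tables (hypothetical schema): each enrollment points at a student.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(id),
        course TEXT NOT NULL
    )
    """
)

# Made-up sample data.
conn.executemany(
    "INSERT INTO students (id, name) VALUES (?, ?)",
    [(1, "Amina"), (2, "Brian")],
)
conn.executemany(
    "INSERT INTO enrollments (student_id, course) VALUES (?, ?)",
    [(1, "Python"), (1, "SQL"), (2, "Python")],
)

# The payoff of the up-front design: one query joins the related rows.
rows = conn.execute(
    """
    SELECT s.name, e.course
    FROM students AS s
    JOIN enrollments AS e ON e.student_id = s.id
    ORDER BY s.name, e.course
    """
).fetchall()

print(rows)
conn.close()
```

The design work is in deciding that enrollments should reference students by id. Once that is done, the join is a single statement.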
NumPy
Python's built-in list has its advantages and disadvantages. One disadvantage is that numerical work on large lists is slow. That's where NumPy comes in: it is a Python package that provides an array type, which is faster than a list for this kind of work because the package is implemented largely in C. NumPy is also used by the other packages that I discuss in this post.
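As a rough sketch of the difference, the snippet below doubles the same numbers first with a plain list and then with a NumPy array. The values and variable names are made up for illustration.

```python
import numpy as np

# Made-up sample data: hours spent on four topics.
hours = [2.0, 3.5, 1.5, 4.0]

# With a plain list, we loop over every element in Python.
doubled_list = [h * 2 for h in hours]

# With a NumPy array, the multiplication is applied to the whole array at once.
hours_array = np.array(hours)
doubled_array = hours_array * 2

# Arrays also come with built-in numerical methods such as sum().
total = hours_array.sum()

print(doubled_list)
print(doubled_array)
print(total)
```

On four numbers the speed difference is invisible. It starts to matter once the data runs into the hundreds of thousands of values.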
Pandas
Pandas builds on NumPy arrays to provide two main data structures: Series and DataFrame. Of the two, the DataFrame is the more widely used. You can think of a DataFrame as a table with rows and labelled columns, similar in shape to a matrix.
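Here is a small sketch of both structures. The names, topics, and scores are invented for illustration.

```python
import pandas as pd

# A Series is a single labelled column of values (made-up scores).
scores = pd.Series([78, 91, 64], index=["Amina", "Brian", "Cate"], name="score")

# A DataFrame is a table: several columns that share the same rows.
df = pd.DataFrame(
    {
        "name": ["Amina", "Brian", "Cate"],
        "topic": ["NumPy", "Pandas", "NumPy"],
        "score": [78, 91, 64],
    }
)

# Filter the rows by one column and summarise another.
numpy_mean = df.loc[df["topic"] == "NumPy", "score"].mean()

print(scores)
print(df)
print(numpy_mean)
```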
Matplotlib
As humans, we are visual creatures. That's where Matplotlib comes in: it is a package with built-in chart types for representing data visually.
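A minimal sketch of a bar chart is shown below. The data and the output file name are made up, and the chart is saved to a file rather than displayed so the script also runs outside a notebook.

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Made-up sample data: hours spent per topic.
topics = ["Python", "NumPy", "Pandas", "Matplotlib"]
hours = [2.0, 3.5, 1.5, 4.0]

# Create a figure with a single set of axes and draw one bar per topic.
fig, ax = plt.subplots()
ax.bar(topics, hours)
ax.set_xlabel("Topic")
ax.set_ylabel("Hours studied")
ax.set_title("Week one study time")

# Hypothetical output file name.
output_path = Path("study_hours.png")
fig.savefig(output_path)
plt.close(fig)
```

In a Jupyter notebook you would normally let the figure render inline instead of saving it.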
Seaborn
As mentioned earlier, we are visual creatures, and we also want our visual representations to look nice. Seaborn is a package that extends the Matplotlib library and adds more styling to the graphs.
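As a small sketch of that extra styling, the snippet below applies one of Seaborn's built-in themes and draws a bar chart. The data and the output file name are invented for illustration.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import seaborn as sns

# Made-up sample data: hours spent per topic.
topics = ["Python", "NumPy", "Pandas", "Matplotlib"]
hours = [2.0, 3.5, 1.5, 4.0]

# Apply one of Seaborn's built-in themes to all following plots.
sns.set_theme(style="whitegrid")

# barplot returns a regular Matplotlib Axes, so the usual Axes methods still work.
ax = sns.barplot(x=topics, y=hours)
ax.set(xlabel="Topic", ylabel="Hours studied", title="Week one study time")

# Hypothetical output file name.
styled_path = Path("study_hours_seaborn.png")
ax.figure.savefig(styled_path)
plt.close(ax.figure)
```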
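pyforest
pyforest is a package that lazily imports other popular data science packages for you. Once you have installed and imported it, you can use libraries such as pandas and NumPy under their usual aliases without writing a separate import line for each one.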
PostgreSQL Connection
Connecting PostgreSQL to your Python script looks like a huge mountain to climb, but it's very easy: with a few lines of code you can connect your PostgreSQL database to your Python script or project.
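One way those few lines might look is sketched below, assuming the psycopg2 driver is installed and a PostgreSQL server is reachable. The post does not say which driver the bootcamp used, so psycopg2 is an assumption, as are the function names and the default values (including the database name "bootcamp").

```python
import os


def connection_settings():
    """Read connection settings from environment variables.

    The defaults below are placeholders for illustration only.
    """
    return {
        "host": os.environ.get("PGHOST", "localhost"),
        "port": int(os.environ.get("PGPORT", "5432")),
        "dbname": os.environ.get("PGDATABASE", "bootcamp"),
        "user": os.environ.get("PGUSER", "postgres"),
        "password": os.environ.get("PGPASSWORD", ""),
    }


def fetch_server_version(settings):
    """Connect, run one query, and return the server's version string."""
    # Imported here so the file still loads when the driver is not installed.
    import psycopg2

    conn = psycopg2.connect(**settings)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version();")
            return cur.fetchone()[0]
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

With a server running, calling fetch_server_version(connection_settings()) should return the server's version string. Keeping the password in an environment variable rather than in the script is a common precaution.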
Having just completed a week of the Data Science and Machine Learning Bootcamp Marathon, I find it exciting that I was able to learn all this in that time. I won't say I have mastered everything here, but I can say that I have learnt the basic tools I will need in my data science career.
Here are some of the resources for the topics discussed above:
Get Started with Pandas In 5 mins: https://medium.com/bhavaniravi/python-pandas-tutorial-92018da85a33
A complete guide on NumPy for data science: https://medium.com/nerd-for-tech/a-complete-guide-on-numpy-for-data-science-c54f47dfef8d
An Introduction to Matplotlib: https://www.simplilearn.com/tutorials/python-tutorial/matplotlib
A Beginner’s Guide to matplotlib for Data Visualization and Exploration in Python: https://medium.com/analytics-vidhya/a-beginners-guide-to-matplotlib-for-data-visualization-and-exploration-in-python-3fb32d03c3cd
Seaborn — A Step by Step Guide to Catch Your Audience Using Data Visualization: https://python.plainenglish.io/seaborn-a-step-by-step-guide-to-catch-your-audience-part-1-42d9e6e30bea
Starting with Matplotlib and Seaborn: https://medium.datadriveninvestor.com/starting-with-matplotlib-and-seaborn-cba16c7beabf
Understand how to use pyforest to simplify package imports.
Auto Import Python Libraries (Using Pyforest to import important python libraries): https://towardsdatascience.com/auto-import-python-libraries-d095a11b4cca
How to connect to a Postgres database with Python: https://medium.com/analytics-vidhya/how-to-setup-a-python-application-with-a-postgres-database-f965e7c1581e
Thanks for reading.
Happy coding!