For the last two months, I have been spending many, many hours working on an open source Python library (instascrape). Despite all this time spent, the library's codebase is still relatively small.
So where were all these many, many hours working spent? Designing an API people actually want to use.
For every hour I spend coding, I spend like five hours doing work on the library that isn't coding. Despite loving to code more than anything, I love seeing others love to code as much as I do.
Knowing that the API for a project I designed isn't an absolute dumpster fire is what helps me sleep at night.
The earliest iteration of instascrape
The earliest version of instascrape started in Summer 2019. I had just learned Instagram's Developer API was about to dry up and thus the idea for instascrape was born.
I did some research, pip installed selenium, figured out how the heck to configure chromedriver, wrote some code, and the result was *drumroll*....
terrible.
It was clunky, the code was awful, and it was sloooooowwwwwwwwwww. Despite this, I was able to get a taste for what was possible and thus the earliest version of what would become instascrape was born.
Fast forward a year later
Before I wrote a single line of code for the version of instascrape that exists today, I knew I had to design it to be as easy to use as possible.
There was only one problem, I had never designed an API before and didn't know where to begin.
So I started hacking away: I wrote some code, I hated it, I did some research, I wrote some more code, I hated it some more, etc... this went on for a week or so until I had an epiphany and knew that I was setting my goals way too high and had to bring myself back down to Earth.
import this
importthis
I am a firm believer in The Zen of Python and realized that my API was starting to stray quite far from my beliefs on how Python code should feel.
I scrapped everything and started once again from scratch; this time reminding myself of The Zen of Python every step of the way. Anytime something in the API felt even a little nonobvious I immediately took a step back, reevaluated, and fixed it.
Because of this, the development process has been way slower than I'm used to but it has absolutely paid off. While there is always going to be room for improvement, I am comfortable with knowing that others are using (and judging) this API.
That's all folks!
It has been a painstaking process and there is nothing worse than looking at the code you've written and realizing it has to be entirely rewritten from scratch... but that's life!
Being able to recognize something isn't working and accepting constructive criticism in stride is a skill that I've been learning the hard way the last two months and it has been beyond fulfilling and enlightening in all aspects of my life.
It's taught me to be deliberate with design choices, move slowly and carefully, and to view my code from the shoes of the average user that has never seen my code before.
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
instascrape: powerful Instagram data scraping toolkit
Note: This module is no longer actively maintained.
DISCLAIMER:
Instagram has gotten increasingly strict with scraping and using this library can result in getting flagged for botting AND POSSIBLE DISABLING OF YOUR INSTAGRAM ACCOUNT. This is a research project and I am not responsible for how you use it. Independently, the library is designed to be responsible and respectful and it is up to you to decide what you do with it. I don't claim any responsibility if your Instagram account is affected by how you use this library.
What is it?
instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.