Building my first project: CLI Data Gem 💎


Jacqueline Lam

Posted on November 3, 2019


A CLI App for Indecisive Shoppers

As an indecisive yet frugal shopper, I struggle a lot in making my purchase decisions. I love to do extensive research to find the product with the best features for its cost.

Hence, I decided to create a gem that gives the user concise information about a type of product, plus the option to learn about the best products based on a chosen feature. Almost like a product rater of some sort.

I visited Outdoor Gear Lab's Best Women's Rain Jackets of 2019 page dozens of times this year before I finally decided on the perfect jacket for me. Once I decided to scrape this website, I started to imagine my CLI app interface according to features that a user might be interested in, and then reverse-engineered the data that I would need to scrape:

Planning a CLI for Best Rain Jackets:

  • Shows a list of the best women's rain jackets (name, price, overall rating)
  • Shows details of a chosen jacket
  • Shows the best products based on a chosen rating category

So, what are the properties of a rain jacket? It has a...

  • Name
  • Price
  • Description
  • Pros & Cons
  • URL to full product review
  • Overall rating
  • Other rating categories (water resistance, breathability, comfort, weight, durability, and packed size)

Once I set up the basic structure of my gem and created the executable files, I started working on my Scraper class.

Scraping data

First, I used Open-URI to fetch the website's HTML as a string. Then, I used Nokogiri::HTML to parse the HTML string into a document of nested nodes. I identified the jacket properties that I wanted to assign to each jacket instance, and I began using Nokogiri's #css method, which returns a NodeSet of matching elements, to retrieve selected data from the jacket table.

In the jacket table's HTML, each <tr> tag (table row) contains one property for all jackets, while each <td> (cell) contains that property's value for one jacket instance.

As I iterated through each <tr> tag (table row):

  1. In row #1, I instantiated a new jacket instance for each <td> (cell)
  2. If a later row contained a property that I needed for the jacket object, I scraped the values and assigned each one to the corresponding jacket object
  3. I saved all the jacket instances (with assigned attributes) in an all_jackets array
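The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the gem's actual code: the rows are plain arrays of cell text (as if already extracted with something like `row.css("td").map(&:text)`), and the `Jacket` struct, row labels, and sample values are hypothetical.

```ruby
# Hypothetical jacket object; the real gem uses its own Jacket class.
Jacket = Struct.new(:name, :price, :overall_rating)

# Each inner array stands in for one <tr>: a label, then one cell per jacket.
rows = [
  ["Name",           "Jacket A", "Jacket B"],
  ["Price",          "$100",     "$150"],
  ["Awards",         "Best Buy", ""], # a row we do not need
  ["Overall Rating", "80",       "75"]
]

# Map the row labels we care about to Jacket attributes.
WANTED = { "Price" => :price, "Overall Rating" => :overall_rating }.freeze

all_jackets = []

rows.each_with_index do |(label, *cells), row_index|
  if row_index.zero?
    # Row 1: create one jacket per cell.
    cells.each { |name| all_jackets << Jacket.new(name) }
  elsif (attribute = WANTED[label])
    # Later rows: the cell in column i belongs to jacket i.
    cells.each_with_index { |value, i| all_jackets[i][attribute] = value }
  end
end

all_jackets.each { |j| puts "#{j.name} - #{j.price} - #{j.overall_rating}" }
```

The key idea is that the table is "sideways": jackets are columns, so the column index is what ties a cell back to its jacket.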

Converting Scraped Data to Objects

At this stage, I had mapped out my domain and decoupled my namespaced classes: RainJackets::CLI, RainJackets::Jacket, and RainJackets::Scraper. The CLI object interacts with the Scraper class, which is responsible for creating Jacket instances.

When the user runs the gem, an instance of the CLI class is created. The CLI controller calls the Scraper class method initialize_jacket_objects, the Scraper parses the info for all jackets, and lastly it creates the Jacket instances and assigns the different attributes to them.
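Here is a rough sketch of that call chain. Only the class names and `initialize_jacket_objects` come from the project; the method bodies, the `Struct`-based Jacket, and the hard-coded sample data (standing in for the parsed page) are assumptions for illustration.

```ruby
module RainJackets
  # Minimal stand-in; the real Jacket class has many more attributes.
  Jacket = Struct.new(:name, :price)

  class Scraper
    # Hypothetical hard-coded data in place of the parsed web page.
    SAMPLE_DATA = [["Jacket A", "$100"], ["Jacket B", "$150"]].freeze

    # Class method named in the post; this body is an assumption.
    def self.initialize_jacket_objects
      SAMPLE_DATA.map { |name, price| Jacket.new(name, price) }
    end
  end

  class CLI
    attr_reader :jackets

    def call
      # The CLI asks the Scraper for jackets and never parses HTML itself.
      @jackets = Scraper.initialize_jacket_objects
      jackets.each.with_index(1) do |jacket, number|
        puts "#{number}. #{jacket.name} - #{jacket.price}"
      end
    end
  end
end

cli = RainJackets::CLI.new
cli.call
```

Keeping the scraping behind one class method means the CLI would not need to change if the website's HTML did.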

Object-oriented programming in Ruby was so powerful that it allowed me to encapsulate all the logic for my CLI into classes. It also let me model real-world objects as classes (e.g. a jacket with a name, price, description, etc.)

Instead of using hashes and arrays to give the jacket object information, I used the Jacket class to store a jacket's attributes and methods. The attr_accessor macro allowed me to read and write the attributes, and a class variable @@all allowed me to save all the instances of my Jacket class in an array.
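A minimal sketch of that pattern, assuming a handful of attribute names based on the property list earlier in this post (the real class may name them differently and has more of them):

```ruby
module RainJackets
  class Jacket
    # attr_accessor defines a reader and a writer for each attribute.
    attr_accessor :name, :price, :description, :overall_rating

    # Class variable shared by the class; holds every instance created.
    @@all = []

    def initialize(name = nil)
      @name = name
      @@all << self
    end

    # Class-level reader for the saved instances.
    def self.all
      @@all
    end
  end
end

jacket = RainJackets::Jacket.new("Jacket A")
jacket.price = "$100" # writer from attr_accessor
puts jacket.price     # reader from attr_accessor
puts RainJackets::Jacket.all.size
```

Because every new instance adds itself to @@all, the rest of the app can ask the class for all jackets without passing arrays around.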

CLI Application

With a master plan for my app's functionality, I couldn't wait to complete the rain jackets controller. The CLI class was responsible for getting the user's input, matching it to the available commands and displaying the results derived from the scraped data. I was a bit too stoked and made the mistake of writing everything in a #call method in my CLI class (Spaghetti code alert 🍝❗). This caused a lot of bugs as the code was hard to follow.


So, I took a deep breath and a step back. I broke down the #call method into individual methods based on each of its functions, i.e. #prompt_input, #get_input, #handle_input, #read_rating_input, #print_ratings, #print_list_all, #print_menu, etc. The last hurdles were to ensure that the #handle_input logic would prevent errors stemming from user input, and that the #print methods would successfully retrieve the desired scraped data, reformat it and display meaningful results to the user.
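The shape of that refactor might look something like the sketch below. The method names match the ones listed above, but the bodies, the command set, and the sample data are assumptions. The input and output streams are injected (also an assumption, not something the gem necessarily does) so the loop can run without a terminal.

```ruby
require "stringio"

class CLI
  # Hypothetical sample data; the real app reads scraped Jacket objects.
  JACKETS = ["Jacket A", "Jacket B"].freeze

  # Input and output are injectable so the loop can run without a terminal.
  def initialize(input: $stdin, output: $stdout)
    @input = input
    @output = output
  end

  def call
    print_menu
    loop do
      prompt_input
      command = get_input
      break if command.nil? || command == "exit"
      handle_input(command)
    end
  end

  private

  def print_menu
    @output.puts "Commands: all, 1-#{JACKETS.size}, exit"
  end

  def prompt_input
    @output.puts "Enter a command:"
  end

  # Returns nil at end of input, otherwise a normalized string.
  def get_input
    @input.gets&.strip&.downcase
  end

  def handle_input(command)
    if command == "all"
      print_list_all
    elsif command.match?(/\A\d+\z/) && command.to_i.between?(1, JACKETS.size)
      @output.puts "You chose #{JACKETS[command.to_i - 1]}"
    else
      # Anything unexpected gets a message instead of raising an error.
      @output.puts "Sorry, I did not understand '#{command}'."
    end
  end

  def print_list_all
    JACKETS.each.with_index(1) { |name, i| @output.puts "#{i}. #{name}" }
  end
end

output = StringIO.new
CLI.new(input: StringIO.new("all\n2\nbanana\nexit\n"), output: output).call
puts output.string
```

With each method doing one job, a bug in input validation can only live in #handle_input, which makes it much easier to track down.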

And...Wa~la! The app is now in business!


Recap of My Process 📝

  1. Chose a website with abundant and useful information
  2. Imagined my interface and planned my gem
  3. Started with my project structure: created my GitHub repo and built a gem with Bundler
  4. Created my bin file executable and set up my environment
  5. Understood the website's HTML structures and patterns, then developed algorithms to scrape data efficiently
  6. Converted the scraped data to objects
  7. Built the CLI interface
  8. Refactored my code and added comments
  9. Published the gem onto RubyGems.org

My Key Takeaways

  • Planning: Think about how my gem would work before starting to code and break down my goals into smaller tasks in order to avoid feeling overwhelmed
  • Testing my code with bin/console & binding.pry: A great way to replicate the project environment. Once I scraped the table from the rain jackets website, I used the console to test the CSS selectors that would retrieve the desired pieces of info from the HTML document, and I used binding.pry for debugging
  • Rubber duck debugging: Revisiting my code line by line and explaining what it does often leads me to the bug (often a silly syntax error like a missing end or closing parenthesis 🙄)
  • Program architecture and design patterns: This can be very challenging since it comes with experience. Although my code runs without errors, there is always a more organized, abstract and efficient alternative to mine. I found it helpful to watch walkthroughs and video demos (especially by Flatiron School's founder, Avi Flombaum) to learn how professionals would approach problems.
  • DRY-ing up my code: Writing long methods that do many things (a.k.a. spaghetti code) makes debugging really hard. A great solution is to remind myself to limit each method to one (or a few) responsibilities.
  • Refactoring: As much as I would love to write abstract code from the start, it is okay to write long, hard-coded code first, and then refactor it to become more dynamic and abstract. Some useful methods include #send, #include?, and Enumerable methods like #map and #each_with_index.
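To illustrate the refactoring point with a hedged example: instead of one hard-coded branch per rating category, a whitelist check with #include? plus a dynamic method call can cover all of them. The struct, category names, and scores below are invented for illustration, and the sketch uses #public_send, a stricter relative of #send that only calls public methods.

```ruby
# Hypothetical jacket with two of the rating categories.
RatedJacket = Struct.new(:name, :water_resistance, :breathability)

# Only these category names may be turned into method calls.
CATEGORIES = %w[water_resistance breathability].freeze

# Returns jackets sorted from highest to lowest score in the category,
# or an empty array when the category is not recognized.
def top_by(jackets, category)
  return [] unless CATEGORIES.include?(category)

  jackets.sort_by { |jacket| -jacket.public_send(category) }
end

# Builds numbered display lines for the ranking.
def format_ranking(jackets, category)
  top_by(jackets, category).each_with_index.map do |jacket, index|
    "#{index + 1}. #{jacket.name} (#{jacket.public_send(category)})"
  end
end

jackets = [
  RatedJacket.new("Jacket A", 9, 6),
  RatedJacket.new("Jacket B", 7, 8),
  RatedJacket.new("Jacket C", 8, 7)
]

puts format_ranking(jackets, "breathability")
```

Checking the category against a fixed list before the dynamic call matters because the name ultimately comes from user input.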



What's next?

Now that I have built the basis of my CLI app, I am excited to add additional features, such as:

  • Scraping Outdoor Gear Lab's "Best Men's Rain Jackets of 2019" webpage
    • Providing the option to review men's or women's rain jackets in the CLI
  • Providing a more customized user experience:
    • Allowing the user to compare two jackets by selecting the properties and rating categories that they are interested in, then displaying the results side by side.

Feel free to explore my project on GitHub or on RubyGems.org and share your feedback with me! Thanks for reading!
