Web Scraping Script in Ruby

sulmanweb

Sulman Baig

Posted on October 24, 2019

I was working on a project and had to scrape a web page, so I looked into the options and found Nokogiri.
Nokogiri is an HTML, XML, SAX, and Reader parser. Among its many features is the ability to search documents via XPath or CSS3 selectors.
To fetch the document I used HTTParty.
HTTParty is an HTTP client gem whose own description reads: "Makes http fun! Also, makes consuming restful web services dead easy."
For this example, I will be scraping https://rubygems.org/search?query=%s, where %s stands for the name of the gem being searched.
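
Since the search term ends up inside a URL, it is worth escaping it before filling in the %s placeholder. The snippet below is a minimal sketch of that step using only Ruby's standard library; the constant SEARCH_URL_TEMPLATE and the helper search_url are names made up for this illustration and are not part of the script.

```ruby
require 'uri'

# Template from the article; %s is replaced by the escaped search term.
SEARCH_URL_TEMPLATE = 'https://rubygems.org/search?query=%s'

# Hypothetical helper: builds the search URL for a given gem name.
def search_url(query)
  # encode_www_form_component escapes spaces, '&', and other reserved characters
  format(SEARCH_URL_TEMPLATE, URI.encode_www_form_component(query))
end

puts search_url('rails')
puts search_url('ruby on rails & more')
```

A plain name such as rails passes through unchanged, while spaces and ampersands are escaped so they cannot break the query string.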

Script

The final script is given below:

require 'httparty'
require 'nokogiri'

class RubygemsScrapper
  attr_accessor :parse_page

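  # Fetches the rubygems.org search page for the query string q and parses it
  def initialize(q)
    # Passing the term through the query option lets HTTParty escape it
    response = HTTParty.get('https://rubygems.org/search', query: { query: q })
    @parse_page = Nokogiri::HTML(response.body)
  end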

  # Returns the first result's version, or -1 if it is not found
  def get_latest_version
    parse_page.css('.gems__gem').css('.gems__gem__version').children[0].text
  rescue StandardError
    -1 # not found
  end

  # Returns the first result's link on rubygems.org, or -1 if it is not found
  def get_link
    "https://rubygems.org" + parse_page.css('.gems__gem').attribute('href').value
  rescue StandardError
    -1 # not found
  end
end

# Calling the scraper
scrapper = RubygemsScrapper.new('yiya')
p scrapper.get_latest_version
p scrapper.get_link

The class takes the name of the gem to search for and returns the latest version of the first result, along with a link to it.
