How to make a web scraper with JavaScript
The Vik
Posted on August 4, 2021
In this post I will show you how to make a web scraper with axios and cheerio.
const axios = require('axios')
const cheerio = require('cheerio')
// Replace the url with your url
const url = 'https://www.premierleague.com/stats/top/players/goals?se=-1&cl=-1&iso=-1&po=-1?se=-1'
axios(url)
  .then(response => {
    const html = response.data
    const $ = cheerio.load(html)
    const statsTable = $('.statsTableContainer > tr')
    const statsData = []
    statsTable.each(function() {
      const rank = $(this).find('.rank > strong').text()
      const playerName = $(this).find('.playerName > strong').text()
      const nationality = $(this).find('.playerCountry').text()
      const mainStat = $(this).find('.mainStat').text()
      statsData.push({
        rank,
        playerName,
        nationality,
        mainStat
      })
    })
    // Will print the collected data
    console.log(statsData)
  })
  // In case of any error it will print the error
  .catch(console.error)
Whew, that's a lot of code. Let's go through it one piece at a time.
npm install axios cheerio --save
This installs all of the required dependencies.
const axios = require('axios')
const cheerio = require('cheerio')
This imports the dependencies we just installed.
const url = 'https://www.premierleague.com/stats/top/players/goals?se=-1&cl=-1&iso=-1&po=-1?se=-1'
This is the URL we will scrape the data from. You can change it if you want,
but then you will also have to change the selectors further down to match the new page.
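If you do change the URL, it can help to build it from its parts so that each query parameter is easy to see. The sketch below is only an illustration and the scraper does not need it. It uses Node's built-in URL class, and the helper name buildStatsUrl is made up for this example. It also shows that the trailing ?se=-1 in the URL from this post is read as part of the po value, because everything after the first ? counts as one query string. Whether the site minds that is not something this sketch checks.

```javascript
// Hypothetical helper (not part of axios or cheerio): builds a URL
// from a base address and an object of query parameters.
function buildStatsUrl(base, params) {
  const url = new URL(base)
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value)
  }
  return url.toString()
}

// Same parameters as the URL in this post, written out one by one
const cleanUrl = buildStatsUrl(
  'https://www.premierleague.com/stats/top/players/goals',
  { se: '-1', cl: '-1', iso: '-1', po: '-1' }
)
console.log(cleanUrl)

// Parsing the URL from this post shows how the second "?" is read
const original = new URL('https://www.premierleague.com/stats/top/players/goals?se=-1&cl=-1&iso=-1&po=-1?se=-1')
const poValue = original.searchParams.get('po')
console.log(poValue)
```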
axios(url)
  .then(response => {
    const html = response.data
    const $ = cheerio.load(html)
    const statsTable = $('.statsTableContainer > tr')
    const statsData = []
  })
On the first line we call axios with the url. axios returns a promise, so we chain .then onto it and pass a function that receives the response.
Then we make a const named html and assign response.data to it, which is the body of the response.
If you now add
console.log(html)
it will print the whole HTML code of the page.
Next we make a const named $ by loading the html with cheerio.
Then we make a const named statsTable and use $ (our cheerio instance) to select the table rows (tr) directly inside the element with the class statsTableContainer. This is where we are going to scrape the data from.
Finally we make an empty array named statsData, in which we will store the scraped data.
statsTable.each(function() {
  // If you replaced the url then you have to replace these too
  const rank = $(this).find('.rank > strong').text()
  const playerName = $(this).find('.playerName > strong').text()
  const nationality = $(this).find('.playerCountry').text()
  const mainStat = $(this).find('.mainStat').text()
  statsData.push({
    rank,
    playerName,
    nationality,
    mainStat
  })
})
// This code goes inside the .then(response => {}) callback that we made above
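Here we go through each row, find the specific element inside it that holds each piece of data, and convert it to text using .text(). We use function() rather than an arrow function because cheerio sets this to the current row.
Then we push those texts as one object to statsData, which we made above.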
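.text() returns the text as it appears in the page, so the values can come back with extra spaces or line breaks around them, and the numbers are still strings. Here is a small sketch of tidying one scraped row before pushing it. The helper names cleanText and cleanRow and the sample values are made up for illustration, and the real page's text may look different.

```javascript
// Hypothetical helper: collapses runs of whitespace and trims the ends.
function cleanText(text) {
  return text.replace(/\s+/g, ' ').trim()
}

// Hypothetical helper: tidies one scraped row. parseInt reads the leading
// digits, so '1.' becomes 1; it gives NaN if the text does not start
// with a number.
function cleanRow(row) {
  return {
    rank: parseInt(cleanText(row.rank), 10),
    playerName: cleanText(row.playerName),
    nationality: cleanText(row.nationality),
    mainStat: parseInt(cleanText(row.mainStat), 10)
  }
}

// Made-up sample row, shaped like the objects pushed to statsData
const sample = {
  rank: ' 1. ',
  playerName: 'Example\n   Player',
  nationality: '\n  Exampleland \n',
  mainStat: ' 42 '
}
const cleaned = cleanRow(sample)
console.log(cleaned)
```

Inside the .each loop you could then write statsData.push(cleanRow({ rank, playerName, nationality, mainStat })) instead of pushing the raw texts.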
Now we just have to use
console.log(statsData) // inside .then(response => {})
and it should show all of the scraped data.
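One thing to know about console.log in Node: when an array has more than 100 items, it prints the first 100 and then a "... more items" line. If the table you scrape is long, turning the data into a JSON string first prints all of it. A minimal sketch, with a made-up formatStats helper and made-up rows standing in for statsData:

```javascript
// Hypothetical helper: turns the scraped rows into an indented JSON string.
function formatStats(rows) {
  return JSON.stringify(rows, null, 2)
}

// Made-up rows standing in for the real statsData array
const exampleStats = [
  { rank: '1.', playerName: 'Example Player', nationality: 'Exampleland', mainStat: '42' },
  { rank: '2.', playerName: 'Another Player', nationality: 'Samplestan', mainStat: '37' }
]

const output = formatStats(exampleStats)
console.log(output)
```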
And last, after everything is closed with }), we add
.catch(console.error)
which will print the error if we get one. And we are done.
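It is worth knowing that a .catch at the end of the chain handles two kinds of failure: the request itself failing, and any error thrown inside the .then callback. The sketch below shows this with plain promises so it runs without a network. fakeRequest is a made-up stand-in for axios(url) and is not a real axios function, and the catch returns a string only so the result is easy to see. The scraper itself just passes console.error.

```javascript
// Made-up stand-in for axios(url): resolves with an object shaped like
// an axios response ({ data: ... }) or rejects, depending on the flag.
function fakeRequest(shouldFail) {
  return shouldFail
    ? Promise.reject(new Error('request failed'))
    : Promise.resolve({ data: null })
}

// Uses the same then/catch shape as the scraper and reports what happened.
function run(shouldFail) {
  return fakeRequest(shouldFail)
    .then(response => {
      // data is null here, so this line throws a TypeError,
      // the way a bug inside the callback would
      return response.data.length
    })
    .catch(error => 'caught: ' + error.name)
}

// First call: the request fails. Second call: the callback throws.
const results = Promise.all([run(true), run(false)])
results.then(console.log)
```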
This is my first time explaining code, so I don't know how I did.
Thanks!