DOM Traversal for Fun and Profit
Calvin Torra
Posted on June 22, 2021
During my time writing funny words in an IDE to make the computer do what I want, I dabbled in a little web scraping for cash.
I kept forgetting how to target certain parts of the page that I wanted to scrape and organise within my program.
So below, I'm putting together a few notes to share with my future self and you :)
Let's start with a little boilerplate HTML that we can work with.
<div class="grandparent" id="grandparent-id">
<!-- top level grandparent -->
<div class="parent"> <!-- first parent -->
<div class="child" id="child-one"></div> <!-- child 1 -->
<div class="child"></div> <!-- child 2 -->
</div>
<div class="parent"> <!-- second parent -->
<div class="child"></div> <!-- child 3 -->
<div class="child" id="child-four"></div> <!-- child 4 -->
</div>
</div>
Get Element by ID
An ID should be unique within a page, so the method is getElement (singular). It returns that one element, or null if nothing on the page has that ID.
const grandparent = document.getElementById("grandparent-id")
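When scraping, a missing ID usually means the page layout changed, so it can help to fail loudly instead of carrying a null around. Here's a rough sketch of that idea. The helper name `requireElementById` and its `root` parameter are my own inventions, not part of the DOM API; `root` is only there so the function can be tried outside a browser.

```javascript
// Hypothetical helper: look up an element by ID and throw if it isn't there.
// `root` is assumed to be a Document (or anything with getElementById).
function requireElementById(id, root = globalThis.document) {
  const element = root.getElementById(id);
  if (element === null) {
    throw new Error(`No element with id "${id}"`);
  }
  return element;
}

// In a browser, with the HTML above:
//   requireElementById("grandparent-id").className // "grandparent"
//   requireElementById("nope")                     // throws
```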
Get Elements by Class Name
Calling getElements (plural) returns an HTMLCollection of elements from the DOM (both the parents in the HTML above). An HTMLCollection looks like an array but isn't one, so calling Array methods such as map or filter on it throws an error.
We can get around this by converting the returned collection into a real array with Array.from, then we're able to use array methods on that content.
const parents = Array.from(document.getElementsByClassName("parent"))
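To make that conversion a habit, it can be wrapped in a tiny function. This is just a sketch: `elementsByClass` and its `root` parameter are names I made up. One extra thing worth knowing is that the HTMLCollection is live (it updates as the page changes), while the array is a snapshot.

```javascript
// Hypothetical helper: get elements by class name as a real Array.
// `root` is assumed to be a Document or Element.
function elementsByClass(className, root = globalThis.document) {
  // Array.from copies the live, array-like HTMLCollection into a static Array.
  return Array.from(root.getElementsByClassName(className));
}

// In a browser, with the HTML above:
//   elementsByClass("parent").map((el) => el.children.length) // [2, 2]
```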
Query Selector
This gives us a single element (the first one that appears in the DOM tree) by targeting the DOM using CSS selectors.
const grandparentById = document.querySelector("#grandparent-id") // id
const grandparentByClass = document.querySelector(".grandparent") // class
Query Selector All
Similar to Get Elements by Class Name, this gives all the elements that match our query. However, this returns a NodeList, which has forEach built in. For other Array methods like map or filter, it still needs converting with Array.from.
const grandparentsById = document.querySelectorAll("#grandparent-id") // id
const grandparentsByClass = document.querySelectorAll(".grandparent") // class
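A NodeList gives us forEach for free, but map and filter aren't on it. Here's a small sketch of pulling one property out of every match, using the optional mapping function that Array.from accepts. The name `idsMatching` and the `root` parameter are assumptions of mine for illustration.

```javascript
// Hypothetical helper: collect the id of every element matching a selector.
// `root` is assumed to be a Document or Element.
function idsMatching(selector, root = globalThis.document) {
  const nodeList = root.querySelectorAll(selector);
  // The second argument to Array.from maps each item while copying.
  return Array.from(nodeList, (element) => element.id);
}

// In a browser, with the HTML above (elements without an id report ""):
//   idsMatching(".child") // ["child-one", "", "", "child-four"]
```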
Selecting Child Elements
First, we want to target the top grandparent node. From there we can grab all of the children underneath.
Even though querySelector hands us a single element, calling children on it gives back an HTMLCollection, not an Array (and not even a NodeList)!! Annoying.
So we'll need to create an Array from the returned children.
const grandparent = document.querySelector(".grandparent")
const parents = Array.from(grandparent.children)
const parentOne = parents[0] // etc
We can also drill down into the parent's children (another HTMLCollection):
const children = parentOne.children
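Putting the two steps together, here's a minimal sketch that goes from an element straight to its grandchildren. Both function names are made up for this post. Note that children only contains elements, so the HTML comments in the boilerplate are ignored.

```javascript
// Hypothetical helper: an element's children as a real Array.
function childElements(element) {
  return Array.from(element.children);
}

// Hypothetical helper: every child of every child, flattened into one Array.
function grandchildrenOf(element) {
  return childElements(element).flatMap(childElements);
}

// In a browser, with the HTML above:
//   grandchildrenOf(document.querySelector(".grandparent")).length // 4
```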
Selecting Parent Element
Starting from a child, we can step one level up the DOM to its direct parent with parentElement.
const childFour = document.querySelector("#child-four")
const parent = childFour.parentElement
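If one step up isn't enough, parentElement can be followed repeatedly until it runs out (it is null once we're past the top element). A rough sketch, with `ancestorsOf` being a name I picked for illustration:

```javascript
// Hypothetical helper: list every ancestor element, nearest first.
function ancestorsOf(element) {
  const ancestors = [];
  let current = element.parentElement;
  while (current) {
    ancestors.push(current);
    current = current.parentElement;
  }
  return ancestors;
}

// In a browser, with the HTML above:
//   const childFour = document.querySelector("#child-four")
//   ancestorsOf(childFour).slice(0, 2).map((el) => el.className)
//   // ["parent", "grandparent"]
```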
Selecting Closest Grandparent Element
This works very similarly to querySelector, but instead of going down the DOM it moves upwards.
It takes a CSS selector and walks up the DOM, starting with the element itself, to find the closest element that matches it. If nothing matches, we get null.
const childFour = document.querySelector("#child-four")
const grandparent = childFour.closest(".grandparent")
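To see what closest is doing for us, here's a hand-rolled sketch of the same walk using matches and parentElement. This is only for illustration (the name `closestSketch` is mine), and in real code the built-in closest is the one to use.

```javascript
// Illustration only: roughly what element.closest(selector) does.
function closestSketch(element, selector) {
  let current = element;
  while (current) {
    // The starting element itself is checked first, then each ancestor.
    if (current.matches(selector)) {
      return current;
    }
    current = current.parentElement;
  }
  return null; // nothing on the way up matched
}

// In a browser, with the HTML above:
//   const childFour = document.querySelector("#child-four")
//   closestSketch(childFour, ".grandparent") === childFour.closest(".grandparent") // true
```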
Skipping DOWN half the DOM
We can call querySelector on an element we've already captured, not just on document. The search then only looks inside that element, so we can go straight to the child level and skip the parents.
const grandparent = document.querySelector(".grandparent")
const childOne = grandparent.querySelector(".child")
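Scoping a query to an element is handy for grabbing "the first X inside each Y". Here's a sketch under the assumption that the page uses the `.parent` and `.child` classes from the boilerplate. The function name and the `root` parameter are hypothetical.

```javascript
// Hypothetical helper: for every .parent under root, grab its first .child.
// `root` is assumed to be a Document or Element.
function firstChildOfEachParent(root = globalThis.document) {
  return Array.from(root.querySelectorAll(".parent"), (parent) =>
    // This querySelector only searches inside the current parent.
    parent.querySelector(".child")
  );
}

// In a browser, with the HTML above (child 3 has no id, so it reports ""):
//   firstChildOfEachParent().map((el) => el.id) // ["child-one", ""]
```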
Selecting Siblings: Previous + Next
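nextElementSibling gets the next element along from where you currently are, and previousElementSibling gets the one before it. Both return null when there's no sibling in that direction. Instead of going up and down, it's like we're going sideways through the DOM.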
const childOne = document.querySelector("#child-one")
const childTwo = childOne.nextElementSibling
const childFour = document.querySelector("#child-four")
const childThree = childFour.previousElementSibling
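Since the sibling properties return null at the end of the row, they can be followed in a loop to collect everything that comes after an element. A minimal sketch (`nextSiblings` is a name I made up):

```javascript
// Hypothetical helper: every element sibling after the given element, in order.
function nextSiblings(element) {
  const siblings = [];
  let current = element.nextElementSibling;
  while (current) {
    siblings.push(current);
    current = current.nextElementSibling;
  }
  return siblings;
}

// In a browser, with the HTML above:
//   nextSiblings(document.querySelector("#child-one")).length  // 1 (child 2)
//   nextSiblings(document.querySelector("#child-four")).length // 0
```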