Browser automation with Puppeteer

zsevic

Željko Šević

Posted on August 26, 2023

Puppeteer is a Node.js library that controls Chrome or Chromium to automate browser tasks, running headless by default. Here are some of its features:

  • Turn off headless mode
  const browser = await puppeteer.launch({
    headless: false
    // ...
  });
  • Resize the viewport to the window size
  const browser = await puppeteer.launch({
    // ...
    defaultViewport: null
  });
  • Emulate the screen media type via the emulateMediaType method, so the page renders as the user sees it rather than with print styles
  await page.emulateMediaType('screen');
  • Save the page as a PDF file with a specified path, format, scale factor, and page range
  await page.pdf({
    path: 'path.pdf',
    format: 'A3',
    scale: 1,
    pageRanges: '1-2',
    printBackground: true
  });
  • Reuse an existing user profile's credentials to skip logging in to some websites. The user data directory is the parent of the Profile Path value shown on the chrome://version page.
  const browser = await puppeteer.launch({
    userDataDir: 'C:\\Users\\<USERNAME>\\AppData\\Local\\Google\\Chrome\\User Data',
    args: [],
  });
  • Use an installed Chrome instance instead of the bundled Chromium by setting the executable path (shown as Executable Path on the chrome://version page). Close the Chrome browser before running the script.
  const browser = await puppeteer.launch({
    executablePath: puppeteer.executablePath('chrome'),
    // ...
  });
  • Get a value by evaluating a function in the browser page
  const shouldPaginate = await page.evaluate((param1, param2) => {
    // ...
  }, param1, param2);
  • Get the HTML content of a specific element
  const html = await page.evaluate(
    () => document.querySelector('.field--text').outerHTML,
  );
  • Wait for an element matching a specific selector to appear. You can also provide a timeout in milliseconds.
  await page.waitForSelector('.success', { timeout: 5000 });
  • Manipulate a specific element and click on one of its descendants
  await page.$eval('#header', async (headerElement) => {
    // ...
    headerElement
      .querySelectorAll('svg')
      .item(13)
      .parentNode.click();
  });
  • Extend the time allowed for $eval callbacks to run by changing the protocol timeout
  const browser = await puppeteer.launch({
    // ...
    protocolTimeout: 0,
  });
  • Manipulate multiple elements
  await page.$$eval('.some-class', async (elements) => {
    // ...
  });
  • Wait for navigation (e.g., after submitting a form) to finish
  await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 0 });
  • Trigger a hover event on an element
  await page.$eval('#header', async (headerElement) => {
    const hoverEvent = new MouseEvent('mouseover', {
      view: window,
      bubbles: true,
      cancelable: true
    });

    headerElement.dispatchEvent(hoverEvent);
  });
  • Expose a function in the browser and use it in $eval and $$eval callbacks (e.g., simulate typing using the window.type function)
  await page.exposeFunction('type', async (selector, text, options) => {
    await page.type(selector, text, options);
  });

  await page.$$eval('.some-class', async (elements) => {
    // ...
    window.type(selector, text, { delay: 0 });
  });
  • Press the Enter key after typing the input field value
  await page.type(selector, `${text}${String.fromCharCode(13)}`, options);
  • Replace the input field value by triple-clicking to select the existing text before typing the new one
  await page.click(selector, { clickCount: 3 });
  await page.type(selector, text, options);
  • Expose a variable in the browser by passing it as the third argument to the $eval and $$eval methods, then use it in their callbacks
  await page.$eval(
    '#element',
    async (element, customVariable) => {
      // ...
    },
    customVariable
  );
  • Mock the response for a specific request
  await page.setRequestInterception(true);
  page.on('request', async function (request) {
    const url = request.url();
    if (url !== REDIRECTION_URL) {
      return request.continue();
    }

    await request.respond({
      contentType: 'text/html',
      status: 304,
      body: '<body></body>',
    });
  });
  • Intercept page redirections (via interceptor) and open them in new tabs rather than following them in the same tab
  await page.setRequestInterception(true);
  page.on('request', async function (request) {
    const url = request.url();
    if (url !== REDIRECTION_URL) {
      return request.continue();
    }

    await request.respond({
      contentType: 'text/html',
      status: 304,
      body: '<body></body>',
    });
    const newPage = await browser.newPage();
    await newPage.goto(url, { waitUntil: 'domcontentloaded', timeout: 0 });
    // ...
    await newPage.close();
  });
  • Intercept page responses
  page.on('response', async (response) => {
    if (response.url() === RESPONSE_URL) {
      if (response.status() === 200) {
        // ...
      }
      // ...
    }
  });
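
The two request interception examples above resolve each request directly inside the listener. A minimal sketch of the same mocking idea as a reusable helper might look like the following. The helper name mockResponse, its parameters, and the 200 status are assumptions made for illustration and do not come from the snippets above. It checks isInterceptResolutionHandled() first, so it does not try to resolve a request that another listener has already handled.

```javascript
// Hypothetical helper: answers requests for `mockedUrl` with a canned HTML body
// and lets every other request through.
// Call it after `await page.setRequestInterception(true)`.
function mockResponse(page, mockedUrl, body = '<body></body>') {
  page.on('request', (request) => {
    // Another 'request' listener may already have resolved this request.
    if (request.isInterceptResolutionHandled()) {
      return undefined;
    }

    // Requests for any other URL proceed untouched.
    if (request.url() !== mockedUrl) {
      return request.continue();
    }

    // Status 200 is an assumption made for this sketch.
    return request.respond({
      status: 200,
      contentType: 'text/html',
      body,
    });
  });
}
```

With interception enabled, a call such as mockResponse(page, REDIRECTION_URL) would register the listener once, before navigating to the page that triggers the request.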

Boilerplate

Here is the link to the boilerplate I use for development.
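
Scripts built from the snippets in this post tend to share one skeleton: launch a browser, open a page, do the work, and close the browser even if something throws. The sketch below is not the linked boilerplate. It is a hypothetical outline that assumes the puppeteer package is installed; the function name run, the task callback, and the example URL are illustrative only.

```javascript
// Hypothetical skeleton; `puppeteer` is passed in as an argument so the
// function can be exercised with a stub in tests.
async function run(puppeteer, url, task) {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
  });

  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    // `task` receives the page and performs the actual automation.
    return await task(page);
  } finally {
    // Always release the browser process, even when `task` throws.
    await browser.close();
  }
}

// Example usage (assumes `npm install puppeteer`):
// const puppeteer = require('puppeteer');
// run(puppeteer, 'https://example.com', (page) => page.title()).then(console.log);
```

Wrapping the work in try/finally keeps a failed run from leaving a browser process behind, which matters most when headless mode is turned off during development.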
