User Agent string difference in Puppeteer headless and headful

sonyarianto

Sony AK

Posted on November 29, 2019

Today I will talk about how the User Agent string differs when we run Puppeteer in headless and headful mode.

For people not familiar with Puppeteer, it is a Node library that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. You can go to https://pptr.dev/ for more details.

Puppeteer in headless mode means you control the Chrome or Chromium browser without displaying the browser UI. In contrast, Puppeteer in headful mode displays the browser UI, which is useful for debugging.

As mentioned at https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent, the User Agent string is a characteristic string that lets network protocol peers identify the application type, operating system, software vendor or software version of the requesting software user agent.

Web browsers send the User-Agent request header when we browse web pages on the internet. Here is a sample of my User Agent string.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
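
To make the structure of that string more concrete, here is a minimal sketch that pulls the Chrome product token and its version out of a User Agent string. The helper name parseChromeToken and the regular expression are my own assumptions for illustration, not part of Puppeteer or any standard API.

```javascript
// Hypothetical helper: extracts the Chrome (or HeadlessChrome) product token
// and its version from a User Agent string. Returns null when no such token exists.
function parseChromeToken(userAgent) {
  const match = /(HeadlessChrome|Chrome)\/([\d.]+)/.exec(userAgent);
  return match ? { product: match[1], version: match[2] } : null;
}

// The sample User Agent string shown above.
const sampleUserAgent =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36';

console.log(parseChromeToken(sampleUserAgent));
// { product: 'Chrome', version: '78.0.3904.108' }
```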

Preparation

Install Puppeteer with this command.

npm i puppeteer

The code

OK, now let's write some code to show the User Agent string when running Puppeteer in headless mode.

File puppeteer_headless.js

const puppeteer = require('puppeteer');

(async () => {
        const browser = await puppeteer.launch();

        console.log(await browser.userAgent());

        await browser.close();
})();

Run it.

node puppeteer_headless.js

On my machine it displays the following.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/79.0.3945.0 Safari/537.36

Notice the substring HeadlessChrome in there.

OK, now let's write some code to show the User Agent string when running Puppeteer in headful mode.

File puppeteer_headful.js

const puppeteer = require('puppeteer');

(async () => {
        const browser = await puppeteer.launch({ headless: false });

        console.log(await browser.userAgent());

        await browser.close();
})();

Run with

node puppeteer_headful.js

On my machine it displays the following.

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36

Now we can see that this User Agent string looks like a normal web browser's User Agent string.

Why is this interesting? Suppose you want to scrape a website using Puppeteer in headless mode and the target website protects itself by inspecting the User Agent string (blocking HeadlessChrome). In that case your scraping activity might be blocked.
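
To illustrate the kind of check a website might run, here is a minimal sketch of a User Agent test. The function name looksLikeHeadlessChrome is hypothetical, and real sites may use very different or more elaborate detection; this only shows the simple substring idea.

```javascript
// Hypothetical helper: returns true when the User Agent string contains
// the 'HeadlessChrome' token seen in the headless output above.
function looksLikeHeadlessChrome(userAgent) {
  return typeof userAgent === 'string' && userAgent.includes('HeadlessChrome');
}

// The two User Agent strings printed by the scripts above.
const headlessUserAgent =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/79.0.3945.0 Safari/537.36';
const headfulUserAgent =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36';

console.log(looksLikeHeadlessChrome(headlessUserAgent)); // true
console.log(looksLikeHeadlessChrome(headfulUserAgent));  // false
```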

How to set User Agent on headless Chrome

Still, we can set the User Agent string in Puppeteer headless mode, and it will override the default headless Chrome User Agent string.

Here is the code sample.

File puppeteer_set_user_agent.js

const puppeteer = require('puppeteer');

(async () => {
        // prepare for headless chrome
        const browser = await puppeteer.launch();
        const page = await browser.newPage();

        // set user agent (override the default headless User Agent)
        await page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36');

        // go to Google home page
        await page.goto('https://google.com');

        // get the User Agent as seen from inside the page
        const userAgent = await page.evaluate(() => navigator.userAgent );

        // if everything is correct, userAgent contains no 'HeadlessChrome' substring
        console.log(userAgent);

        await browser.close();
})();

It displays the User Agent that we set before browsing to the Google web page.
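
One thing to notice is that the sample above hard-codes Chrome/78.0.3904.108 while the bundled browser reported 79.0.3945.0. As a possible alternative, here is a minimal sketch that derives the string from the browser's own default User Agent, so the version stays in sync. The helper name toHeadfulUserAgent is my own; it assumes the only difference between the two modes is the HeadlessChrome token, as seen in the outputs above.

```javascript
// Hypothetical helper: turns a headless Chrome User Agent string into its
// headful-looking equivalent by renaming the product token.
// Strings without 'HeadlessChrome/' are returned unchanged.
function toHeadfulUserAgent(userAgent) {
  return userAgent.replace('HeadlessChrome/', 'Chrome/');
}

// The default headless User Agent string printed earlier in this post.
const defaultHeadlessUserAgent =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/79.0.3945.0 Safari/537.36';

console.log(toHeadfulUserAgent(defaultHeadlessUserAgent));

// Inside the async function of a Puppeteer script it could be used like this:
//   await page.setUserAgent(toHeadfulUserAgent(await browser.userAgent()));
```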

Thank you and I hope you enjoy it.
