Željko Šević
Posted on August 26, 2023
Puppeteer is a Node.js library for automating browser tasks by controlling a (by default headless) Chrome or Chromium browser. Here is a list of some of its features:
- Turn off headless mode
const browser = await puppeteer.launch({
  headless: false,
  // ...
});
- Resize the viewport to the window size
const browser = await puppeteer.launch({
  // ...
  defaultViewport: null
});
- Emulate how the page is shown to the user on screen via the emulateMediaType method
await page.emulateMediaType('screen');
- Save the page as a PDF file with a specified path, format, scale factor, and page range
await page.pdf({
  path: 'path.pdf',
  format: 'A3',
  scale: 1,
  pageRanges: '1-2',
  printBackground: true
});
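The launch, media type, and PDF options above can be combined into one flow. Here is a minimal sketch, assuming a hypothetical helper named savePageAsPdf that receives the puppeteer module as its first argument, so it can also be tried out with a stub:

```javascript
// Hypothetical helper (not part of Puppeteer): renders a URL to a PDF file.
// `puppeteer` is the module itself, e.g. require('puppeteer').
async function savePageAsPdf(puppeteer, url, pdfPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until the network is idle so late-loading content ends up in the PDF
    await page.goto(url, { waitUntil: 'networkidle0' });
    // Render with screen styles instead of print styles
    await page.emulateMediaType('screen');
    await page.pdf({ path: pdfPath, format: 'A3', printBackground: true });
    return pdfPath;
  } finally {
    // Close the browser even if one of the steps above throws
    await browser.close();
  }
}
```

It can be called as savePageAsPdf(require('puppeteer'), 'https://example.com', 'path.pdf'); the URL here is only a placeholder.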
- Use a preexisting user's credentials to skip logging in to some websites. The user data directory is the parent of the Profile Path value from the chrome://version page.
const browser = await puppeteer.launch({
  userDataDir: 'C:\\Users\\<USERNAME>\\AppData\\Local\\Google\\Chrome\\User Data',
  args: [],
});
- Use a Chrome instance instead of Chromium by utilizing the Executable Path from the chrome://version URL. Close the Chrome browser before running the script.
const browser = await puppeteer.launch({
  executablePath: puppeteer.executablePath('chrome'),
  // ...
});
- Get a value based on an evaluation in the browser page
const shouldPaginate = await page.evaluate((param1, param2) => {
  // ...
}, param1, param2);
- Get the HTML content of a specific element
const html = await page.evaluate(
  () => document.querySelector('.field--text').outerHTML,
);
- Wait for a specific selector to be loaded. You can also provide a timeout in milliseconds
await page.waitForSelector('.success', { timeout: 5000 });
- Manipulate a specific element and click on one of the elements inside it
await page.$eval('#header', async (headerElement) => {
  // ...
  headerElement
    .querySelectorAll('svg')
    .item(13)
    .parentNode.click();
});
- Extend the execution time of the $eval method by changing the protocol timeout
const browser = await puppeteer.launch({
  // ...
  protocolTimeout: 0,
});
- Manipulate multiple elements
await page.$$eval('.some-class', async (elements) => {
  // ...
});
- Wait for navigation (e.g., a form submission) to finish
await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 0 });
- Trigger a hover event on an element
await page.$eval('#header', async (headerElement) => {
  const hoverEvent = new MouseEvent('mouseover', {
    view: window,
    bubbles: true,
    cancelable: true
  });
  headerElement.dispatchEvent(hoverEvent);
});
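As an alternative to dispatching a synthetic event, Puppeteer's page.hover method moves the emulated mouse over the element, which also triggers CSS :hover styles. A small sketch, assuming a hypothetical helper named hoverAndWaitFor and that the hover reveals another element worth waiting for:

```javascript
// Hypothetical helper: hover over one element, then wait for the element
// that the hover is expected to reveal (e.g. a dropdown menu).
async function hoverAndWaitFor(page, hoverSelector, revealedSelector, timeout = 5000) {
  // Scrolls the element into view and moves the mouse to its center
  await page.hover(hoverSelector);
  // Resolves once the revealed element is present and visible
  return page.waitForSelector(revealedSelector, { visible: true, timeout });
}
```

For example: await hoverAndWaitFor(page, '#header', '.dropdown'); where .dropdown is a placeholder selector.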
- Expose a function in the browser and use it in $eval and $$eval callbacks (e.g., simulate typing using the window.type function)
await page.exposeFunction('type', async (selector, text, options) => {
  await page.type(selector, text, options);
});
await page.$$eval('.some-class', async (elements) => {
  // ...
  // Await the exposed function so the callback does not finish before typing is done
  await window.type(selector, text, { delay: 0 });
});
- Press the Enter key after typing the input field value
await page.type(selector, `${text}${String.fromCharCode(13)}`, options);
- Remove the value from the input field before typing the new one
await page.click(selector, { clickCount: 3 });
await page.type(selector, text, options);
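The two steps above (clearing the field, then typing and pressing Enter) can be wrapped into one function. A minimal sketch, assuming a hypothetical helper named replaceInputValue; it presses Enter through page.keyboard.press instead of appending a carriage return to the text:

```javascript
// Hypothetical helper: replace the current value of an input and optionally submit it.
async function replaceInputValue(page, selector, text, { submit = false, delay = 0 } = {}) {
  // A triple click selects the existing value, so typing overwrites it
  await page.click(selector, { clickCount: 3 });
  await page.type(selector, text, { delay });
  if (submit) {
    // Same effect as typing a trailing carriage return
    await page.keyboard.press('Enter');
  }
}
```

For example: await replaceInputValue(page, '#search', 'puppeteer', { submit: true }); where #search is a placeholder selector.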
- Expose a variable in the browser by passing it as the third argument to the $eval and $$eval methods, then use it in their callbacks
await page.$eval(
  '#element',
  async (element, customVariable) => {
    // ...
  },
  customVariable
);
- Mock the response for a specific request
await page.setRequestInterception(true);
page.on('request', async function (request) {
  const url = request.url();
  if (url !== REDIRECTION_URL) {
    return request.continue();
  }
  await request.respond({
    contentType: 'text/html',
    status: 304,
    body: '<body></body>',
  });
});
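The same interception logic can be factored into a reusable handler. A sketch, assuming a hypothetical factory named createMockHandler; the default status of 200 and the empty body are assumptions for illustration, not values from the snippet above:

```javascript
// Hypothetical factory: returns a 'request' listener that mocks one URL
// and lets every other request continue untouched.
function createMockHandler(targetUrl, mock = {}) {
  return async function handleRequest(request) {
    if (request.url() !== targetUrl) {
      return request.continue();
    }
    // Caller-supplied fields override the assumed defaults
    return request.respond({
      status: 200,
      contentType: 'text/html',
      body: '',
      ...mock,
    });
  };
}
```

After calling await page.setRequestInterception(true), the handler can be registered with page.on('request', createMockHandler(REDIRECTION_URL, { body: '<body></body>' })).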
- Intercept page redirections (via interceptor) and open them in new tabs rather than following them in the same tab
await page.setRequestInterception(true);
page.on('request', async function (request) {
  const url = request.url();
  if (url !== REDIRECTION_URL) {
    return request.continue();
  }
  await request.respond({
    contentType: 'text/html',
    status: 304,
    body: '<body></body>',
  });
  const newPage = await browser.newPage();
  await newPage.goto(url, { waitUntil: 'domcontentloaded', timeout: 0 });
  // ...
  await newPage.close();
});
- Intercept a page response
page.on('response', async (response) => {
  if (response.url() === RESPONSE_URL) {
    if (response.status() === 200) {
      // ...
    }
    // ...
  }
});
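When only one matching response is needed, page.waitForResponse can replace the event listener. A minimal sketch, assuming a hypothetical helper named waitForOkResponse and assuming the response body is JSON:

```javascript
// Hypothetical helper: wait for the first 200 response from a given URL
// and return its parsed JSON body (the JSON body is an assumption).
async function waitForOkResponse(page, responseUrl, timeout = 30000) {
  const response = await page.waitForResponse(
    (res) => res.url() === responseUrl && res.status() === 200,
    { timeout }
  );
  return response.json();
}
```

Start the wait before triggering the request, for example: const [data] = await Promise.all([waitForOkResponse(page, RESPONSE_URL), page.click('#load')]); where #load is a placeholder selector.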
Boilerplate
Here is the link to the boilerplate I use for the development.