Converting HTML web pages into PDF

whoakarsh

Akarsh Jaiswal

Posted on May 26, 2024

In this article, I will guide you through the straightforward process of converting HTML web pages into PDF documents using Puppeteer. This Node.js library provides a user-friendly API to control Chrome or Chromium over the DevTools Protocol.

Prerequisites

Before starting, make sure Node.js and npm are installed on your machine. Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine, and npm is its package manager. If you don’t have them, download and install Node.js from the official website (https://nodejs.org/en/download); npm is included in the Node.js distribution.

You can verify the installation by running the following commands in your terminal:

node --version
npm --version
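
The same check can be done from inside Node.js. The snippet below is a minimal sketch: the minimum major version of 18 is an assumption made for illustration (check the Puppeteer documentation for the release you install), and the helper names `nodeMajor` and `isSupportedNode` are hypothetical.

```javascript
// Minimum major version is an assumption for illustration; check the
// Puppeteer documentation for the requirement of the release you install.
const ASSUMED_MIN_NODE_MAJOR = 18;

// Return the major version number from a string such as "v20.11.1" or "20.11.1".
function nodeMajor(version) {
  return Number.parseInt(version.replace(/^v/, ''), 10);
}

// True when the given Node.js version meets the assumed minimum.
function isSupportedNode(version, minMajor = ASSUMED_MIN_NODE_MAJOR) {
  return nodeMajor(version) >= minMajor;
}

const verdict = isSupportedNode(process.version) ? 'OK' : 'too old';
console.log(`Node ${process.version}: ${verdict} (assumed minimum: ${ASSUMED_MIN_NODE_MAJOR})`);
```

Save it to a file and run it with node to see whether the installed runtime meets the assumed minimum.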

Step 1: Initialize a new Node.js project

First, create a new directory for your project and navigate into it:

mkdir html-to-pdf-demo
cd html-to-pdf-demo

Then, initialize a new Node.js project by running:

npm init -y

This will create a new package.json file in your project directory.

Step 2: Install Puppeteer

Next, install Puppeteer by running:

npm install puppeteer

This also downloads a compatible browser build that Puppeteer controls and runs headless by default (Chrome for Testing in recent Puppeteer releases, Chromium in older ones).
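
To confirm that the install worked, one option is to launch the bundled browser once and print its version string. This is a minimal sketch, not part of the original walkthrough: `checkBrowser` is a hypothetical helper name, and its `launcher` parameter is an assumption added so the function can be exercised with a stand-in object. By default it loads the real puppeteer module.

```javascript
// Launch the browser that Puppeteer installed and return its version string.
// `launcher` is a hypothetical parameter for this sketch; when omitted, the
// real puppeteer module is loaded.
async function checkBrowser(launcher = require('puppeteer')) {
  const browser = await launcher.launch();
  try {
    // browser.version() resolves to the browser's name and version
    return await browser.version();
  } finally {
    // Close the browser even if the version lookup fails
    await browser.close();
  }
}
```

Calling checkBrowser().then(console.log) in a script inside the project directory should print the browser version if the download succeeded.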

Step 3: Write the script

Create a new index.js file in your project directory and open it in your text editor. Then, paste the following code:

const fs = require('fs');
const puppeteer = require('puppeteer');

async function printPDF() {
  // Launch a headless browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Load the page and wait until the network is idle
  await page.goto('http://marvel2950.github.io', { waitUntil: 'networkidle0' });

  // Render the page as an A4 PDF
  const pdf = await page.pdf({ format: 'A4' });

  await browser.close();

  return pdf;
}

// Write the generated PDF data to disk
printPDF().then(pdf => {
  fs.writeFileSync('output.pdf', pdf);
});

This script launches a new browser instance, opens a new page, navigates to http://marvel2950.github.io, and generates an A4 PDF, which is then written to output.pdf. The { waitUntil: 'networkidle0' } option makes page.goto treat navigation as finished only once there have been no network connections for at least 500 ms.
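
If navigation fails or the page never becomes idle, page.goto rejects and the browser process can be left running. The variant below is a sketch of one way to guard against that with try/finally. The name `printPDFSafely` and the `timeoutMs` and `launcher` options are assumptions introduced for illustration. The `timeout` and `waitUntil` options of page.goto are part of Puppeteer's API.

```javascript
// Variant of printPDF that always closes the browser, even if navigation fails.
// `timeoutMs` and `launcher` are hypothetical options for this sketch; when
// `launcher` is omitted, the real puppeteer module is loaded.
async function printPDFSafely(url, { timeoutMs = 30000, launcher = require('puppeteer') } = {}) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    // goto rejects if the page is not idle within timeoutMs milliseconds
    await page.goto(url, { waitUntil: 'networkidle0', timeout: timeoutMs });
    return await page.pdf({ format: 'A4' });
  } finally {
    // Runs on success and on failure, so no browser process is left behind
    await browser.close();
  }
}
```

With Puppeteer installed, printPDFSafely('http://marvel2950.github.io') resolves to the PDF data, which can be written to disk with fs.writeFileSync as in the script. If the call rejects, the error reaches the caller after the browser has been closed.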

Step 4: Run the script

Finally, run the script from your project directory:

node index.js

And that’s it! The script writes a PDF named output.pdf to your project directory, containing the rendered content of the web page.
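
The output can be tuned through the options passed to page.pdf. The sketch below is one possible configuration: it uses the `path` option so Puppeteer writes the file itself, and it turns on `printBackground`, which is off by default. The helper names `buildPdfOptions` and `savePagePdf`, the margin values, and the `launcher` parameter are assumptions made for illustration.

```javascript
// Build the options object passed to page.pdf().
// The margin values are illustrative assumptions.
function buildPdfOptions(outPath) {
  return {
    path: outPath,          // Puppeteer writes the PDF to this file
    format: 'A4',
    printBackground: true,  // include CSS background colors and images
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
  };
}

// Render `url` to a PDF file at `outPath`.
// `launcher` is a hypothetical parameter; by default the real puppeteer
// module is loaded.
async function savePagePdf(url, outPath, launcher = require('puppeteer')) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.pdf(buildPdfOptions(outPath));
  } finally {
    await browser.close();
  }
}
```

For example, savePagePdf('http://marvel2950.github.io', 'output.pdf') would produce the same file name as the script, without a separate fs.writeFileSync call.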
