As a web developer, you may have needed to generate a PDF file of a web page to share with your clients, use in presentations, or add as a new feature in your web app. Whatever your reason, Puppeteer, Google’s Node API for headless Chrome and Chromium, makes the task quite simple.
In this tutorial, we will see how to convert web pages into PDF with Puppeteer and Node.js. Let’s start with a quick introduction to what Puppeteer is.
What is Puppeteer, and why is it awesome?
In Google’s own words, Puppeteer is “A Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol.”
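What is a headless browser?
A headless browser is a web browser without a graphical user interface, driven by code instead of a mouse and keyboard. Puppeteer runs Chrome or Chromium in headless mode by default, but you can also configure it to run a full (non-headless) browser.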
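As a small illustration of switching between the two modes, here is a minimal sketch. The helper name launchBrowser and its visible flag are made up for this example, and the puppeteer module is passed in as an argument purely so the snippet does not load the library until it is called. The headless launch option itself is part of Puppeteer.

```javascript
// Sketch: launch the browser with or without a visible window.
// `launchBrowser` and `visible` are hypothetical names for this example.
async function launchBrowser(puppeteer, { visible = false } = {}) {
  // headless: true runs without a window; headless: false opens a full browser window.
  return puppeteer.launch({ headless: !visible });
}
```

Assuming Puppeteer is installed, calling launchBrowser(require('puppeteer'), { visible: true }) should open a regular browser window that you can watch while the script runs.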
What can you do with Puppeteer?
Puppeteer’s powerful browser capabilities make it a perfect candidate for web app testing and web scraping.
To name just a few use cases where Puppeteer provides the right functionality for web developers:
- Generate PDFs and screenshots of web pages
- Automate form submission
- Scrape web pages
- Perform automated UI tests while keeping the test environment up to date
- Generate pre-rendered content for Single Page Applications (SPAs)
Set up the project environment
You can use Puppeteer on the backend and frontend to generate PDFs. In this tutorial, we’re using a Node backend for the task.
Initialize NPM and set up the usual Express server to get started with the tutorial.
Make sure to install the Puppeteer NPM package by running npm install puppeteer before you start.
Convert web pages to PDF
Now we get to the exciting part of the tutorial. With Puppeteer, we only need a few lines of code to convert web pages into PDF.
First, create a browser instance using Puppeteer’s launch function.
Then, we create a new page instance and visit the given page URL using Puppeteer.
We have set the waitUntil option to networkidle0. With the networkidle0 option, Puppeteer considers navigation finished once there have been no open network connections for at least 500 ms. It’s a way to determine whether the site has finished loading. It’s not exact, and Puppeteer offers other options, but it is one of the most reliable for most cases.
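The page-opening step described above can be sketched roughly as follows. The helper name openPage is hypothetical, and browser is assumed to be the object that puppeteer.launch() resolved to. The calls newPage and goto, and the waitUntil value, are Puppeteer’s own.

```javascript
// Sketch: open a URL in a new tab and wait until the network has gone quiet.
// `openPage` is a hypothetical helper name used only in this example.
async function openPage(browser, url) {
  const page = await browser.newPage();
  // 'networkidle0' resolves once there have been no network connections for at least 500 ms.
  await page.goto(url, { waitUntil: 'networkidle0' });
  return page;
}
```

Pages that keep a connection open forever (long polling, for example) may never reach networkidle0, in which case a different waitUntil value such as load may be a better fit.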
Finally, we create the PDF from the crawled page content and save it to our device.
The print-to-PDF function is fairly involved and allows for a lot of customization, which is fantastic. Here are some of the options we used:
- printBackground: When this option is set to true, Puppeteer prints any background colors or images you have used on the web page to the PDF.
- path: Path specifies where to save the generated PDF file. If you omit it, nothing is written to disk and the PDF bytes are returned in memory instead.
- format: You can set the PDF format to one of the given options: Letter, A4, A3, A2, etc.
- margin: You can specify a margin for the generated PDF with this option.
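To make the options above concrete, here is a minimal sketch that builds an options object for page.pdf(). The helper name buildPdfOptions, the A4 format, and the 20px margins are illustrative choices for this example, not necessarily the values used in the original listing.

```javascript
// Sketch: collect the page.pdf() options discussed above in one place.
// The format and margin values are illustrative assumptions.
function buildPdfOptions(outputPath) {
  const options = {
    format: 'A4',          // paper size
    printBackground: true, // include background colors and images
    margin: { top: '20px', right: '20px', bottom: '20px', left: '20px' },
  };
  // Only set `path` when a file should be written; without it,
  // page.pdf() just resolves with the PDF bytes.
  if (outputPath) {
    options.path = outputPath;
  }
  return options;
}
```

With a page already loaded, await page.pdf(buildPdfOptions('webpage.pdf')) should write the file to disk, while await page.pdf(buildPdfOptions()) should keep the result in memory.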
When the PDF creation is over, close the browser connection with browser.close().
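Putting the steps together, the whole flow might look like the sketch below. The function name urlToPdf is hypothetical, and the puppeteer module is passed in as an argument (an illustrative choice that also makes the function easy to try with a stub). The try/finally block is an addition of this sketch: it makes sure the browser is closed even if loading or printing fails.

```javascript
// Sketch: launch, navigate, print to PDF, and always close the browser.
// `urlToPdf` is a hypothetical name; `pdfOptions` is passed straight to page.pdf().
async function urlToPdf(puppeteer, url, pdfOptions = {}) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    // Resolves with the PDF bytes (a Buffer or a Uint8Array, depending on the Puppeteer version).
    return await page.pdf(pdfOptions);
  } finally {
    // Runs on success and on failure, so no browser process is left behind.
    await browser.close();
  }
}
```

Assuming Puppeteer is installed, a call such as await urlToPdf(require('puppeteer'), 'https://example.com', { path: 'webpage.pdf', format: 'A4' }) should save the page as webpage.pdf in the current directory.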
Build an API to generate and respond with PDFs from URLs
With the knowledge we have gathered so far, we can now create a new endpoint that receives a URL as a query string and streams the generated PDF back to the client.
Here is the code:
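A minimal sketch of such an endpoint is shown below, written as a factory that returns an Express-style request handler. The names createPdfHandler and pdfHandler, the /pdf route in the wiring comment, the port, and the error messages are assumptions made for this example. The target query parameter follows the article, and res.status, res.set, and res.send are standard Express response methods.

```javascript
// Sketch: an Express-style handler that renders ?target=<url> to a PDF in memory.
// `createPdfHandler` is a hypothetical name; puppeteer is injected as an argument.
function createPdfHandler(puppeteer) {
  return async function pdfHandler(req, res) {
    const target = req.query.target;
    if (!target) {
      res.status(400).send('Missing "target" query parameter');
      return;
    }

    const browser = await puppeteer.launch();
    try {
      const page = await browser.newPage();
      await page.goto(target, { waitUntil: 'networkidle0' });
      // No `path` option: the PDF stays in memory and is never written to disk.
      const pdf = await page.pdf({ format: 'A4', printBackground: true });
      res.set('Content-Type', 'application/pdf');
      res.send(Buffer.from(pdf));
    } catch (err) {
      res.status(500).send('Failed to generate PDF');
    } finally {
      await browser.close();
    }
  };
}

// Wiring (assumes Express and Puppeteer are installed; '/pdf' and 3000 are assumed values):
// const app = require('express')();
// app.get('/pdf', createPdfHandler(require('puppeteer')));
// app.listen(3000);
```

Because this handler will load any URL it is given, a real deployment would probably want to validate or restrict the target parameter before passing it to the browser.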
If you start the server and visit the endpoint with a target query param containing the URL we want to convert, the server will serve the generated PDF directly without ever storing it on disk.
The result is a PDF rendering of the target page.
That’s it! You have completed the conversion of a web page to PDF. Wasn’t that easy?
As mentioned, Puppeteer offers many customization options, so make sure you play around with them to get different results.
Next, we can change the viewport size to capture websites under different resolutions.
Capture websites with different viewports
In the previously created PDF, we didn’t specify the viewport size for the web page Puppeteer is visiting and instead used the default viewport size, 800×600px.
However, we can precisely set the page’s viewport size before crawling the page.
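Setting the viewport before navigating can be sketched like this. The helper name openPageWithViewport and the example dimensions in the usage note are assumptions for illustration. page.setViewport with a width and height is Puppeteer’s own call.

```javascript
// Sketch: set the viewport size first, then load the page.
// `openPageWithViewport` is a hypothetical helper name used only in this example.
async function openPageWithViewport(browser, url, width, height) {
  const page = await browser.newPage();
  // Set the size before navigating so the page lays itself out at this resolution.
  await page.setViewport({ width, height });
  await page.goto(url, { waitUntil: 'networkidle0' });
  return page;
}
```

For example, await openPageWithViewport(browser, 'https://example.com', 1680, 1050) should load the page at a desktop-sized viewport, after which page.pdf() can be called as before.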
In today’s tutorial, we used Puppeteer, a Node API for headless Chrome, to generate a PDF of a given web page. Now that you are familiar with the basics of Puppeteer, you can use this knowledge in the future to create PDFs or even for other purposes like web scraping and UI testing.
This article was originally published on Live Code Stream by Juan Cruz Martinez (twitter: @bajcmartinez), founder and publisher of Live Code Stream, entrepreneur, developer, author, speaker, and doer of things.