Comparing Python PDF Generation Options

At some point in your programming career, you’ll be asked to generate a PDF. Maybe receipts are legally required to be in PDF format, or you need to send something to a printer, or, most commonly, your users simply expect PDF reports.

So you search Google for “how to generate a PDF with Python.” You’ll find three options:

PyFPDF, or similar build-a-pdf-line-by-line open-source libraries
Python-pdfkit, or similar HTML-to-PDF browser-based open-source libraries
Commercial engines, like DocRaptor or PDFreactor

What are the differences? Advantages and disadvantages? How do you choose? Let me explain.

Build a PDF Line-By-Line

Here’s the default code example for PyFPDF:

from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.set_font('Arial', 'B', 16)
pdf.cell(40, 10, 'Hello World!')
pdf.output('tuto1.pdf', 'F')

For many use cases, this is ideal. You’ll get the exact PDF you want, every time. But as you can see, it requires building the PDF object by object, line by line.
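To give a feel for what “line by line” means on a slightly larger document, here’s a minimal sketch of a receipt. The names receipt_rows and write_receipt, the column widths, and the sample items are all made up for illustration; only the FPDF calls come from the library, and this assumes PyFPDF’s cell(w, h, txt, border, ln, align) signature.

```python
from decimal import Decimal


def receipt_rows(items):
    """Return (label, amount) text pairs for each item, plus a total row."""
    rows = [(name, f"{Decimal(price):.2f}") for name, price in items]
    total = sum(Decimal(price) for _, price in items)
    rows.append(("Total", f"{total:.2f}"))
    return rows


def write_receipt(items, path):
    """Draw every row as two cells, one line at a time."""
    # Imported here so receipt_rows can be used without the library installed.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", "B", 16)
    pdf.cell(0, 10, "Receipt", ln=1)  # full-width title, then move to the next line
    pdf.set_font("Arial", "", 12)
    for label, amount in receipt_rows(items):
        pdf.cell(120, 8, label)  # left column; the cursor stays on this line
        pdf.cell(40, 8, amount, ln=1, align="R")  # right column, then next line
    pdf.output(path, "F")
```

Calling write_receipt([("Widget", "19.99"), ("Shipping", "5.01")], "receipt.pdf") should produce a one-page receipt. Notice that every position and width is your responsibility, which is both the strength and the cost of this approach.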

It’s common to already have a document written in HTML, or perhaps as a developer you’re most familiar with building frontend code with HTML and CSS. In this case, you probably want an HTML to PDF solution.

Browser-Based Engines

Alternatively, there are many HTML to PDF libraries based on various browsers. Headless Chrome is extremely popular these days (as it should be), but there are many older tools built on PhantomJS and wkhtmltopdf (don’t use these; they rely on ancient WebKit engines).

These libraries are generally good for simple PDF documents, but they tend to break down on complex documents with more than one page or with pixel-perfect design and layout requirements. This is because browsers are built around the concept of a single, continuously scrolling web page. They don’t understand “pages” at all.

pdfkit, which wraps wkhtmltopdf, has a really simple interface:

import pdfkit

pdfkit.from_url('http://google.com', 'out.pdf')
pdfkit.from_file('test.html', 'out.pdf')
pdfkit.from_string('Hello!', 'out.pdf')
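In practice you’ll usually build the HTML from your own data before handing it over. Here’s a rough sketch using only the standard library for the templating step. The template, render_report, and save_report are hypothetical names for this example; the pdfkit call uses from_string with an options dictionary, and it needs the wkhtmltopdf binary installed to actually run.

```python
from html import escape
from string import Template

# A deliberately tiny template; a real report would likely live in its own file.
REPORT_TEMPLATE = Template(
    "<html><body><h1>$title</h1><ul>$items</ul></body></html>"
)


def render_report(title, lines):
    """Fill the template, escaping any user-supplied text."""
    items = "".join(f"<li>{escape(line)}</li>" for line in lines)
    return REPORT_TEMPLATE.substitute(title=escape(title), items=items)


def save_report(title, lines, path):
    """Render the HTML and convert it to a PDF file at `path`."""
    # Imported here so render_report works without pdfkit installed.
    import pdfkit

    options = {"page-size": "A4", "encoding": "UTF-8"}
    pdfkit.from_string(render_report(title, lines), path, options=options)
```

Keeping the rendering step separate from the conversion step also makes it easier to swap the PDF engine later without touching your templates.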

pychromepdf lets you use the more modern Headless Chrome generator, but it’s more complicated to set up and maintain.
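If you’d rather see what’s happening underneath, you can also drive Headless Chrome yourself. The sketch below is not pychromepdf’s API; it simply shells out to Chrome’s own --headless and --print-to-pdf flags. The binary name google-chrome is an assumption and will differ by platform, and chrome_pdf_command and print_to_pdf are names invented for this example.

```python
import subprocess


def chrome_pdf_command(source, output_path, chrome_binary="google-chrome"):
    """Build the argument list that asks Headless Chrome to print `source` to a PDF."""
    return [
        chrome_binary,  # assumed binary name; adjust for your platform
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output_path}",
        source,  # a URL or a file:// path
    ]


def print_to_pdf(source, output_path, chrome_binary="google-chrome"):
    """Run Chrome and raise if it exits with an error or hangs."""
    command = chrome_pdf_command(source, output_path, chrome_binary)
    subprocess.run(command, check=True, timeout=60)
```

This is where the “set up and maintain” cost shows up: you need a Chrome install on every server that generates PDFs, and you need to keep it updated.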

Commercial Engines

Finally, as you can probably guess, the commercial PDF generators offer the most advanced functionality. That functionality comes at a price. A PDFreactor license starts at $2,500 and a PrinceXML license at $3,800. DocRaptor’s online HTML to PDF API provides access to Prince’s engine starting at a more affordable $15/mo.

But these HTML to PDF libraries come with features like:

Dynamic table of contents
CSS-based headers and footers
Different page styles and backgrounds for different sections of the document
Flexbox support
Advanced page break handling
Watermarks
Accessible PDFs
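Several of these features are driven by CSS paged media rules rather than by Python code. As a rough illustration, here’s a small helper that builds a print stylesheet with a running header, a “Page X of Y” footer, and some page break hints. The function names and the specific margins are made up for this example, and how much of this CSS a given engine honors varies by engine and version, so check its documentation.

```python
def paged_css(header_text, page_size="A4"):
    """Build a print stylesheet with a running header and a page-number footer."""
    # Escape backslashes and double quotes so the text is a valid CSS string.
    safe_header = header_text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"@page {{\n"
        f"  size: {page_size};\n"
        f"  margin: 2cm;\n"
        f'  @top-center {{ content: "{safe_header}"; }}\n'
        f'  @bottom-right {{ content: "Page " counter(page) " of " counter(pages); }}\n'
        f"}}\n"
        f"@page :first {{ @top-center {{ content: none; }} }}\n"  # no header on page one
        f"h2 {{ break-before: page; }}\n"  # each section starts on a new page
        f"tr {{ break-inside: avoid; }}\n"  # try not to split table rows across pages
    )


def wrap_html(body_html, css):
    """Embed the stylesheet in a complete HTML document."""
    return f"<html><head><style>{css}</style></head><body>{body_html}</body></html>"
```

The resulting string from wrap_html(body, paged_css("Quarterly Report")) is what you would send to the PDF engine as the document content.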

DocRaptor’s Python library is as simple as:

import docraptor

doc_api = docraptor.DocApi()
doc_api.api_client.configuration.username = 'YOUR_API_KEY_HERE'

response = doc_api.create_doc({
  "test": True,
  "document_content": "<html><body>Hello World</body></html>",
  # "document_url": "http://docraptor.com/examples/invoice.html",
  "document_type": "pdf",
  # "javascript": True,
})
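The call above returns the document but doesn’t save it anywhere. Here’s a hedged sketch of the remaining steps: writing the response to disk and handling a failed request. build_doc_request and save_pdf are hypothetical helper names; the bytearray conversion and the docraptor.rest.ApiException class follow the client library’s published example, and may differ between client versions.

```python
def build_doc_request(html, test=True):
    """Assemble the same request dictionary shown above."""
    return {
        "test": test,  # test documents are watermarked
        "document_content": html,
        "document_type": "pdf",
    }


def save_pdf(api_key, html, path, test=True):
    """Create the PDF through the API and write the binary response to `path`."""
    # Imported here so build_doc_request works without the client installed.
    import docraptor

    doc_api = docraptor.DocApi()
    doc_api.api_client.configuration.username = api_key
    try:
        response = doc_api.create_doc(build_doc_request(html, test))
    except docraptor.rest.ApiException as error:
        # Assumed exception class; surface the HTTP status to the caller.
        raise RuntimeError(f"DocRaptor request failed: {error.status}") from error
    with open(path, "wb") as pdf_file:
        pdf_file.write(bytearray(response))
```

Because the rendering happens on DocRaptor’s servers, there’s no browser or binary to install, but each document does require a network round trip.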

So which is best?

That’s up to you! With a larger budget, you can support more complex PDFs and complete your project much faster. With a simpler document, maybe the open-source libraries will save you money (depending on how much work you have to do on the infrastructure side).

If the document hasn’t been created yet, PyFPDF’s pixel perfection may be the best route. The choice is yours.

Alien Coders

About The Author

Jassi
