Back in 2010 I was looking for a way to generate PDF files dynamically on a Web site, mainly to create files which could be printed or emailed and would work on any computer (printer-friendly pages with CSS didn’t quite achieve the effect I wanted). The requirements included:
- A sensible layout system which would automatically put elements in the correct place — I don’t want to have to manually specify where the text should begin on the page, align each paragraph, trigger a new page etc.
- The ability to transform HTML into PDF, so that I can use Smarty to produce the PDF files as well as the Web pages.
- Support for as much CSS as possible.
- Active mailing list or other method of getting help from the community.
- Software under active development, i.e. not one of these Sourceforge projects which were started in 2005 and haven’t been updated since.
- Uses a licence which enables the library to be integrated with closed-source software (e.g. BSD or LGPL).
There are several HTML to PDF tools out there, but most of them are not up to the task. HTMLDOC looks promising at first, until you realise that it only supports part of HTML 4 and doesn’t support CSS at all — at least not according to the FAQ. ReportLab also raised my hopes, despite being written in Python (which I’m not familiar with), but a markup language is only available with the commercial version. HTML 2 PDF didn’t even get off the drawing board by the looks of things, and hasn’t been updated since 2005. Finally I came across dompdf, which seems to fit the bill with the following features:
- Supports conversion from HTML to PDF.
- Support for a large chunk of CSS (imperfect but improving).
- Active mailing list, which includes the developers of the software.
- Licensed under the LGPL.
Using dompdf at its most basic is a doddle: you simply pass the HTML to the load_html method and choose whether to stream the result to the browser or write it to a file. The bare minimum code is shown below:
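    // Load dompdf; the include path depends on your installation
    require_once 'dompdf_config.inc.php';

    // Render the Smarty template to an HTML string
    $html = $smarty->fetch('template.tpl');
    $filename = '/path/to/file.pdf';

    // Convert the HTML to PDF and save the result to disk
    $dompdf = new DOMPDF();
    $dompdf->load_html($html);
    $dompdf->set_paper('a4', 'portrait');
    $dompdf->render();
    file_put_contents($filename, $dompdf->output());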
Here $smarty is an existing Smarty instance. The code creates a PDF which should look more or less the same as the Web page generated by $smarty->display('template.tpl'). There are other options which allow you greater control over the final PDF, such as altering the page size, loading extra fonts etc., but the above code will produce a working PDF which you can email or print. The resulting files are also extremely small: the invoices which I have been working on come in at around 2-3KB each.
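If you would rather stream the PDF to the browser than save it, dompdf’s stream method takes the place of the file_put_contents call. The following is only a minimal sketch: stream_pdf is a hypothetical helper name of my own, and it assumes dompdf is already loaded and that you pass in a DOMPDF instance.

```php
<?php
// Sketch: send the rendered PDF straight to the browser instead of saving it.
// Assumes the dompdf library is already loaded and $dompdf is a DOMPDF instance.
// stream_pdf() is a hypothetical helper name, not part of dompdf itself.
function stream_pdf($dompdf, $html, $downloadName)
{
    $dompdf->load_html($html);
    $dompdf->set_paper('a4', 'portrait');
    $dompdf->render();

    // 'Attachment' => 0 asks the browser to display the PDF inline;
    // 1 (the default) prompts the user to download it instead.
    $dompdf->stream($downloadName, array('Attachment' => 0));
}
```

Calling stream_pdf(new DOMPDF(), $smarty->fetch('template.tpl'), 'invoice.pdf') should then send the document with the appropriate headers, provided nothing else has been output first.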
Things to watch out for include:
- File permissions: as you’re saving files to disk, the Web server user (www-data if you’re running Apache on Debian) will need write access to the relevant directory.
- Databases: do not store the files in a database. By all means store the metadata, such as filename, last modified time etc., in a database (I do this for ease of management), but don’t store the file itself. I’ll write another post at a later date detailing why storing files in a database is a bad idea.
- Unicode: unfortunately dompdf doesn’t have full Unicode support yet, so if you want to create documents in Unicode you will have to wait a while. I believe it’s possible to make dompdf work with Unicode by using the commercial version of PDFlib, but I haven’t tried that myself.
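The first two points can be combined into one small helper. This is a rough sketch rather than the code I actually run: save_pdf and the array keys are hypothetical names, and how you insert the returned metadata into your database is left to you.

```php
<?php
// Sketch: write PDF bytes to disk and return the metadata worth storing in a database.
// save_pdf() and the array keys below are hypothetical names chosen for illustration.
function save_pdf($pdfBytes, $directory, $basename)
{
    // The Web server user needs write access to the target directory.
    if (!is_dir($directory) || !is_writable($directory)) {
        throw new RuntimeException("Cannot write to directory: $directory");
    }

    $path = rtrim($directory, '/') . '/' . $basename;
    if (file_put_contents($path, $pdfBytes) === false) {
        throw new RuntimeException("Failed to write file: $path");
    }

    // Make sure filesize() and filemtime() see the file we just wrote.
    clearstatcache();

    // Only this metadata goes in the database; the file itself stays on disk.
    return array(
        'filename'      => $basename,
        'path'          => $path,
        'size_bytes'    => filesize($path),
        'last_modified' => filemtime($path),
    );
}
```

With the earlier code you would call it as save_pdf($dompdf->output(), '/path/to', 'file.pdf') and catch the RuntimeException if the directory permissions are wrong.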
dompdf isn’t perfect, of course: it takes a long time to generate PDF files containing tables, some CSS rules don’t work, and ordered lists are currently unsupported. However, it is under active development, and there have been performance improvements in recent versions. At the moment I have to generate PDFs as part of a cron job instead of on the fly, and employ a bit of a hack to get round memory usage problems, but I expect those problems to diminish over time. Even with these minor blemishes, dompdf is still the best library I’ve found for converting HTML to PDF in PHP.