Using exec to reduce PHP memory use

As mentioned in a previous post, I’ve recently started to use a library called dompdf to convert HTML to PDF using PHP. One of the major problems I came across was the amount of memory used by this library, which was over 16MB for some files. Initially I generated each file using a call to a function with arguments which described the type of PDF file I wanted to create. The function broadly worked as follows:

  1. Select information from database.
  2. Assign information to Smarty template.
  3. Parse Smarty template to produce a string of HTML.
  4. Create a dompdf object with the HTML string as its input.
  5. Generate the PDF and save the file to disk.
  6. Exit the function.

I erroneously assumed that PHP would free the memory used by the dompdf object when the function returned (i.e. when the object fell out of scope). Unfortunately it did not: the script used more and more memory each time I called the function, until it eventually exceeded the memory limit set for individual PHP scripts and execution was halted. As the PDF files differed in size, I couldn’t predict when this would happen and allow for it.

To get round the memory use problem, I tried using the unset function to free up the memory used by the dompdf object. However, unset merely removes the reference to an object; it does not force the garbage collector to free up the memory immediately. As a result, the memory limit was still being exceeded at an unpredictable point.
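To watch this kind of growth happen, a minimal sketch like the one below logs memory use after each call with memory_get_usage. Note that generatePdf and measureCalls are hypothetical names, and generatePdf is only a stand-in for the real dompdf code: it deliberately keeps data in a static array to imitate memory that is still held after the function returns.

```php
<?php
// Hypothetical stand-in for the PDF routine: it keeps a chunk of data
// in a static array to imitate memory that is not released on return.
function generatePdf(int $id): void
{
    static $retained = [];
    $retained[] = str_repeat('x', 1024 * 1024); // roughly 1MB kept per call
}

// Call the function repeatedly and record memory use after each call.
function measureCalls(int $calls): array
{
    $readings = [];
    for ($i = 1; $i <= $calls; $i++) {
        generatePdf($i);
        $readings[] = memory_get_usage();
    }
    return $readings;
}

foreach (measureCalls(5) as $n => $bytes) {
    printf("after call %d: %.1f MB\n", $n + 1, $bytes / 1048576);
}
```

If the readings climb steadily from call to call, the memory is being retained between calls, and the script will hit its limit sooner or later.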

Finally, I came across a somewhat forceful but successful method of getting around the problem. Instead of putting the PDF generation code in a function, I moved it into a separate PHP file and then executed this file within my main script using the exec function like so:

exec('/usr/bin/php /path/to/GeneratePDF.php');
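Because exec gives no indication of failure unless you ask for it, it may be worth capturing the output and exit code. The following is a sketch only: runScript is a hypothetical helper, the paths are the placeholders used above, and passing arguments to the script is an assumption about how you might tell it which file to generate.

```php
<?php
// Run a PHP script in a separate process and report whether it succeeded.
// runScript is a hypothetical helper; adjust the paths to your own setup.
function runScript(string $phpBinary, string $script, array $args = []): array
{
    // Escape every part of the command, even if nothing is user-supplied.
    $parts = [escapeshellarg($phpBinary), escapeshellarg($script)];
    foreach ($args as $arg) {
        $parts[] = escapeshellarg((string) $arg);
    }

    $output = [];
    $exitCode = 0;
    exec(implode(' ', $parts), $output, $exitCode);

    return ['ok' => $exitCode === 0, 'output' => $output, 'exitCode' => $exitCode];
}

$result = runScript('/usr/bin/php', '/path/to/GeneratePDF.php');
if (!$result['ok']) {
    error_log('PDF generation failed with exit code ' . $result['exitCode']);
}
```

The memory used by the child process is returned to the operating system when it exits, regardless of what the script inside it did or did not free.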

With function calls, each call used 16MB and never freed it, so 10 calls used 160MB. With exec, PHP freed all the memory used by the separate script when it finished executing, so there was never more than 16MB in use at any one time. I benchmarked both methods (function calls and exec) by trying to create a number of PDF files, and the results were quite impressive: using function calls took over 60 seconds and several of the files failed, whereas using exec took 30-40 seconds and every file was generated successfully.
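The memory arithmetic and a simple way of timing each approach can be sketched as follows. Both timeIt and peakMemoryMb are hypothetical helpers written for illustration, and the figures are the rough ones quoted above rather than new measurements.

```php
<?php
// Time a callable and return the elapsed wall-clock seconds.
function timeIt(callable $task): float
{
    $start = microtime(true);
    $task();
    return microtime(true) - $start;
}

// Worst-case memory when nothing is freed between calls, compared with
// running one separate process at a time.
function peakMemoryMb(int $calls, int $mbPerCall, bool $freedBetweenCalls): int
{
    return $freedBetweenCalls ? $mbPerCall : $calls * $mbPerCall;
}

printf("function calls: %d MB\n", peakMemoryMb(10, 16, false)); // 160 MB
printf("exec:           %d MB\n", peakMemoryMb(10, 16, true));  // 16 MB
```

To compare the two methods, wrap each one in a closure and pass it to timeIt, for example timeIt(function () { /* generate all the files */ }).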

Normally I would not recommend using functions which execute external programs, due to the security implications involved. However, in this case no user-supplied data is passed to exec, and the huge improvement in speed and memory use makes it a no-brainer for this particular example.
