Ubuntu – Handle Doc/Docx Templates on a headless server to produce PDFs preferably without using OpenOffice.org

command line, conversion, file format, pdf, server

On a production web server I have to produce letters based on a template I received in MS Word binary format. I use PHP, and for the search-and-replace task I found PHPWord, which can handle Docx files, so I converted the template to OpenXML on my local workstation. Unfortunately, the output is also Docx.

The goal is to produce a single PDF for the user to download so she can print out a bunch of letters at once very easily.

Now I need to find a way to either:

  • Search and replace text in a PDF file
  • Convert Docx to PDF without loss of formatting
  • Edit the original Doc template without loss of formatting and without using COM
  • Convert Docx to Doc without loss of formatting, so I could convert it using wvPDF (this seems nearly impossible: the template looks good in Word, but technically how the formatting is done is a big pile of…)

Besides OpenOffice.org, I also don't want to use web services. I'm aware of PHPLiveDocx, but I don't want to depend on an external service, for performance, availability, and security reasons. Buying a piece of software isn't an option in this case either (I can't influence that).

Since this is a public-facing web server, I don't want to pull in OpenOffice.org, not even headless: it brings around 160MB of compressed(!) binaries, and best practice is not to load binaries you don't really need on a public-facing server. OpenOffice.org remains a last resort, but I want to make sure I have ruled out every other option first.

The host OS is CentOS 5.5.

Where can I go from here?


Best Answer

  • To my knowledge, there is no application that can do this without some dependency on LibreOffice.

    However, you don't need to install the whole office suite if you only perform command-line conversions.

    You can check whether the tool unoconv meets your needs. It depends on python and python-uno. The latter also pulls in libreoffice-core as a dependency, but not the whole office suite.
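To make the unoconv suggestion concrete, here is a minimal sketch of driving it from a script. It is written in Python only because unoconv already requires Python; the same command could just as well be launched from PHP. The function names `build_unoconv_command` and `convert_to_pdf` are hypothetical helpers made up for this illustration. The sketch assumes that `unoconv` is on the PATH and that the output directory already exists.

```python
import shutil
import subprocess
from pathlib import Path


def build_unoconv_command(docx_path, output_dir):
    """Build the argument list for one unoconv Docx-to-PDF conversion.

    -f selects the output format and -o the output location
    (here assumed to be an existing directory).
    """
    return ["unoconv", "-f", "pdf", "-o", str(output_dir), str(docx_path)]


def convert_to_pdf(docx_path, output_dir):
    """Run unoconv and return the path where the PDF is expected to appear.

    Hypothetical helper: the returned path assumes unoconv keeps the
    input file's base name and only swaps the extension.
    """
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv is not installed or not on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_unoconv_command(docx_path, output_dir), check=True)
    return Path(output_dir) / (Path(docx_path).stem + ".pdf")
```

Calling `convert_to_pdf("letter.docx", "out")` would then run `unoconv -f pdf -o out letter.docx`. Whether the result preserves the template's formatting still has to be checked against the actual document.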