Linux – HTML/PDF to DOC(X) in Linux command line

bash, documents, html, linux, pdf

I need to convert PDF or HTML+CSS into DOC or DOCX under Linux. The conversion can be done from the command line or with a scripting language.

Any ideas?

Best Answer

I've just stumbled on this question and after a bit more googling, found pandoc: http://johnmacfarlane.net/pandoc/README.html

A simple command will create a DOCX or PDF (or RTF, etc.) file from HTML input, like so:

pandoc -o output.docx input.html
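If there are several HTML files, the command above can be wrapped in a small loop. This is only a sketch, assuming pandoc is on the PATH; the function name convert_html_dir is made up for illustration and is not part of pandoc.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert every .html file in a directory to .docx.
# Assumes pandoc is installed; the function name is illustrative only.
convert_html_dir() {
    local dir="${1:-.}"
    local src
    if ! command -v pandoc >/dev/null 2>&1; then
        echo "pandoc not found on PATH" >&2
        return 127
    fi
    for src in "$dir"/*.html; do
        # Skip the unexpanded glob when the directory has no .html files.
        [ -e "$src" ] || continue
        # -f and -t name the formats explicitly instead of relying on extensions.
        pandoc -f html -t docx -o "${src%.html}.docx" "$src" || return 1
    done
}
```

Calling convert_html_dir ./pages would then write a .docx next to each .html file in ./pages (the directory name is just an example).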

It can also write to stdout (with some formats) and read from stdin.
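As a rough sketch of the stdin/stdout usage, the snippet below reads HTML from a pipe and prints plain text. It assumes pandoc is on the PATH, and html_to_text is a hypothetical name chosen for this example. When reading from stdin there is no file extension to go by, so the input format is given with -f; a binary format such as docx is normally better written to a file with -o.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HTML on stdin, write plain text on stdout.
# With stdin there is no extension to infer the format from, and pandoc
# then assumes Markdown input, so -f html is passed explicitly.
html_to_text() {
    if ! command -v pandoc >/dev/null 2>&1; then
        echo "pandoc not found on PATH" >&2
        return 127
    fi
    pandoc -f html -t plain
}
```

It could be used as html_to_text < input.html > output.txt, or at the end of any pipeline that produces HTML.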

I'm not sure whether it's in the Debian/Ubuntu repos, but it's in the EPEL 6 repo for Red Hat/CentOS 6 (yum install pandoc).

Hope this helps someone :)