Ubuntu – How to search a PDF file from command line

bash, command line, pdf, pdfgrep, search

I'm checking the references of a paper and want to find the places in the body where a given paper from the reference list is cited. Is there a PDF tool that supports searching from bash and printing the context around each match? I've tried converting the PDF to text, but the result wasn't good.

Best Answer

  • Just to add to the above answer: you can use pdftotext, a command-line tool from xpdf-utils (packaged as poppler-utils on current Ubuntu releases), to convert the PDF to text and then search that text with grep.

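    This might look something like this (the - tells pdftotext to write to standard output; -C5 prints five lines of context around each match, -n adds line numbers, and -i ignores case):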

    pdftotext document.pdf - | grep -C5 -n -i "search term"
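
    Since the goal is to locate where a reference is cited, it may help to know the page of each match. Here is a minimal sketch, assuming pdftotext's default behaviour of separating pages with a form feed character. grep_pages is a hypothetical helper name, not part of any package, and it matches the term as a case-insensitive fixed string rather than a regular expression:

    ```shell
    # grep_pages TERM: read pdftotext output on stdin and print every
    # matching line prefixed with its page number.
    # Assumption: pages are separated by form feeds (pdftotext's default).
    # Note: awk interprets backslash escapes in -v values, so avoid
    # backslashes in TERM.
    grep_pages() {
      awk -v term="$1" '
        BEGIN { RS = "\f"; needle = tolower(term) }
        {
          # Each record is one page, so NR is the page number.
          n = split($0, lines, "\n")
          for (i = 1; i <= n; i++)
            if (index(tolower(lines[i]), needle))
              printf "p%d: %s\n", NR, lines[i]
        }'
    }
    ```

    Usage: pdftotext document.pdf - | grep_pages "search term"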
    

    There is more information in the manual (man pdftotext). The only drawback to pdftotext is that you can't use globbing to convert multiple files at once, because it takes a single input file per invocation. This can be overcome with a small bash loop:

    for f in pdf_directory/*.pdf; do echo "$f"; pdftotext "$f" - | grep -i "search_term"; done
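
    The loop above can also be wrapped in a function that labels every match with its file name and line number, instead of echoing each file name first. This is only a sketch: search_pdfs is a hypothetical name, and --label is a GNU grep option, so it assumes GNU grep as shipped with Ubuntu:

    ```shell
    # search_pdfs DIR TERM: search every PDF in DIR (case-insensitive) and
    # print matches as FILE:LINE:TEXT, where LINE is the line number in the
    # converted text, not in the PDF itself.
    search_pdfs() {
      local dir=$1 term=$2 f
      for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern if DIR has no PDFs
        # grep exits non-zero when a file has no match; that is not an error here.
        pdftotext "$f" - | grep -i -n -H --label="$f" -- "$term" || true
      done
    }
    ```

    Usage: search_pdfs pdf_directory "search_term"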
    

    If you are having trouble producing usable text from the PDF in the first place, for example because the file is incompatible with pdftotext, that is a separate problem.

    I think that, in general, PDF editors don't offer a command-line interface because they are graphical applications. If you want to work from bash (or maybe zsh!), you will have to use command-line tools like these in a terminal.

    Good luck!
