Is it possible to remove ligatures from copied text?


I have a few PDFs that contain ligatures in the text (e.g., ff is combined into a single character, ﬀ).

Is there an easy way to remove them when copying the text from the PDF? (i.e., when I paste, I'd like the ﬀ to be pasted as ff).

I copy a lot of text from these PDFs into answers on Stack Overflow and I find the ligatures at best obnoxious (ok, I admit, I'm really picky :-P); the ligatures also do not show up correctly when copied into other places (e.g., if I copy them into Notepad, they show up as blocks).

I cannot modify the PDFs.

I use both Adobe Acrobat Reader and Foxit Reader, but I'd be open to trying a new PDF reader.

Best Answer

The Evince reader seemed to decode ligatures into separate characters when I tested it.

By the way, for documents built with pdflatex, you can put this in the preamble so that ligatures are still displayed in the PDF but are copied as individual characters:

\input{glyphtounicode}
\pdfgentounicode=1 %
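
If switching readers is not an option and the PDFs cannot be changed, another possibility is to clean the text after pasting. Below is a minimal Python sketch that expands the five common Latin f-ligatures. The name `expand_ligatures` and the mapping table are my own for illustration. It assumes the reader actually puts the Unicode ligature code points (U+FB00 to U+FB04) on the clipboard; if the PDF has no usable text mapping, the pasted characters may be something else entirely and this will not help.

```python
# Map each Unicode f-ligature code point to its plain-letter spelling.
# Only the five common Latin f-ligatures are covered (an assumption about
# what the PDFs contain); extend the table if others show up.
LIGATURES = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
})


def expand_ligatures(text: str) -> str:
    """Return text with the ligature characters replaced by plain letters."""
    return text.translate(LIGATURES)


if __name__ == "__main__":
    import sys

    # Read pasted text from stdin and write the cleaned version to stdout.
    sys.stdout.write(expand_ligatures(sys.stdin.read()))
```

`unicodedata.normalize("NFKC", text)` from the standard library would also expand these ligatures, but it rewrites other compatibility characters too (superscripts, full-width letters and so on), so an explicit table seemed the safer choice here.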