I have a PDF file written in Persian script, which runs right-to-left.
Because the Persian text is encoded in UTF-8, I can't convert it to plain text in Microsoft Word, and copying and pasting the text produces unreadable characters.
I have tried several programs, such as unipdf and e-Pdf Converter, but after conversion the characters are still not displayed properly.
I even tried OCR, but the same problem appeared.
The PDF doesn't have a password or any restrictions.

Does anyone have any other ideas?

Edit: I also tried creating a file in MS Word and converting it to a PDF, and I had the same problem with the resulting PDF file (even though the encoding was known).

Best Answer

I had the same problem converting PDF files to Word. After copying and pasting into Word, the formatting changed and caused trouble. I tried several online converters, but they failed as well.
The only method that worked for me was the following:

  1. Open the PDF file with Adobe Acrobat Reader, then choose Print from the File menu. From the list of printers, choose Adobe Acrobat. Yes, you are about to create a PDF from a PDF!
  2. Open the new PDF file with Google Chrome (drag and drop the file onto Chrome).
  3. Now simply select all the text (Ctrl+A) and copy and paste it into a blank Word file.
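
If the text pasted after the steps above still looks wrong, one possible cause (an assumption, not something confirmed for this particular file) is that the PDF stores the letters as Arabic presentation forms, i.e. the positional glyph variants, rather than as the base Persian letters. The minimal Python sketch below checks for that and maps such characters back to the base letters using Unicode NFKC normalization. The function names `count_presentation_forms` and `normalize_copied_text` are made up for this illustration.

```python
import unicodedata


def count_presentation_forms(text: str) -> int:
    """Count characters in the Arabic Presentation Forms-A/B blocks."""
    return sum(
        1
        for ch in text
        # Forms-A is U+FB50..U+FDFF; Forms-B is U+FE70..U+FEFC
        # (U+FEFF, the byte order mark, is deliberately excluded).
        if "\uFB50" <= ch <= "\uFDFF" or "\uFE70" <= ch <= "\uFEFC"
    )


def normalize_copied_text(text: str) -> str:
    """Map presentation-form glyphs back to base letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # Hypothetical sample: a word copied as presentation-form code points.
    sample = "\uFEB3\uFEFC\uFEE1"
    print(count_presentation_forms(sample))
    print(normalize_copied_text(sample))
```

This only helps when the garbling comes from presentation forms. It does not fix text that was extracted in reversed (visual) order, and it cannot recover text from a PDF whose embedded fonts lack a usable mapping back to Unicode.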
