PerlMonks
> Any other ideas?

See the update of "Parsing PDFs by text position?" and the linked threads.

> nothing had worked

What exactly does this mean?

If pdftohtml -xml doesn't produce readable text, your only remaining option is OCR, because the PDF might embed its own font with the characters mapped in random order, or might even contain only an image showing the text.
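A rough sketch of that decision, in case it helps: try pdftohtml -xml first and fall back to OCR when the output contains almost no words. The function names, the MIN_WORDS threshold, and the word-counting heuristic are my own assumptions, not something from the linked threads. The heuristic only catches empty or image-only output. A scrambled font can still produce letter runs that look like words to grep, so check the result by eye. The OCR branch assumes pdftoppm (from Poppler) and tesseract are installed.

```shell
#!/bin/sh
# Hypothetical threshold: fewer words than this counts as "not readable".
MIN_WORDS=${MIN_WORDS:-20}

# Read pdftohtml -xml output on stdin, strip the tags, and print the
# number of runs of three or more ASCII letters. This is a crude
# stand-in for "readable text", not a real language check.
count_words() {
    sed -e 's/<[^>]*>//g' \
        | { grep -o -E '[A-Za-z]{3,}' || true; } \
        | wc -l \
        | tr -d ' '
}

# Print the XML from pdftohtml if it looks readable, otherwise render
# the pages to PNG and print whatever tesseract recognizes.
pdf_to_text() {
    pdf=$1
    xml=$(pdftohtml -xml -i -stdout "$pdf")
    words=$(printf '%s\n' "$xml" | count_words)
    if [ "$words" -ge "$MIN_WORDS" ]; then
        printf '%s\n' "$xml"
    else
        tmp=$(mktemp -d)
        # 300 dpi is a common choice for OCR input, not a requirement.
        pdftoppm -r 300 -png "$pdf" "$tmp/page"
        for img in "$tmp"/page*.png; do
            tesseract "$img" stdout
        done
        rm -rf "$tmp"
    fi
}

# Run only when a file name is given on the command line.
if [ "$#" -gt 0 ]; then
    pdf_to_text "$1"
fi
```

Saved as, say, pdf2text.sh, it would be called as sh pdf2text.sh input.pdf > output.txt, with MIN_WORDS adjustable through the environment.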
Cheers Rolf
In reply to Re: Converting PDF file to text by LanX