PerlMonks
Re: Build a PDF book index
by LanX (Saint) on Mar 17, 2018 at 11:49 UTC ( [id://1211125]=note )
Tl;dr, but

> I've noticed that some characters aren't as expected when extracted:

PDF allows a document to embed its own fonts, and the character encoding of such a font is sometimes effectively random. You can only solve this for one specific PDF document at a time, by examining the affected font number and manually building a translation table in a hash. HTH! :)
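A minimal sketch of such a translation table, assuming you have already extracted the text of the affected font as a string. The name `%font_map`, the helper `fix_text`, and the sample mappings are all made up for illustration; the real entries have to be worked out by comparing the extracted characters with the rendered page.

```perl
use strict;
use warnings;

# Hypothetical table for ONE embedded font in ONE document:
# key   = character as it comes out of the extractor
# value = character that is actually shown on the page
my %font_map = (
    "\x{01}" => 'A',
    "\x{02}" => 'B',
    "\x{03}" => 'C',
);

# Translate every character that has an entry; leave the rest alone.
sub fix_text {
    my ($text) = @_;
    $text =~ s/(.)/exists $font_map{$1} ? $font_map{$1} : $1/gse;
    return $text;
}

print fix_text("\x{01}\x{02}\x{03} stays readable"), "\n";
```

Keep one hash per font number, since another embedded font in the same PDF may use a completely different mapping.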
Cheers Rolf