http://qs321.pair.com?node_id=887640


in reply to Need Help for Convert PDF to HTML

Another difficulty I do not see listed among the replies here is embedded fonts. PDF allows fonts to be embedded in the document; HTML does not. If the source PDF embeds non-standard (non-web) fonts, then extracting those fonts becomes a significant challenge. Some tools are available to help. CAM::PDF can Extract Font Info from PDF, but when brian_d_foy asked about extracting the fonts themselves, the answer was that Chris Dolan intends to never add that feature.

If you happen to already have the font files, the job may be easier. It really depends on your source PDF document.

On the HTML side, CSS @font-face rules can be used to specify such fonts (see the FontSpring "Bulletproof" method and the Smiley variation, among many others).
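For illustration, here is a rough sketch of what a FontSpring-style "bulletproof" declaration looks like. The family name 'MyPdfFont' and every file name are placeholders of mine, not anything taken from a real PDF; it assumes you already have the font converted to each of these web formats and are licensed to serve it.

```css
/* Hypothetical family name and file names; substitute your own. */
@font-face {
  font-family: 'MyPdfFont';
  /* The "?#iefix" suffix is the FontSpring trick for old IE versions. */
  src: url('mypdffont.eot?#iefix') format('embedded-opentype'),
       url('mypdffont.woff') format('woff'),
       url('mypdffont.ttf') format('truetype'),
       url('mypdffont.svg#MyPdfFont') format('svg');
}

/* Use the declared family, with a generic fallback if it fails to load. */
body {
  font-family: 'MyPdfFont', serif;
}
```

The browser takes the first source in the list whose format it understands, so the order matters. Which formats you actually need depends on the browsers you have to support.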

Many fonts also carry licensing restrictions. Depending on your circumstances (and perhaps on which fonts you need), this may be a concern for you.