PerlMonks  

Re: Preserving layout in pdf to text or html to text conversion

by cbrandtbuffalo (Deacon)
on Apr 10, 2007 at 20:11 UTC (id://609248)


in reply to Preserving layout in pdf to text or html to text conversion

Maybe this is obvious, but if you're going to try to add some of this functionality yourself, consider subclassing or otherwise building on one of the existing parser modules you mentioned. If the existing module can do most of the work, you can focus on processing the DIV tags and the information in them. You'll need a parser module that keeps the CSS info rather than immediately throwing it out.
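
To make the idea concrete, here is a rough sketch of what "subclass an existing parser and keep the CSS" could look like. It is written in Python against the standard library's `html.parser` purely for illustration, not with any of the Perl modules discussed in this thread. The names `LayoutKeepingParser` and `parse_style` are made up for this example, and it assumes the layout information lives in inline `style` attributes on the DIV tags, which may not match your input.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Split an inline style string into a {property: value} dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip()
    return props


class LayoutKeepingParser(HTMLParser):
    """Hypothetical subclass: collects the text of each DIV with its inline CSS."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished (css_dict, text) pairs, in closing order
        self._stack = []   # currently open DIVs: [css_dict, [text pieces]]

    def handle_starttag(self, tag, attrs):
        # The base class does the tokenizing; we only look at DIVs.
        if tag == "div":
            style = dict(attrs).get("style") or ""
            self._stack.append([parse_style(style), []])

    def handle_data(self, data):
        # Attach non-blank text to the innermost open DIV.
        if self._stack and data.strip():
            self._stack[-1][1].append(data.strip())

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            css, pieces = self._stack.pop()
            self.blocks.append((css, " ".join(pieces)))


# Assumed sample input: absolutely positioned DIVs with inline styles.
parser = LayoutKeepingParser()
parser.feed('<div style="position:absolute; left:40px; top:10px">Name</div>'
            '<div style="position:absolute; left:200px; top:10px">Total</div>')
parser.close()
for css, text in parser.blocks:
    print(css.get("left"), css.get("top"), text)
```

With the `left` and `top` values kept alongside each piece of text, the remaining work is deciding how to map those coordinates onto rows and columns of plain text, which is the part you would write yourself.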