Hokay...I have a potential use case for myself, but first, a comment:

I don't want to have to write XML.

I have a structured document that I produce. Presently, I use a C program I inherited with the job that produces custom PostScript. Unfortunately, the full document has some front and back matter (including a table of contents) that has to be generated by hand and merged together.

The Preview tool in Mac OS X will convert my PostScript into a PDF with searchable and selectable text. What I'd like to be able to do (and have done some preliminary poking around toward) is to generate a PDF directly, ideally with a real ToC that links to the appropriate places.

The current C program treats the body of the document as having a nested structure of bits as it sets the type. The bits of stuff are:

  • Page
    • Column
      • Section
        • Row
          • Cell
            • Paragraph
              • Line
                • Word
I'm doing this off the top of my head, so I may have left out a level of detail here. Words are the smallest unit of stuff to set, being nominally indivisible strings.
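
To make the nesting concrete, here is a minimal sketch in Python (not the original C). The `Box` and `Word` classes and the `nest` and `depth_path` helpers are hypothetical names invented for illustration; only the level names come from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Container levels from the list above, outermost first.
LEVELS = ["Page", "Column", "Section", "Row", "Cell", "Paragraph", "Line"]


@dataclass
class Word:
    """Smallest unit to set: a nominally indivisible string."""
    text: str


@dataclass
class Box:
    """Any container level; `kind` is one of LEVELS (hypothetical design)."""
    kind: str
    children: List[Union["Box", Word]] = field(default_factory=list)


def nest(words: List[Word]) -> Box:
    """Wrap words in a Line, then wrap that in each enclosing level."""
    node = Box("Line", list(words))
    for kind in reversed(LEVELS[:-1]):
        node = Box(kind, [node])
    return node


def depth_path(node: Union[Box, Word]) -> List[str]:
    """Follow the first child at each level down to a Word."""
    path = []
    while isinstance(node, Box):
        path.append(node.kind)
        node = node.children[0]
    path.append("Word")
    return path
```

Calling `depth_path(nest([Word("good"), Word("chemistry")]))` walks the levels from Page down to Word in the order listed.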

The input data gets grouped into Section chunks. When a Section is fully populated with the text and its formatting, the Section is poured into the Column, leaving a rump Section when it doesn't all fit. The rump keeps getting poured into further Columns as necessary. When a Page fills up, the PostScript gets generated and sent to the output filehandle. A Section that breaks across Columns gets a continuation header in each subsequent Column, and Pages have headers in the manner of dictionaries.
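
The pouring step might look roughly like the following Python sketch. It is not the original C program: rows are reduced to bare heights, and `pour`, `set_sections`, `header_height`, and the "(cont.)" label are assumptions made up for illustration. Where the real program would emit PostScript for a full Page, the sketch just collects the page in a list.

```python
from typing import List, Tuple


def pour(rows: List[float], room: float) -> Tuple[List[float], List[float]]:
    """Place as many leading rows as fit in `room`; return (placed, rump)."""
    used = 0.0
    for i, height in enumerate(rows):
        if used + height > room:
            return rows[:i], rows[i:]
        used += height
    return rows, []


def set_sections(sections, column_height, columns_per_page, header_height=1.0):
    """Pour (name, row_heights) sections into columns, grouping columns into pages."""
    pages, page, column, room = [], [], [], column_height

    def close_column():
        nonlocal page, column, room
        page.append(column)
        column, room = [], column_height
        if len(page) == columns_per_page:
            pages.append(page)  # the real program would emit PostScript here
            page = []

    for name, rows in sections:
        rump, started = list(rows), False
        while rump:
            # A continuation header costs space only after the first chunk.
            header = header_height if started else 0.0
            placed, rump = pour(rump, room - header)
            if placed:
                label = name + " (cont.)" if started else name
                column.append((label, placed))
                room -= header + sum(placed)
                started = True
            if rump:
                if not placed and not column:
                    raise ValueError("row does not fit in an empty column")
                close_column()

    # Flush whatever is left in the partially filled column and page.
    if column:
        page.append(column)
    if page:
        pages.append(page)
    return pages
```

With two columns of height 7 per page, `set_sections([("A", [3, 3, 3]), ("B", [2, 2])], 7, 2)` splits A across the first two columns and pushes the rump of B onto a second page.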

Typeface stuff is applied at the Word level. Each level includes positioning data that is relative to its container, and the higher-level elements have margins and sizes (being nested rectangles).
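
Container-relative positioning means an element's place on the page is the sum of the offsets up its chain of containers. A small Python sketch of that idea, with a hypothetical `Placed` class and `absolute` helper (neither comes from the original program, and the numbers are arbitrary):

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Placed:
    """An element whose (x, y) offset is relative to its container."""
    x: float
    y: float
    parent: Optional["Placed"] = None


def absolute(item: Optional[Placed]) -> Tuple[float, float]:
    """Sum the relative offsets from the element up to the outermost container."""
    x = y = 0.0
    while item is not None:
        x += item.x
        y += item.y
        item = item.parent
    return x, y


# Arbitrary example offsets: a Word inside a Cell inside a Column on a Page.
page = Placed(0, 0)
column = Placed(36, 36, page)
cell = Placed(4, 100, column)
word = Placed(10, 12, cell)
```

Here `absolute(word)` adds up the four offsets, so moving the Column moves everything inside it without touching the inner elements.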

I have a multi-tiered template in mind -- and I suspect that it could be expressed in CSS terms as well, just to confuse matters. At one point, I had worked up an XML-ish representation of the document but didn't go far with it before my attention span ran out.


In reply to Re: PDF::Template redesign - I want your ideas! by herveus
in thread PDF::Template redesign - I want your ideas! by dragonchild