http://qs321.pair.com?node_id=766929


in reply to Re^2: Supressing page breaks in forms
in thread Supressing page breaks in format/write output

<Update: Root Node has been revised since this was posted /update>

Let me repeat: web pages do NOT have intrinsic page breaks. When printed to dead trees, the print routine inserts them, but when viewed on a monitor they have no need to paginate.

Think of the rendered web page as appearing on a long (very long!) roll of paper, with each end attached to a roller, viewed through a window frame. Scrolling moves the paper (vertically, for your mental model) and changes which portion of the text you can see. Web pages are NOT analogous to cut sheets of paper, be those 8.5x11, A4, Legal, or 6 inches wide by a mile high.

So, assuming your "nicely formatted text files" are, indeed, pure text and not some form of word-processing format, you need do no more than this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title> (provide a title) </title>
<meta name="description" content="##provide a narrative description of the content##">
<meta name="keywords" content="##provide keywords (or phrases, enclosed in single quotes) separated by commas##">
</head>
<body>
<pre>
**Insert your "nicely formatted text" here assuming that "nicely formatted" means line_length not too much more than 80 chars; while many monitors are much wider, aim for the lowest common denominator.**
</pre>
</body>
</html>

You can practically use the content above as an element of a home-brew templating system. The step-by-step below is sub-optimal, but should work until you learn more about HTML, Perl, and the templating systems cited in previous responses.

  1. Save the template above to a local file.
  2. Read (see open among others) the template into a variable in your script.
  3. Read the "nicely formatted text" into another variable.
  4. Use substitution (see perlretut) on the first variable to replace the content above between ** and ** with the contents of your second variable.
  5. Hand tweak the <meta....>s above, replacing the ##...## with appropriate content, or remove them before step one if you're not worried about search engine rankings.
  6. Write (see "open" above and perldoc -f print) the modified first variable (the filled-in template) to a file, name.htm.
  7. Move or copy the name.htm to a web server (if some viewers will be on-line) or simply distribute it to interested parties as an attachment to email, by sneaker net or whatever.
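Steps 2 through 6 might be sketched roughly as follows. This is a minimal sketch, not the only way to do it: the sub names (slurp, escape_html, fill_template) and the file names in the usage comment are hypothetical, and the HTML escaping is an addition to the steps above, included because a stray <, > or & in the text would otherwise be treated as markup even inside <pre>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a whole file into a scalar (steps 2 and 3).
sub slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;                      # undef record separator: read everything at once
    my $content = <$in>;
    close $in;
    return $content;
}

# Assumption, not in the steps above: escape the characters HTML treats
# specially, since they are still interpreted inside <pre>.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;          # must come first, or it would re-escape the others
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Step 4: replace everything from the first ** to the last ** with the text.
sub fill_template {
    my ($template, $text) = @_;
    my $escaped = escape_html($text);
    $template =~ s/\*\*.*\*\*/$escaped/s   # /s lets . match newlines too
        or die "No **...** placeholder found in template";
    return $template;
}

# Step 6. Hypothetical usage: perl fill.pl template.htm report.txt name.htm
if (@ARGV == 3) {
    my ($template_file, $text_file, $out_file) = @ARGV;
    my $page = fill_template(slurp($template_file), slurp($text_file));
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    print {$out} $page;
    close $out;
}
```

The ##...## placeholders in the <meta> tags are left alone here; per step 5, those are tweaked by hand or removed from the template beforehand.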

There will be no page breaks. And, FWIW, you can simply paste (remember, we're assuming your source files are truly pure "text," not word processor documents, .pdfs, LaTeX, or some other format using markup) the source document into an email to distribute it. Same deal: no page breaks.

If, however, your source file is a wp document, note that wp documents do, typically, contain page break codes. You may have to remove those. But you will also have to strip whatever markup the creating application inserted to organize the tables and then recreate those with HTML or even XML markup (for which you'll need to spend a few minutes learning the fundamentals).

Re^4: Supressing page breaks in forms
by yaconsult (Acolyte) on May 29, 2009 at 20:11 UTC
    It is NOT a web page. It is a simple text file. It gets emailed to people via a script.

    Coincidentally, there is a web page somewhere that, when a link is clicked on it, displays the contents of this text file in the browser.

    The only problem is that the text gets broken into "pages" when there are many lines and I don't want it to.

    ...
    2009-03-15 04 143 oconnorn
    ^L Date Hour Requests UID
    -----------------------------------------------
    2009-03-15 04 121 rdreyer
    ...

    That is what happens every so many lines - a ^L and a reprinting of the header. I just want those breaks to disappear.

      You've changed your spec: in your original post, you said, "(t)he form will be viewed using a browser...."

      Certainly it's possible to view a plain text file with a browser, by any one of several techniques, but IMO, if something is "viewed using a browser," it's a web page (or, to be a bit more precise, a page or constituent thereof). Further, within the context of "viewed in a browser," a form is, as noted above, a very specific html construct with a very specific purpose unrelated to what you didn't tell us until prodded by replies suggesting you clarify your intent.

      Given your new spec, you may wish to know that Control-L (represented by caret-L in many circumstances) is the (nominally) "non-printing" ASCII and ANSI character for a "Form Feed", AKA "page eject" (0x0c, 12 decimal). <Update: see http://www.robelle.com/smugbook/ascii.html /update>

      So the problem becomes "how can you remove each instance of 0x0c (and, I infer -- perhaps incorrectly -- the "Date Hour Requests UID" elements)?" (Your spec for "header" is a bit less than precise.) But either case is readily dealt with using a very basic script (which, at this stage, is left as an exercise for the student) or -- indeed -- with a minimally competent text editor.
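For anyone wanting a head start on that exercise, here is one possible sketch. It rests on guesses about the file, since the spec is imprecise: that the "header" is a line reading "Date Hour Requests UID" followed by a rule of dashes, and that the first header should stay while later repeats go. The sub name strip_page_breaks and the file names in the usage comment are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove Form Feeds and every repeat of the column header, keeping the first.
# Assumed layout: header line "Date Hour Requests UID", then a line of dashes.
sub strip_page_breaks {
    my (@lines) = @_;
    my @kept;
    my $headers_seen = 0;
    my $skip_rule    = 0;          # true right after a dropped (repeated) header
    for my $line (@lines) {
        my $had_ff = ($line =~ s/\x0c//g);          # delete Form Feed (Control-L)
        next if $had_ff && $line =~ /^\s*$/;        # the Form Feed sat on its own line
        if ($line =~ /^\s*Date\s+Hour\s+Requests\s+UID\s*$/) {
            $skip_rule = $headers_seen++ ? 1 : 0;   # only repeats are dropped
            next if $skip_rule;
        }
        elsif ($skip_rule && $line =~ /^\s*-+\s*$/) {
            $skip_rule = 0;                          # drop the rule under a dropped header
            next;
        }
        else {
            $skip_rule = 0;
        }
        push @kept, $line;
    }
    return @kept;
}

# Hypothetical usage: perl strip_ff.pl report.txt > clean.txt
print strip_page_breaks(<>) if @ARGV;
```

If only the Form Feeds need to go and the repeated headers are acceptable, the substitution s/\x0c//g applied to each line is the whole job.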

      Suggestion: when you update a post, please tag it prominently -- for example, Update....blah, blah, blah /update. You do acknowledge your addition to the OP, but that acknowledgment is less than prominent.

      <Update: You appear to have also changed your title, so it no longer matches the titles of the replies. That's also confusing, and the change is still less than clear; s/format\/write output/text/ would have been better. /update>