PerlMonks  

Page Break Characters

by blacksmith (Hermit)
on Sep 06, 2001 at 00:31 UTC ( [id://110415]=perlquestion )

blacksmith has asked for the wisdom of the Perl Monks concerning the following question:

I am not sure my title is a good one, or whether I can explain this well, but here goes. I am trying to work with .TXT (ASCII) files. These files contain a page break that shows up as a "1" when I retrieve them after they have been placed on a network directory from our mainframe. I am sure this is the page break, because the "1" never appears when the mainframe prints the file, and because I know where the pages begin and end. Does anyone have experience with this, or could anyone point me in the right direction? I have seen plenty of material on control characters, but nothing specific to this (or maybe I just overlooked it). If anyone needs a better example, feel free to /msg me and I will try to send one. For now, here is about the best I can do:
1************************************** * * * JOB IT1234-A * REPORT ID-000 * TITLE REPORT FROM O/S390 *

The page break is the "1" in the top left-hand corner. I am also trying to use PDF::Create to turn these files into .PDF format, if that helps any. Thanks.
Blacksmith.

Replies are listed 'Best First'.
(Ovid) Re: Page Break Characters
by Ovid (Cardinal) on Sep 06, 2001 at 00:46 UTC

    Mainframes often use fixed-width records, and reports are actually files with (typically) 133 characters per record. However, the report itself is only 132 characters wide. The first character of each record is called the "Carriage Control". It tells the printer how to behave but is not itself printed. The following table lists the carriage control codes you're likely to encounter:

    CC      Meaning
    1       Advance to the top of the next page
    +       Advance zero lines (overprint)
    blank   Advance 1 line (note that most of your lines have a space for the first character)
    0       Advance 2 lines
    -       Advance 3 lines

    If you want to replicate the structure of the report, you'll need to take these into account.
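
The table above can be turned into a small lookup. What follows is only a minimal sketch, not anyone's tested converter: it assumes plain-text output in which a form feed ("\f") stands for "top of next page" and a bare carriage return ("\r") stands for overprint. The name asa_to_text is a hypothetical helper, and treating an unrecognized control character as a one-line advance is an assumption as well.

```perl
use strict;
use warnings;

# Text emitted BEFORE a record, keyed by its carriage-control character.
# Assumption: "\f" marks a new page and "\r" approximates overprinting.
my %advance = (
    ' ' => "\n",        # advance 1 line
    '0' => "\n\n",      # advance 2 lines (one blank line)
    '-' => "\n\n\n",    # advance 3 lines (two blank lines)
    '1' => "\n\f",      # end the current line, then start a new page
    '+' => "\r",        # return to column 1 without advancing
);

# Hypothetical helper: takes report records, returns printable text.
sub asa_to_text {
    my @records = @_;
    my $out = '';
    for my $i (0 .. $#records) {
        (my $rec = $records[$i]) =~ s/[\r\n]+\z//;   # drop line terminator
        my $cc   = length($rec) ? substr($rec, 0, 1) : ' ';
        my $text = length($rec) ? substr($rec, 1)    : '';
        # Assumption: unknown control characters advance one line.
        my $lead = exists $advance{$cc} ? $advance{$cc} : "\n";
        $out .= $lead if $i > 0;    # nothing to advance past before record 1
        $out .= $text;
    }
    $out .= "\n" if @records;       # terminate the final line
    return $out;
}
```

Because the control character describes what happens before its own line is printed, the sketch emits the line advance first and the record text second. Passing it the lines read from the report file should give text that a viewer or printer honoring form feeds will paginate the way the mainframe did.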

    Cheers,
    Ovid


(tye)Re: Page Break Characters
by tye (Sage) on Sep 06, 2001 at 00:42 UTC

    See, for example, http://www.nsc.liu.se/~boein/f77to90/asa.html:

    ASA Carriage Control Characters

    The first character in each record (line) of formatted output defines a carriage control character to the lineprinter, often called ASA-character, for a previous name of the US Standardization Organization ANSI. These have the following meanings

            blank     New line
            +         Not new line (overwriting, 
                      not available on all printers)
            0         Double line feed
            1         New page
    

    That doesn't look like a "1", it is a "1". You need to strip the first character off from each line and use it to determine what type of CR and/or LF you should put before/after that record.
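
One way to act on that advice, shown here only as a rough sketch: peel the first character off every record and start a new page whenever it is "1". The name split_pages is a hypothetical helper, and skipping completely empty records is an assumption about the input.

```perl
use strict;
use warnings;

# Hypothetical helper: groups records into pages. A new page starts at
# every record whose carriage-control character is "1".
# Returns a list of pages; each page is an array ref of
# [control_char, text] pairs.
sub split_pages {
    my @records = @_;       # copy, so the caller's data is not modified
    my @pages;
    for my $rec (@records) {
        $rec =~ s/[\r\n]+\z//;          # drop line terminator
        next unless length $rec;        # assumption: ignore empty records
        my ($cc, $text) = (substr($rec, 0, 1), substr($rec, 1));
        push @pages, [] if $cc eq '1' || !@pages;
        push @{ $pages[-1] }, [ $cc, $text ];
    }
    return @pages;
}
```

Keeping the control character next to each line means the spacing codes (blank, "0", "+") are still available to whatever renders the page later, for instance when writing one PDF page per element of the returned list.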

            - tye (but my friends call me "Tye")
Re: Page Break Characters
by jlongino (Parson) on Sep 06, 2001 at 01:45 UTC
    This reply is off topic but FYI:

    For us "old-timers" this was better known as CTLASA, where CTL stands for ConTroL and ASA for the American Standards Association (now known as ANSI, the American National Standards Institute). Searching for CTLASA may turn up additional info if you need to research the topic further.

    @a=split??,'just lose the ego and get involved!';
    for(split??,'afqtw{|~'){print $a[ord($_)-97]}
Re: Page Break Characters
by tachyon (Chancellor) on Sep 06, 2001 at 00:52 UTC

    I assume you want to ditch this page break char. First you need to identify it, then you just sub it out. Here is a snippet to identify the chars in a string according to their octal escape code. Just cut and paste a snippet of your text that includes the escape char or similar to get a sample of it into a variable. In the example you can see the two "\n" chars - they are \012 in octal. Anyway once you identify the char code for this errant control char just stick it in the regex and it should be fixed.

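    my $text = "Hello\nWorld\n";
    # Print each character's octal code, one per line.
    printf "%03o\n", ord($_) for split //, $text;
    # Replace every occurrence of the identified character (here \012).
    $text =~ s/\012/\n\n\n/g;
    print $text;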
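
The same octal-dump idea can be pointed at just the first column of the report, to check what actually sits there. This is a minimal sketch under that assumption, and tally_first_chars is a hypothetical helper name. A literal "1" shows up as 061, a space as 040 and a "0" as 060.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the first character of every record,
# keyed by its three-digit octal code.
sub tally_first_chars {
    my @records = @_;
    my %count;
    for my $rec (@records) {
        next unless length $rec;                 # nothing to inspect
        $count{ sprintf '%03o', ord $rec }++;    # ord() reads the first char
    }
    return %count;
}
```

If the tally shows only a handful of printable codes such as 040, 060 and 061 rather than a control code like 014 (form feed), the first column is most likely holding carriage-control characters and not an embedded control character.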

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Page Break Characters
by blacksmith (Hermit) on Sep 07, 2001 at 01:10 UTC
    Thanks to all of you monks. Very informative responses. Helped greatly.

    Blacksmith.
