I have the following report:
^LSTORE 001 ===============
DEPT: PRODUCE
                                                               EXTEND  MRKDWN  REASON                  EXT. MRKDWN
ITEM     DESCRIPTION                SIZE     QTY  WGT  RETAIL  RETAIL  RETAIL  CD  DESCRIPTION                LOSS  VENDOR
0300008  28OZ FRT PLATTER/DIP       00028OZ    1  0.0    8.99    8.99    8.99  01  DAMAGED/UNSALEABLE         0.00  102827
0080948  EXPRESS FANCY GREENS       00007OZ    6  0.0    2.99   17.94   17.94  01  DAMAGED/UNSALEABLE         0.00  103128
0321855  CLAMSHL HYDRO BOSTON       00COUNT    1  0.0    1.99    1.99    1.99  01  DAMAGED/UNSALEABLE         0.00  104040
0058309  12OZ MONTEREY MROOM        00012OZ    1  0.0    2.29    2.29    2.29  01  DAMAGED/UNSALEABLE         0.00  105524
0058309  12OZ MONTEREY MROOM        00012OZ    1  0.0    2.29    2.29    2.29  01  DAMAGED/UNSALEABLE         0.00  105524
0084448  10OZ SPINACH PACK 12       00010OZ    1  0.0    1.69    1.69    1.69  01  DAMAGED/UNSALEABLE         0.00  107505
REASON CODE TOTAL:                                                      35.19                                 0.00
DEPT TOTALS:                                                            35.19                                 0.00
^LSTORE 002 ===============
DEPT: PRODUCE
0084508  2LB STRAWBERRIES           00002LB   20  0.0    3.69   73.80   73.80  01  DAMAGED/UNSALEABLE         0.00  101224
DEPT: PRODUCE
                                                               EXTEND  MRKDWN  REASON                  EXT. MRKDWN
ITEM     DESCRIPTION                SIZE     QTY  WGT  RETAIL  RETAIL  RETAIL  CD  DESCRIPTION                LOSS  VENDOR
What I am trying to get rid of is the second and any later occurrences of the department name within a store. The first instance after the store name is fine, but the others are not. My thought is to split the file into records on the form feed character, then work on each record. So the initial code would be:
#!/usr/bin/perl -w
use strict;

open my $IN,"<","ISC001" or die "Can not open ISC001: $!\n";
open my $OUT,">","ISC-OUT2" or die "Can not open ISC-OUT2: $!\n";

$/="^L";
while(<$IN>){
    ......
}

close $IN;
close $OUT;
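A minimal sketch of the record-splitting step, assuming the ^L shown in the report stands for a real form feed character (ASCII 12). In a double-quoted Perl string that character is written "\f"; "^L" would be the two literal characters ^ and L. The sample data and variable names here are made up for illustration, and the report is read from an in-memory filehandle so no input file is needed.

```perl
use strict;
use warnings;

# Made-up sample: two records, each introduced by a real form feed ("\f").
my $report = "\fSTORE 001\nDEPT: PRODUCE\n\fSTORE 002\nDEPT: PRODUCE\n";

# In-memory filehandle, so the sketch runs without the real ISC001 file.
open my $in, '<', \$report or die "Cannot open in-memory report: $!\n";

my @records;
{
    local $/ = "\f";    # input record separator: form feed, not "^L"
    while (my $record = <$in>) {
        chomp $record;                  # chomp strips a trailing $/ (the form feed)
        next unless length $record;     # skip the empty chunk before the first form feed
        push @records, $record;
    }
}
close $in;

printf "%d records read\n", scalar @records;
```

Because chomp removes the separator, the form feed would have to be printed back in front of each record when writing the output file.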
One thing that is consistent throughout the file is that the occurrences of the department name I need to remove come just before the header lines, so I'm guessing a regex similar to:
$_ =~ s|DEPT:\s+PRODUCE\n{2,}(\s{63}EXTEND\s+MRKDWN\s+REASON\s+EXT. MRKDWN\n)|$1|g;
would do what I need. Am I heading in the right direction with this guess, or am I going in the wrong direction?
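In the sample for store 001 the first DEPT: line also sits directly above the header lines, so a pattern anchored only on the headers may match it as well. One possible alternative, sketched below, is to count the DEPT: lines seen in each record and delete all but the first. The helper name drop_repeated_dept and the test data are hypothetical, and the sketch assumes every DEPT: line starts at the beginning of a line and is followed by at least one newline.

```perl
use strict;
use warnings;

# Hypothetical helper: within one record, keep the first "DEPT: <name>" line
# for each department name and delete later repeats, together with any
# blank lines that immediately follow the deleted line.
sub drop_repeated_dept {
    my ($record) = @_;
    my %seen;    # department name => number of times seen so far
    # /m lets ^ match at every line start, /e evaluates the replacement
    # as code, /g applies it to every DEPT: line in the record.
    $record =~ s{^(DEPT:[ \t]+([^\n]*?)[ \t]*\n+)}{ $seen{$2}++ ? '' : $1 }gme;
    return $record;
}

# Made-up record in the shape of the report: DEPT line, blank line, header.
my $record = "STORE 001\nDEPT: PRODUCE\n\nHEADER\nitem 1\n"
           . "DEPT: PRODUCE\n\nHEADER\nitem 2\n";

my $cleaned = drop_repeated_dept($record);
print $cleaned;
```

Keying the counter on the department name means a store with several departments would keep the first line of each one; a plain scalar counter would do if a record never holds more than one department.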

TStanley
--------
People sleep peaceably in their beds at night only because rough men stand ready to do violence on their behalf. -- George Orwell

In reply to Keeping the first occurence of a pattern, and removing the other occurences by TStanley