I have the following report:

^LSTORE 001
===============
DEPT:    PRODUCE
                                                               EXTEND 
+ MRKDWN   REASON                 EXT.  MRKDWN
ITEM    DESCRIPTION                SIZE      QTY   WGT  RETAIL RETAIL 
+ RETAIL   CD DESCRIPTION                 LOSS  VENDOR

0300008 28OZ FRT PLATTER/DIP      00028OZ     1    0.0    8.99   8.99 
+   8.99   01 DAMAGED/UNSALEABLE          0.00  102827 
0080948 EXPRESS FANCY GREENS      00007OZ     6    0.0    2.99  17.94 
+  17.94   01 DAMAGED/UNSALEABLE          0.00  103128 
0321855 CLAMSHL HYDRO BOSTON      00COUNT     1    0.0    1.99   1.99 
+   1.99   01 DAMAGED/UNSALEABLE          0.00  104040 
0058309 12OZ MONTEREY MROOM       00012OZ     1    0.0    2.29   2.29 
+   2.29   01 DAMAGED/UNSALEABLE          0.00  105524 
0058309 12OZ MONTEREY MROOM       00012OZ     1    0.0    2.29   2.29 
+   2.29   01 DAMAGED/UNSALEABLE          0.00  105524 
0084448 10OZ SPINACH PACK 12      00010OZ     1    0.0    1.69   1.69 
+   1.69   01 DAMAGED/UNSALEABLE          0.00  107505 

        REASON CODE TOTAL:                                            
+  35.19                                  0.00

              DEPT TOTALS:                                            
+  35.19                                  0.00


^LSTORE 002
===============
DEPT:    PRODUCE

0084508 2LB STRAWBERRIES          00002LB    20    0.0    3.69  73.80 
+  73.80   01 DAMAGED/UNSALEABLE          0.00  101224     



DEPT:    PRODUCE

                                                               EXTEND 
+ MRKDWN   REASON                  EXT. MRKDWN
ITEM  DESCRIPTION                   SIZE    QTY    WGT  RETAIL RETAIL 
+ RETAIL   CD DESCRIPTION                LOSS   VENDOR

What I am trying to get rid of is the second and any later occurrences of the department name within a store. The first instance after the store name is fine, but the others are not. My thought is to split the file into records based on the form feed character, then work on each record. So the initial code would be:

#!/usr/bin/perl -w
use strict;

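open my $IN,"<","ISC001" or die "Cannot open ISC001: $!\n";
open my $OUT,">","ISC-OUT2" or die "Cannot open ISC-OUT2: $!\n";

# "\f" is the form feed character (shown as ^L in the report);
# the string "^L" would be a literal caret followed by an L.
$/="\f";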

while(<$IN>){
  ......
}

close $IN;
close $OUT;

One thing that is consistent throughout the file is that the occurrences of the department name I need to remove come right before the header lines, so I'm guessing a regex similar to:

$_ =~ s|DEPT:\s+PRODUCE\n{2,}(\s{63}EXTEND\s+MRKDWN\s+REASON\s+EXT\.\s+MRKDWN\n)|$1|g;

would do what I need. Am I heading in the right direction with this guess, or am I going in the wrong direction?
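
For comparison, here is a minimal sketch of the per-record idea that does not depend on the exact header layout. It is a different technique from the regex above: it walks each record line by line, keeps the first `DEPT:` line for each department name, and drops any repeat along with the blank lines right after it. The helper name `strip_repeated_dept` and the shortened sample data (with `HEADER` standing in for the real column headings) are made up for illustration, and it assumes the stores are separated by a real form feed character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: within one record (one store), keep the first
# "DEPT:" line per department and drop later repeats, together with
# the blank lines that immediately follow a dropped line.
sub strip_repeated_dept {
    my ($record) = @_;
    my %seen;             # department names already printed in this record
    my $skip_blank = 0;   # true right after a repeated DEPT line was dropped
    my $out = '';

    for my $line (split /^/, $record) {      # split into lines, keeping "\n"
        if ($line =~ /^DEPT:\s+(.*?)\s*$/) {
            if ($seen{$1}++) {               # repeat: drop it
                $skip_blank = 1;
                next;
            }
        }
        elsif ($skip_blank && $line =~ /^[ \t]*$/) {
            next;                            # blank line after a dropped DEPT
        }
        $skip_blank = 0;
        $out .= $line;
    }
    return $out;
}

# Made-up sample standing in for ISC001; "\f" separates the stores.
my $report = "\fSTORE 001\nDEPT:    PRODUCE\nHEADER\nitem\n"
           . "\fSTORE 002\nDEPT:    PRODUCE\n\nitem\n\nDEPT:    PRODUCE\n\nHEADER\n";

open my $in, '<', \$report or die "Cannot open in-memory report: $!\n";
{
    local $/ = "\f";                         # read one store per record
    while (my $record = <$in>) {
        print strip_repeated_dept($record);
    }
}
close $in;
```

Because `%seen` is local to each call, the first `DEPT:` line of every store survives, and the blank-line test uses `[ \t]*` rather than `\s*` so that the trailing form feed of a record is never swallowed.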

TStanley
--------
People sleep peaceably in their beds at night only because rough men stand ready to do violence on their behalf. -- George Orwell

In reply to Keeping the first occurence of a pattern, and removing the other occurences by TStanley
