Re: Separating records

by johngg (Canon)
on Sep 22, 2010 at 22:08 UTC


in reply to Separating records

An alternative approach (if the data file is not too large) would be to slurp the whole data file into a scalar string in memory, then use split to break it into records at the point between a newline and five digits followed by a non-digit. You can then use grep and tr (see Transliteration in Quote and Quote-like Operators) to extract only those records that span more than one line, i.e. those where tr counts more than one newline.

use strict;
use warnings;

my $data = do {
    local $/;
    <DATA>;
};

my @records = split m{(?<=\n)(?=\d{5}\D)}, $data;
my @goodRecords = grep { tr{\n}{} > 1 } @records;

print @goodRecords;

__END__
00210 SHIFT PAY PRIV SYSOU
00211 SV-PROG OS SAVE
00215 OS MIGRATE SAVE
00217 DEM OS SUPER SAVE1
00219 DEM OS SUPER SAVE2
00221 DEM OS SUPER SAVE3
00901 DSDFIL
01401 PERISH STORE INV
    (V) INV810 BOB FERRANTE (EDIT REPORT IF 05309 PRSH INV) EXTRACT WAS RUN)
    (V) INV820 BOB FERRANTE (PERISHABLE INVENTORY REPORT)
    (VTD8) D:\DEPT\ACCT\MAIL\INV820.DAT
    (V) INV820C DIANE CALLAHAN (PERISHABLE INVENTORY-BAKERY) (QUARTERLY RUN ONLY)
01402 PERSH INV BOOKS
    (V) INV805 58 COPIES - BOB FERRANTE
    (V) INV805 2 COPIES - JIM MIAMIS
    (V) INV805A XTRA COPIES-SAVE IN COMPUTER ROOM
    ANNUAL STORE INVENTORY ONLY:
    (V) INV805 58 COPIES - USER
    (V) INV805 2 COPIES - USER
    (V) INV805A XTRA COPIES-SAVE IN COMPUTER ROOM
01403 BAKERY INV BOOKS
    (V) INV805 35 COPIES - USER
    (V) INV805A 5 COPIES - SAVE IN COMPUTER ROOM
    ANNUAL STORE INVENTORY ONLY:
    (J) INV805 35 COPIES - USER
    (J) INV805A 5 COPIES - SAVE IN COMPUTER ROOM
01405 PRSH INV. EXTRACT
    (V) MSI000 OPERATIONS DOCUMENTATION (MSI DUMP LISTING)
01501 INV PRICE GUIDE
01502 INV SLOTBOOK
    (V) CIO102 OPERATIONS SUPERVISOR (2 COPIES)
    (2 COPIES-IN BINDERS AND LEAVE WITH CODERS) PRICE GUIDES
01503 INV-DUPS
    THE FOLLOWING OUTPUT WILL ONLY BE PRODUCED IF DUPLICATE SLOTS ARE FOUND.
    (V) INV900 USER
    (V) INV969 USER

The output:

01401 PERISH STORE INV
    (V) INV810 BOB FERRANTE (EDIT REPORT IF 05309 PRSH INV) EXTRACT WAS RUN)
    (V) INV820 BOB FERRANTE (PERISHABLE INVENTORY REPORT)
    (VTD8) D:\DEPT\ACCT\MAIL\INV820.DAT
    (V) INV820C DIANE CALLAHAN (PERISHABLE INVENTORY-BAKERY) (QUARTERLY RUN ONLY)
01402 PERSH INV BOOKS
    (V) INV805 58 COPIES - BOB FERRANTE
    (V) INV805 2 COPIES - JIM MIAMIS
    (V) INV805A XTRA COPIES-SAVE IN COMPUTER ROOM
    ANNUAL STORE INVENTORY ONLY:
    (V) INV805 58 COPIES - USER
    (V) INV805 2 COPIES - USER
    (V) INV805A XTRA COPIES-SAVE IN COMPUTER ROOM
01403 BAKERY INV BOOKS
    (V) INV805 35 COPIES - USER
    (V) INV805A 5 COPIES - SAVE IN COMPUTER ROOM
    ANNUAL STORE INVENTORY ONLY:
    (J) INV805 35 COPIES - USER
    (J) INV805A 5 COPIES - SAVE IN COMPUTER ROOM
01405 PRSH INV. EXTRACT
    (V) MSI000 OPERATIONS DOCUMENTATION (MSI DUMP LISTING)
01502 INV SLOTBOOK
    (V) CIO102 OPERATIONS SUPERVISOR (2 COPIES)
    (2 COPIES-IN BINDERS AND LEAVE WITH CODERS) PRICE GUIDES
01503 INV-DUPS
    THE FOLLOWING OUTPUT WILL ONLY BE PRODUCED IF DUPLICATE SLOTS ARE FOUND.
    (V) INV900 USER
    (V) INV969 USER
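
If the file were too big to slurp comfortably, the same filter could probably be written as a line-by-line loop instead. What follows is only a minimal sketch of that idea, not the slurp-and-split code itself: the sub name multiline_records and the three-record sample are made up for illustration, and it assumes that any line starting with five digits and a non-digit begins a new record.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): read a filehandle
# line by line and return the records that span more than one line.
sub multiline_records {
    my ($fh) = @_;
    my ( @good, $record, $lines );

    # Keep the record collected so far only if it has more than one line.
    my $flush = sub {
        push @good, $record if defined $record && $lines > 1;
    };

    while ( my $line = <$fh> ) {
        if ( $line =~ m{^\d{5}\D} ) {    # a new record starts here
            $flush->();
            ( $record, $lines ) = ( q{}, 0 );
        }
        next unless defined $record;     # skip lines before the first record
        $record .= $line;
        $lines++;
    }
    $flush->();                          # the last record needs flushing too
    return @good;
}

# Small made-up sample, read through an in-memory filehandle.
my $sample = <<'END_SAMPLE';
00001 ONE LINE ONLY
00002 TWO LINES
    (V) CONTINUATION
00003 ONE LINE AGAIN
END_SAMPLE

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @good = multiline_records($fh);
close $fh;

print @good;
```

With the sample shown, this prints only record 00002 together with its continuation line. Unlike the split version, it drops any lines that appear before the first record, and it only ever holds one record in memory plus the ones it keeps.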

I hope this is useful.

Cheers,

JohnGG

Re^2: Separating records
by TStanley (Canon) on Sep 23, 2010 at 14:45 UTC
    EXCELLENT!! Thank you very much!

    TStanley
    --------
    People sleep peaceably in their beds at night only because rough men stand ready to do violence on their behalf. -- George Orwell
