PerlMonks  

Greetings fellow Monks!

I have been tasked with creating a CSV file that will be imported into a new document archiving system. After a lot of hacking, I have reduced a 100MB text file to just under 1MB by removing the unneeded parts and formatting what was left. Below is a sample of what I am currently working with:

1401,PERISH STORE INV,Quarterly,(V) INV810 USER NAME (EDIT REPORT IF 0
+5309 PRSH INV),EXTRACT WAS RUN),(V) INV820 USER NAME (PERISHABLE INVE
+NTORY REPORT),(VTD8) D:\DEPT\ACCT\MAIL\INV820.DAT,(V) INV820C USER NA
+ME (PERISHABLE INVENTORY-BAKERY),(QUARTERLY RUN ONLY)
1402,PERSH INV BOOKS,Quarterly,(V) INV805 58 COPIES - USER NAME,(V) IN
+V805 2 COPIES - USER NAME,(V) INV805A XTRA COPIES-SAVE IN COMPUTER RO
+OM,ANNUAL STORE INVENTORY ONLY:,(V) INV805 58 COPIES - USER NAME,(V)
+INV805 2 COPIES - USER NAME,(V) INV805A XTRA COPIES-SAVE IN COMPUTER
+ROOM
1403,BAKERY INV BOOKS,Quarterly,(V) INV805 35 COPIES - USER NAME,(V) I
+NV805A 5 COPIES - SAVE IN COMPUTER ROOM,ANNUAL STORE INVENTORY ONLY:,
+(J) INV805 35 COPIES - USER NAME,(J) INV805A 5 COPIES - SAVE IN COMPU
+TER ROOM
1501,INV PRICE GUIDE,As Needed,(V) MSC052 BOTH COPIES ARE TO BE PUT IN
+ BINDERS AND,AND LEFT ON THE CABINETS BEHIND THE DP,SECRETARY'S DESK

What I need to accomplish from the above is to break out the fourth field and every field after it, so that each one is on its own line, prefixed by the first three fields. The first record would then look something like:

1401,PERISH STORE INV,Quarterly,(V) INV810 USER NAME (EDIT REPORT IF 0
+5309 PRSH INV)
1401,PERISH STORE INV,Quarterly,(EXTRACT WAS RUN)
1401,PERISH STORE INV,Quarterly,(V) INV820 USER NAME (PERISHABLE INVEN
+TORY REPORT)
1401,PERISH STORE INV,Quarterly,(VTD8) D:\DEPT\ACCT\MAIL\INV820.DAT
1401,PERISH STORE INV,Quarterly,(V) INV820C USER NAME (PERISHABLE INVE
+NTORY-BAKERY)(QUARTERLY RUN ONLY)

As you can see, what makes this a challenge is that each line is a different size depending upon how many reports there are to be distributed, as well as the number of people that the reports go to.

My plan is to take each line of the CSV file and split it into an array, with the 4th element being a reference to another array which would contain all of the remaining fields. How would I go about creating the output CSV file to look like my desired output? My code so far is below:

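#!C:\Perl\bin\perl
use strict;
use warnings;

# Open the source CSV for reading and the result file for writing.
open INPUT,  '<', "input.csv"  or die "Can not open input.csv: $!\n";
open OUTPUT, '>', "output.csv" or die "Can not open output.csv: $!\n";

while (<INPUT>) {
    chomp;                           # keep the newline out of the last field
    my @line  = split /,/;
    my @line2 = @line[0..2];         # first three fields
    my @dist  = @line[3..$#line];    # fourth field onward
    my $ref   = \@dist;
    push @line2, $ref;
    ## Print to OUTPUT
}

close INPUT;
close OUTPUT;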
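One possible way to fill in the printing step is sketched below. It is only an illustration under a few assumptions: `expand_record` is a hypothetical helper name, not something from the post; the fields are assumed never to contain quoted or embedded commas (if they can, a real CSV parser such as Text::CSV would be safer than a plain `split`); and the sample record is the 1501 line from above with its `+` wrap marker joined back together. The nested array reference is not strictly needed for this output, so the sketch keeps the trailing fields in a plain array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): take one input record
# and return one output line per distribution entry, each prefixed with
# the first three fields of that record.
# Assumption: fields never contain embedded or quoted commas.
sub expand_record {
    my ($record) = @_;
    my ($id, $name, $freq, @dist) = split /,/, $record;
    return map { join ',', $id, $name, $freq, $_ } @dist;
}

# Demo on a single in-memory record instead of input.csv.
my $sample = "1501,INV PRICE GUIDE,As Needed,"
           . "(V) MSC052 BOTH COPIES ARE TO BE PUT IN BINDERS AND,"
           . "AND LEFT ON THE CABINETS BEHIND THE DP,"
           . "SECRETARY'S DESK";

print "$_\n" for expand_record($sample);
```

Inside the `while` loop of the script above, the same idea would amount to something like `print OUTPUT "$_\n" for expand_record($_);` after the `chomp`. For the sample record this produces three lines, one per trailing field, each starting with `1501,INV PRICE GUIDE,As Needed,`.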

TStanley
--------
People sleep peaceably in their beds at night only because rough men stand ready to do violence on their behalf. -- George Orwell

In reply to Generating an output file by TStanley
