PerlMonks  

Converting fixed record length files to pipe delimited

by akm2 (Scribe)
on Feb 19, 2001 at 20:07 UTC ( [id://59390]=perlquestion )

akm2 has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: Converting fixed record length files to pipe delimited
by agoth (Chaplain) on Feb 19, 2001 at 20:26 UTC
    ditto the above comment, use code tags!!,

    One solution to your problem:

    • use unpack to get your data out into an array
    • slice the array to discard the values you don't want
    • join the array with pipes
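        # FILE is assumed to be open on the input file already
        open(OUTFILE, ">$file") or die "Can't open $file: $!\n";
        while (<FILE>) {
            chomp;
            my @ary = unpack('A35 A30 A15', $_);   # cut the record into fixed-width fields
            my @tmp = @ary[0, 1];                  # keep only the fields you want
            print OUTFILE join('|', @tmp), "\n";
        }
        close OUTFILE;
        close FILE;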
Re: Converting fixed record length files to pipe delimited
by davorg (Chancellor) on Feb 19, 2001 at 20:30 UTC

    I'd do it something like this (trying to reconstruct the spec by reverse engineering your script):

        open(INFILE, $file) or die "Can't open $file: $!\n";
        open(OUTFILE, ">$outfile") or die "Can't open $outfile: $!\n";

        # widths of the cols
        my @cols = qw(3 50 7 7 7 7 6 6 6 6 55 4 41 6 23 7 169);

        # build unpack format
        my $fmt = join '', map { "A$_" } @cols;

        # column names
        my @col_names = qw(mfg model spec1 spec2 spec3 spec4 opt1 opt2 opt3 opt4
                           desc qoh trash1 mult trash2 list trash3);

        while (<INFILE>) {
            chomp;
            my %rec;
            @rec{@col_names} = unpack $fmt, $_;          # hash slice: one key per column
            next if $rec{model} eq 'COMPONENT PARTS';    # string comparison, not assignment
            print OUTFILE join('|', @rec{@col_names}), "\n";
        }

    Which looks a bit simpler than your version :)
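
    The hash-slice assignment above can be tried without any files. A minimal sketch, assuming a made-up three-column layout (the column names, widths, sample record, and the helper to_pipes are illustrative only, not the real file spec):

```perl
use strict;
use warnings;

# Hypothetical layout: three columns, 3 + 10 + 5 characters wide.
my @col_names = qw(mfg model qoh);
my @cols      = qw(3 10 5);

# Build the unpack template from the widths.
my $fmt = join '', map { "A$_" } @cols;

# Parse one fixed-width record into a hash and return it pipe-delimited.
sub to_pipes {
    my ($line) = @_;
    my %rec;
    @rec{@col_names} = unpack $fmt, $line;   # hash slice: one key per column
    return join '|', @rec{@col_names};
}

# Made-up sample record, padded with spaces to the column widths.
my $sample = sprintf '%-3s%-10s%-5s', 'ACM', 'WIDGET', '42';
print to_pipes($sample), "\n";
```

    Because the "A" format drops the trailing padding, the fields come out already trimmed.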

    --
    <http://www.dave.org.uk>

    "Perl makes the fun jobs fun
    and the boring jobs bearable" - me

Re: Converting fixed record length files to pipe delimited
by arturo (Vicar) on Feb 19, 2001 at 20:33 UTC

    Basic Technique: read the records in, trim them, then use join to generate the output form. Probably your best bet is to read in each line, put all the 'keeper' fields into an array, then loop through the array and print the joined array out to a file. Here's one, relatively easily grokkable way to do it:

        # for each line,
        #   get fields you're keeping, put them into @fields
        #   in the proper order
        foreach my $field (@fields) {
            $field =~ s/^\s*(.*?)\s*$/$1/;   # trim whitespace -- but beware!
            # two-command version (see the FAQ in perlfaq)
            # $field =~ s/^\s*//;
            # $field =~ s/\s*$//;
        }
        print OUTPUTFILEHANDLE join("|", @fields), "\n";
        # now process the next line

    HTH

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

(boo) Re: Converting fixed record length files to pipe delimited
by boo_radley (Parson) on Feb 19, 2001 at 20:45 UTC
    This is the part unpack was born to play. It'll let you define field lengths and stuff the resulting fields into an array. I'm taking a guess as to what the field names might be, and what the field lengths would be; it'd have been more useful to know those than to actually see the text ;)
    If you wanted just the first five fields, and then to print the first, third and second, in that order:
        ($part_code, $part_size, $part_size2, $quantity, $cryptic_field) =
            unpack("A54 A7 A7 A10 A5", $line_from_file);
        print OUT join("|", $part_code, $part_size2, $part_size), "\n";
    make sense?
Re: Converting fixed record length files to pipe delimited
by tadman (Prior) on Feb 19, 2001 at 20:45 UTC
    Yikes! That code didn't get wrapped for some reason, and it's throwing the navigation table way off kilter. Anyway.

    The code you posted is really Perl 4 style, with a whole whack of arrays instead of the Perl 5 style Array of Arrays (or AoA as you will hear more often). AoA is a much easier way to implement what you have done. Easier is better, no?

    I would define your input file format, first, in a structure, and then write a loop to use this information to re-parse the file. Consider making an array that has only the start positions of each of the fields:
        # Define the format of the file
        my (@file_format) = (
            0, 3, 53, 60, 67, 74,
            # etc.
            241+169,   # Last position, presumably
        );
    Now the length of each field $n, for substr() purposes, at least, is simply $file_format[$n+1] - $file_format[$n]. Note that the last entry in the table shouldn't be used, that is, $n should only go as high as $#file_format-1.
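
    As a quick check of that arithmetic, here is a small sketch that derives the field widths from a list of start positions. Only the first few offsets from the table above are used, so the final entry (74) stands in as the end marker:

```perl
use strict;
use warnings;

# Start positions of each field, plus one final entry marking the end.
my @file_format = (0, 3, 53, 60, 67, 74);

# Width of field $n is the distance to the next start position.
# $n stops at $#file_format - 1 because the last entry has no successor.
my @widths = map { $file_format[$_ + 1] - $file_format[$_] }
             0 .. $#file_format - 1;

print "@widths\n";   # 3 50 7 7 7
```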

    Now you can put each line into an array as you read it in, and then write it to a file straight away. Just open both files at the same time using two different filehandles, such as IN_FILE and OUT_FILE. You are putting your data into temporary arrays, but since the data is only used exactly once, there is no need to keep it around: each line can be written out as soon as it has been parsed.
        my (@field_data);
        for (my $i = 0; $i < $#file_format; $i++) {
            $field_data[$i] = substr($_, $file_format[$i],
                                     $file_format[$i+1] - $file_format[$i]);
            # Clean up as required, by trimming
            $field_data[$i] =~ s/\s+$//;
        }
        print OUT_FILE join('|', @field_data), "\n";
    If you want, you can use unpack instead, but apart from stylistic differences, there is no real point unless you need maximum speed (e.g. for 5-million-line files, or what have you).
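
    For comparison, a rough sketch of the unpack variant driven by the same kind of start-position table. The offsets, the helper names parse_substr and parse_unpack, and the sample record are all made up for illustration:

```perl
use strict;
use warnings;

# Start positions, with the last entry marking the end of the record.
my @file_format = (0, 3, 13, 18);

# Turn the offsets into an unpack template such as "A3 A10 A5".
my $template = join ' ',
    map { 'A' . ($file_format[$_ + 1] - $file_format[$_]) }
    0 .. $#file_format - 1;

# substr-based parse: cut each field out, then trim trailing whitespace.
sub parse_substr {
    my ($line) = @_;
    my @fields;
    for my $i (0 .. $#file_format - 1) {
        $fields[$i] = substr($line, $file_format[$i],
                             $file_format[$i + 1] - $file_format[$i]);
        $fields[$i] =~ s/\s+$//;
    }
    return @fields;
}

# unpack-based parse: the "A" format strips the trailing padding itself.
sub parse_unpack {
    my ($line) = @_;
    return unpack $template, $line;
}

my $line = sprintf '%-3s%-10s%-5s', 'ACM', 'WIDGET', '42';
print join('|', parse_substr($line)), "\n";
print join('|', parse_unpack($line)), "\n";
```

    Both calls should print the same pipe-delimited line, so the choice between them is mostly a matter of taste.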
Re: Converting fixed record length files to pipe delimited
by unixwzrd (Beadle) on Feb 20, 2001 at 06:32 UTC
    I had a similar situation where I had a fixed length file generated on a mainframe. I actually used this to do some edits and inserted rows into an Oracle database, but I've shortened it a bit here and used joining the record with a "pipe":
        #!/usr/bin/perl
        use strict;

        # field names, in the order they appear in each record
        my @record_layout = qw(
            state_code place_code state_alpha_code class_code
            place_name county_code county_name zip_code
        );

        # unpack format for each field
        my %field_types = (
            state_code       => 'A2',
            place_code       => 'A5',
            state_alpha_code => 'A2',
            class_code       => 'A2',
            place_name       => 'A52',
            county_code      => 'A3',
            county_name      => 'A22',
            zip_code         => 'A5'
        );

        my %fips_data;
        my $fips_template = join(" ", @field_types{@record_layout});

        while (my $fips_line = <>) {
            chomp $fips_line;
            @fips_data{@record_layout} = unpack($fips_template, $fips_line);
            next if $fips_data{'state_code'} == 52;
            print STDOUT join('|', @fips_data{@record_layout}), "\n";
        }
    Update: This post and its follow-up keep getting panned. It would be nice to get some constructive criticism rather than watching the numbers continue to fall on this, after all I would like to know what's wrong or could be done better so I can grow as a Perl programmer.

    Thanks,
    Mike

    "The two most common elements in the universe are hydrogen... and stupidity."
    Harlan Ellison

      Oh, one other thing I forgot to mention, I was only using "A" data types, but this method would work for any type of fixed records with binary or other embedded data types in it, just simply change the field types for the record layout...
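
      To illustrate, a small sketch with a made-up record that mixes text and binary fields. The layout (a 4-character code, then a 16-bit and a 32-bit big-endian unsigned integer) is an assumption for the example and has nothing to do with the FIPS file:

```perl
use strict;
use warnings;

# Hypothetical mixed layout: text code, 16-bit count, 32-bit id.
my @record_layout = qw(code count id);
my %field_types   = (
    code  => 'A4',   # space-padded text
    count => 'n',    # unsigned 16-bit integer, big-endian
    id    => 'N',    # unsigned 32-bit integer, big-endian
);
my $template = join ' ', @field_types{@record_layout};

# Build a sample binary record, then decode it with the same template.
my $raw = pack $template, 'AB', 258, 16909060;

my %data;
@data{@record_layout} = unpack $template, $raw;
print join('|', @data{@record_layout}), "\n";
```

      Only the values in %field_types change; the slice-and-join code stays the same.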

      Mike

      "The two most common elements in the universe are hydrogen... and stupidity."
      Harlan Ellison
OffTopic: Formatting Tags
by stefan k (Curate) on Feb 19, 2001 at 20:19 UTC
    PLEASE,
    be careful with those <pre> tags and use <code> tags to format your code.

    It would be useful if you told us what exactly is not working. A quick glance at your code shows a somewhat un-Perlish style ;-)

    Maybe you should get comfortable with the split() function. You could do something like

    ($field1, $field2, undef, undef ...) = split /\s+/, $line;

    which discards the undef parts and lets you use the parts you wanted to have....
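
    A small sketch of that idiom, with one caveat that is an assumption about the data rather than something stated above: split /\s+/ only lines up with the columns when no field is blank or contains embedded spaces, so for true fixed-width records unpack may be the safer tool. Both sample lines are made up:

```perl
use strict;
use warnings;

# Whitespace-separated line: undef placeholders skip the unwanted fields.
my $line = "ACM WIDGET 10 20 42";
my ($mfg, $model, undef, undef, $qoh) = split /\s+/, $line;
print join('|', $mfg, $model, $qoh), "\n";

# Fixed-width line whose second field contains a space:
# split breaks that field in two, unpack keeps it whole.
my $fixed     = sprintf '%-4s%-12s%-5s', 'ACM', 'BIG WIDGET', '42';
my @by_split  = split /\s+/, $fixed;
my @by_unpack = unpack 'A4 A12 A5', $fixed;
print scalar(@by_split), " fields via split, ",
      scalar(@by_unpack), " via unpack\n";
```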

    Regards Stefan K
