akm2 has asked for the wisdom of the Perl Monks concerning the following question:
This node falls below the community's threshold of quality. You may see it by logging in.
Re: Converting fixed record length files to pipe delimited
by agoth (Chaplain) on Feb 19, 2001 at 20:26 UTC
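open(OUTFILE, ">$file") or die "Can't open $file: $!\n";
while (<FILE>) {
    chomp;
    my @ary = unpack('A35 A30 A15', $_);
    my @tmp = @ary[0..2];   # this template yields three fields; slice the ones you want
    print OUTFILE join('|', @tmp), "\n";
}
close OUTFILE;
close FILE;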
Re: Converting fixed record length files to pipe delimited
by davorg (Chancellor) on Feb 19, 2001 at 20:30 UTC
open(INFILE, $file) or die "Can't open $file: $!\n";
open(OUTFILE, ">$outfile") or die "Can't open $outfile: $!\n";
# widths of the cols
my @cols = qw(3 50 7 7 7 7 6 6 6 6 55 4 41 6 23 7 169);
# build unpack format
my $fmt = join '', map { "A$_" } @cols;
# column names
my @col_names = qw(mfg model spec1 spec2 spec3 spec4 opt1
opt2 opt3 opt4 desc qoh trash1 mult
trash2 list trash3);
while (<INFILE>) {
my %rec;
@rec{@col_names} = unpack $fmt, $_;   # hash slice: one value per column name
next if $rec{model} eq 'COMPONENT PARTS';
print OUTFILE join('|', @rec{@col_names}), "\n";
}
Which looks a bit simpler than your version :)
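For anyone who wants to try the hash-slice idea without the real data file, here is a minimal sketch on an in-memory record. The three widths, the column names, and the record_to_pipe helper are invented for the example; they are not the layout from the original question.

```perl
use strict;
use warnings;

# Hypothetical layout: widths and names are made up for this example.
my @cols      = (3, 10, 4);
my @col_names = qw(mfg model qoh);
my $fmt       = join '', map { "A$_" } @cols;   # builds "A3A10A4"

sub record_to_pipe {
    my ($line) = @_;
    my %rec;
    @rec{@col_names} = unpack $fmt, $line;      # hash slice: one value per name
    return join '|', @rec{@col_names};
}

# Build a padded fixed-width record, then convert it.
my $sample = sprintf '%-3s%-10s%-4s', 'ACM', 'Widget', '0042';
print record_to_pipe($sample), "\n";
```

The 'A' format drops the trailing padding from each field, so the joined output carries no stray spaces.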
--
<http://www.dave.org.uk>
"Perl makes the fun jobs fun
and the boring jobs bearable" - me
Re: Converting fixed record length files to pipe delimited
by arturo (Vicar) on Feb 19, 2001 at 20:33 UTC
Basic technique: read the records in, trim them, then use join to generate the output form. Probably your best bet is to read in each line, put all the 'keeper' fields into an array, trim each element, and then print the joined array out to a file. Here's one relatively easily grokkable way to do it:
# for each line,
# get fields you're keeping, put them into @fields
# in the proper order
foreach my $field (@fields) {
    $field =~ s/^\s*(.*?)\s*$/$1/; # trim whitespace -- but beware!
    # two-command version (see the FAQ in perlfaq)
    # $field =~ s/^\s+//;
    # $field =~ s/\s+$//;
}
print OUTPUTFILEHANDLE join("|", @fields), "\n";
# now process the next line
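A runnable sketch of the trim-then-join step, using the two-substitution form. The trim_and_join name and the sample values are only for illustration.

```perl
use strict;
use warnings;

# Trim leading and trailing whitespace from every field, then join with '|'.
sub trim_and_join {
    my @fields = @_;              # work on a copy of the arguments
    for my $field (@fields) {
        $field =~ s/^\s+//;       # leading whitespace
        $field =~ s/\s+$//;       # trailing whitespace
    }
    return join '|', @fields;
}

print trim_and_join('  abc ', 'de  ', ' f'), "\n";
```

The loop variable aliases each array element, so the substitutions change @fields in place without any index bookkeeping.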
HTH
Philosophy can be made out of anything. Or less -- Jerry A. Fodor
(boo) Re: Converting fixed record length files to pipe delimited
by boo_radley (Parson) on Feb 19, 2001 at 20:45 UTC
This is the part unpack was born to play. It'll let you define field lengths, and it returns the fields as a list, which you can stuff into an array.
I'm taking a guess as to what the field names might be, and what the field lengths would be; it'd have been more useful to know those than to actually see the text ;)
If you wanted just the first 5 fields, and to print the first, third, and second, in that order:
($part_code, $part_size, $part_size2, $quantity, $cryptic_field) =
    unpack("A54 A7 A7 A10 A5", $line_from_file);
print OUT join("|", $part_code, $part_size2, $part_size), "\n";
Make sense?
Re: Converting fixed record length files to pipe delimited
by tadman (Prior) on Feb 19, 2001 at 20:45 UTC
Yikes! That code didn't get wrapped for some reason, and
it's throwing the navigation table way off kilter. Anyway.
The code you posted is really Perl 4 style, with a whole
whack of arrays instead of the Perl 5 style Array of Arrays
(or AoA as you will hear more often). AoA is a much easier
way to implement what you have done. Easier is better, no?
I would define your input file format, first, in a structure,
and then write a loop to use this information to re-parse
the file. Consider making an array that has only the start
positions of each of the fields:
# Define the format of the file
my (@file_format) = (
0,
3,
53,
60,
67,
74,
# etc.
241+169, # Last position, presumably
);
Now the length of each field $n, for substr() purposes, at
least, is simply $file_format[$n+1] -
$file_format[$n]. Note that
the last entry in the table shouldn't be used, that is,
$n should only go as high as $#file_format-1.
Now you can put each line into an array as you read it in,
and then write it to a file straight away. Just open both
files at the same time using two different filehandles,
such as IN_FILE and OUT_FILE. You are putting
your data into temporary arrays, but the data is only
used exactly once, so there is no need to keep it around.
my (@field_data);
for (my $i = 0; $i < $#file_format; $i++)
{
$field_data[$i] = substr($_,
$file_format[$i],
$file_format[$i+1]-
$file_format[$i]);
# Clean up as required, by trimming
$field_data[$i] =~ s/\s+$//;
}
print OUT_FILE join('|', @field_data), "\n";
If you want, you can use unpack instead, but apart from
stylistic differences, there is no real point unless you
need maximum speed (e.g. for 5-million-line files, or what
have you).
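If you do go the unpack route, the same table of start positions can generate the template, since each width is just the difference between neighbouring entries. A small sketch, with a made-up four-entry table and a hypothetical split_record helper:

```perl
use strict;
use warnings;

# Hypothetical start positions; the last entry marks the end of the record.
my @file_format = (0, 3, 13, 17);

# Width of field n is start[n+1] - start[n]; the last entry has no field of its own.
my @widths   = map { $file_format[$_ + 1] - $file_format[$_] } 0 .. $#file_format - 1;
my $template = join ' ', map { "A$_" } @widths;   # "A3 A10 A4"

sub split_record {
    my ($line) = @_;
    return unpack $template, $line;
}

my $line = sprintf '%-3s%-10s%-4s', 'ACM', 'Widget', '42';
print join('|', split_record($line)), "\n";
```

Either way the file format lives in one place, so a change to the layout only touches the table.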
Re: Converting fixed record length files to pipe delimited
by unixwzrd (Beadle) on Feb 20, 2001 at 06:32 UTC
I had a similar situation where I had a fixed length file generated on
a mainframe. I actually used this to do some edits and insert rows
into an Oracle database, but I've shortened it a bit here so that it
simply joins each record with a "pipe":
#!/usr/bin/perl
use strict;
my @record_layout = qw(
state_code
place_code
state_alpha_code
class_code
place_name
county_code
county_name
zip_code
);
my %field_types = (
state_code => 'A2',
place_code => 'A5',
state_alpha_code => 'A2',
class_code => 'A2',
place_name => 'A52',
county_code => 'A3',
county_name => 'A22',
zip_code => 'A5'
);
my %fips_data;
my $fips_template = join(" ", @field_types{@record_layout});
while (my $fips_line = <>) {
    chomp $fips_line;
    @fips_data{@record_layout} = unpack($fips_template, $fips_line);
    next if $fips_data{'state_code'} eq '52';   # string compare: the codes are text
    print STDOUT join('|', @fips_data{@record_layout}), "\n";
}
Update: This post and its follow-up keep getting panned.
It would be nice to get some constructive criticism rather than
watching the numbers continue to fall on this, after all I
would like to know what's wrong or could be done better so
I can grow as a Perl programmer.
Thanks,
Mike
"The two most common elements in the universe are hydrogen... and stupidity."
Harlan Ellison
Oh, one other thing I forgot to mention, I was only using "A"
data types, but this method would work for any type of fixed
records with binary or other embedded data types in it, just
simply change the field types for the record layout...
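As a rough illustration of that point, here is the same layout-plus-types approach with two binary fields mixed in. The record layout is invented for the example: a 4-character code followed by 16-bit and 32-bit big-endian unsigned integers ('n' and 'N' in pack notation).

```perl
use strict;
use warnings;

# Hypothetical record: text code, then two big-endian unsigned integers.
my @record_layout = qw(code count total);
my %field_types   = (code => 'A4', count => 'n', total => 'N');
my $template      = join ' ', @field_types{@record_layout};   # "A4 n N"

# Build a sample 10-byte record so the example needs no input file.
my $raw = pack $template, 'AB', 513, 70000;

my %rec;
@rec{@record_layout} = unpack $template, $raw;
print join('|', @rec{@record_layout}), "\n";
```

Only %field_types changed; the slice, the unpack call, and the join are the same as in the text-only version.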
Mike
OffTopic: Formatting Tags
by stefan k (Curate) on Feb 19, 2001 at 20:19 UTC
PLEASE,
be careful with those <pre> tags and use <code> tags to format your code.
It would be useful, if you told us what exactly is not working.
A quick glance at your code shows that it is written in a somewhat
un-Perlish style ;-)
Maybe you should make yourself comfortable with the split() function. You could do something
like
($field1, $field2, undef, undef ...) = split /\s+/, $line;
which discards the undef parts and lets you use the
parts you wanted to have....
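A short runnable version of that idea, with an invented sample line. Note the assumption baked into it: splitting on whitespace only works when no field is blank and no field contains a space, which fixed-length records often do not guarantee.

```perl
use strict;
use warnings;

my $line = "alpha beta gamma delta";

# An undef on the left-hand side discards the value in that position.
my ($field1, $field2, undef, $field4) = split /\s+/, $line;

print join('|', $field1, $field2, $field4), "\n";
```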