
Re: Split lines in file to columns

by pgmer6809 (Sexton)
on Mar 24, 2019 at 22:09 UTC


in reply to Split lines in file to columns

Here is some code that makes use of a couple of very nice Perl features.

1) You can put your data after the __END__ token and then have a while loop read it in through the DATA filehandle, rather than putting it in an array in the source. That makes it easier to set up test cases.

2) It uses Perl's regex support to parse the line. This is more flexible than split, and it is a very common way for Perl programmers to parse text.
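To make the comparison in point 2 concrete, here is a minimal sketch that parses the same line both ways. The helper names fields_by_split and fields_by_regex and the sample line are made up for illustration. For input this simple both approaches give the same four fields; the regex version additionally lets you constrain what each field may look like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, with a limit of 4 so that
# everything after the third column stays together in the last field.
sub fields_by_split {
    my ($line) = @_;
    return split ' ', $line, 4;
}

# Hypothetical helper: the same four fields via capture groups.
# Returns an empty list if the line does not match.
sub fields_by_regex {
    my ($line) = @_;
    return $line =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/;
}

# Sample line (an assumption, modelled on the data in this thread).
my $line = "ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region";

my @s = fields_by_split($line);
my @r = fields_by_regex($line);

print "split: [$s[2]] [$s[3]]\n";
print "regex: [$r[2]] [$r[3]]\n";
```

Note that the description contains a space ("Supplier ID::Region"), which is why the split needs the limit argument.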

I have added a couple of lines to your original input to show that the value is not replaced if the description ends in something other than Region (say, HQ), and likewise if the description is Region but the original value is not XYZ. I am not sure that is exactly what you meant, but the concept should be useful.

#!/usr/bin/perl -w
while (<DATA>) {    # read a line into $_
    chomp;
    $_ =~ m/^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/;
    #        col1   col_b  xyz    Descr
    my ($col1, $data_b, $xyz, $description) = ($1, $2, $3, $4);
    if ( ( $description =~ m/Region/ ) && ( $xyz eq "XYZ" ) ) {
        $xyz = "N/A";
    }
    print "Line $. = $_\n";
    print "\tXYZ result = $xyz \n";
}    # end while DATA
exit 0;    # normal termination
__END__
ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
ValuesInColumn1 DataColumnB XYZ RowDescription at RowCode
ValuesInColumn1 DataColumnB ABC RowDescription at RowCode
ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::HQ
ValuesInColumn1 DataColumnB BCD RowDescription|RowCode|Supplier ID::Region

The result of running the above is:

Line 1 = ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
    XYZ result = N/A
Line 2 = ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
    XYZ result = N/A
Line 3 = ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::Region
    XYZ result = N/A
Line 4 = ValuesInColumn1 DataColumnB XYZ RowDescription at RowCode
    XYZ result = XYZ
Line 5 = ValuesInColumn1 DataColumnB ABC RowDescription at RowCode
    XYZ result = ABC
Line 6 = ValuesInColumn1 DataColumnB XYZ RowDescription|RowCode|Supplier ID::HQ
    XYZ result = XYZ
Line 7 = ValuesInColumn1 DataColumnB BCD RowDescription|RowCode|Supplier ID::Region
    XYZ result = BCD

Re^2: Split lines in file to columns
by haukex (Archbishop) on Mar 24, 2019 at 22:26 UTC
    You can put your data after the _END_ statement

    Normally, one would use the __DATA__ token for this purpose - see Special Literals in perldata.

    Update: Another nitpick: The special variables $1 etc. should only be used if the match succeeds. And the two lines could be shortened to:

        my ($col1, $data_b, $xyz, $description) = /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/
            or die "Failed to parse: $_";
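As a rough sketch of the point about checking the match: the variant below skips lines that do not parse instead of dying. The helper name parse_line and the two sample lines are made up for illustration. It relies on a failed match in list context returning an empty list, so the list assignment evaluates as false in the if condition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the four captured fields, or an empty
# list when the line does not have the expected shape.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/;
    return @fields;
}

# Sample input (an assumption): one well-formed line, one malformed line.
for my $line ("a b XYZ Region", "too short") {
    # A list assignment in scalar context yields the number of values on
    # the right-hand side, so this is false when parse_line returns nothing.
    if (my ($col1, $data_b, $xyz, $description) = parse_line($line)) {
        print "parsed: xyz=$xyz, description=$description\n";
    }
    else {
        warn "skipping unparseable line: $line\n";
    }
}
```

Whether to die or to warn and continue depends on whether malformed input should stop the run; the one-liner above dies, this sketch carries on.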
