PerlMonks  

Removing specific array elements

by phamalda (Novice)
on Jan 29, 2016 at 22:15 UTC

phamalda has asked for the wisdom of the Perl Monks concerning the following question:

First and foremost, thank you guys for the unending information. Great stuff! I am working on a script that splits a line into an array delimited by tildes. I need to remove a couple of those resulting elements if possible.

I would like to remove them by position rather than by matching them with the regular expression shown below. To make this more challenging, I need to remove elements 30 through 38, in the middle of the array, rather than elements at either end. Below is the piece of code that currently does the removal:

my @field = split("~", $line);
if ($field[11] eq '000015') {
    $line =~ s/~~~~~~~~~10~/~10~/g;
    push(@newlines,$line);
}

These lines are quite long, and there is a possibility that the same sequence might be matched later in the string, which would be bad. Any help would be greatly appreciated.

Replies are listed 'Best First'.
Re: Removing specific array elements
by 1nickt (Canon) on Jan 29, 2016 at 22:25 UTC

    "I need to remove elements 30 through 38"

    splice
    Removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any.

    update: cite docs
    The way forward always starts with a minimal test.
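A minimal sketch of the splice approach, not code from the thread. The record here is made up: 30 filler fields, then a run of empty fields, then data. In the sample lines posted later in the thread the empty run is eight fields starting at index 30, so the sketch assumes OFFSET 30 and LENGTH 8; adjust both if the real layout differs.

```perl
use strict;
use warnings;

# Made-up record: 30 filler fields, 8 empty fields, then two data fields.
my $line = join '~', ('x') x 30, ('') x 8, 10, 0.279;

# A limit of -1 makes split keep trailing empty fields.
my @field = split /~/, $line, -1;

# Remove 8 elements starting at index 30; splice returns what it removed.
my @removed = splice @field, 30, 8;

my $result = join '~', @field;
print "$result\n";
```

Because splice works on positions, a similar run of tildes later in the line is left alone.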
Re: Removing specific array elements
by GrandFather (Saint) on Jan 29, 2016 at 23:18 UTC

    Unless you input these strings in this form, output them in the same form, and this is the only manipulation you need to do, I'd strongly recommend you turn the string into an array, where dealing with elements by index is natural.

    Perl has powerful ways of manipulating lists including splice, grep, map and array slices. Performing significant array manipulation on a string will make your code hard to read and hard to maintain.
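As a rough illustration of the list tools named above, here is a small sketch on made-up data (the ten letters stand in for real fields). It shows an array slice, grep over indices, and map; none of it is taken from the poster's script.

```perl
use strict;
use warnings;

my @field = ('a' .. 'j');    # made-up data, indices 0..9

# Array slice: keep everything except indices 3..5.
my @sliced = @field[ 0 .. 2, 6 .. $#field ];

# grep over the indices: the same selection written as a filter.
my @kept = @field[ grep { $_ < 3 || $_ > 5 } 0 .. $#field ];

# map: transform every remaining element.
my @upper = map { uc $_ } @sliced;

print "@sliced\n@kept\n@upper\n";
```

Unlike splice, the slice and grep forms leave @field itself untouched and build a new list.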

    However, in this case it may be that a hash is a better choice. Using 'field' for your variable name implies that the elements have significance related to their original position. By removing elements you mess up that mapping and make it hard to track what is where in the list. It is better to name the fields by using a hash and then manipulate fields by name. Show us some of the bigger problem and we can help show you how that goes.

    Premature optimization is the root of all job security
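One way the hash suggestion could look, as a sketch only. The column names below are invented for illustration, since the thread never says what the columns mean; the data is the start of one of the sample lines posted later in the thread.

```perl
use strict;
use warnings;

# Hypothetical names for the first five columns (assumptions, not from the thread).
my @names = qw(record_type version sent_at account name);
my $line  = 'MEPMD01~20080519~03092015124400AM~6202820945~PPAY RES';

# Hash slice: pair each name with the field in the same position.
my %rec;
@rec{@names} = split /~/, $line, -1;

# Change a field by name instead of by index.
$rec{name} = 'REDACTED';

# Rebuild the line in the original column order.
my $out = join '~', @rec{@names};
print "$out\n";
```

Keeping @names as the single source of column order means fields can be dropped or reordered by editing that list rather than by counting indices.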
Re: Removing specific array elements
by stevieb (Canon) on Jan 29, 2016 at 23:51 UTC

    Show us a few lines of your input (in <code></code> tags as you've put your code into).

    It'd also help if you could share expected output data relative to the input.

      Thank you all for the quick replies. I apologize for the ambiguity in the first post and for the delayed response. Below is my entire piece of code:

      #!/usr/bin/perl -w
      use strict;      # Enforce good programming rules
      use warnings;    # Provides run-time warnings

      my $infile;      # Variable for input file
      my $outfile;     # Variable for output file
      my $filetype;    # Variable to store whether the file is Standard or Enhanced
      my @newlines;    # Array variable to hold the lines to be written out

      if (not defined $ARGV[0]) {
          die "Usage: ReplaceNewLine.pl <inputfile>\n";
      }
      else {
          $infile = $ARGV[0];
      }

      open(INFILE, $infile) || die "$infile not found in current location.\n";
      chomp(my @lines = <INFILE>);
      close(INFILE);

      # Place header from source file into $header variable
      my $header = $lines[0];
      my @field = split("~", $lines[1]);
      if ($field[14]) {
          $filetype = 'Enhanced';
      }
      else {
          $filetype = 'Standard';
      }
      push (@newlines, $header);

      if ($filetype eq 'Enhanced') {
          # Loop through each line in the @lines array.
          foreach my $line (@lines) {
              my @field = split("~", $line);
              if ($field[11] eq '000015') {
                  $line =~ s/$field[12]/92/g;
                  $line =~ s/~~~~~~~~~10~/~10~/g;
                  push(@newlines,$line);
              }
              if ($field[11] eq '000030') {
                  $line =~ s/$field[12]/46/g;
                  $line =~ s/~~~~~10~/~10~/g;
                  push(@newlines,$line);
              }
              if ($field[11] eq '000060') {
                  $line =~ s/$field[12]/23/g;
                  $line =~ s/~~~10~/~10~/g;
                  push(@newlines,$line);
              }
          } # End foreach
      } # End if

      if (defined $ARGV[1]) {
          $outfile = $ARGV[1];
      }
      else {
          $outfile = "resultfile.txt";
      }

      open my $fh, '>', $outfile or die "Cannot open $outfile\n";
      foreach (@newlines) {
          print $fh "$_\n";
      }
      close $fh;

      And I have included two lines from the file this is altering:

      MEPMD01~20080519~03092015124400AM~6202820945~PPAY RES~~0012435644-01~OK~E~KWH~~000015~96~03082015121500AM~00~0.288~00~0.2892~00~0.2778~00~0.2484~00~0.4356~00~0.3882~00~0.0702~00~0.2988~~~~~~~~~10~0.279~10~0.2796~10~0.2964~10~0.2718~10~0.2082~10~0.1242~10~0.2568~10~0.258~10~0.2958~10~0.2772~10~0.276~10~0.2928~10~0.2904~10~0.2814~10~0.2556~10~0.276~10~0.0924~10~0.4878~10~0.3366~10~0.2814~10~0.273~10~0.2778~10~0.2862~10~0.2766~10~0.288~10~0.0402~10~0.4326~10~0.2976~10~0.0366~10~0.0318~10~0.0486~10~0.3084~10~0.048~10~0.03~10~0.0246~10~0.0402~10~0.0204~10~0.0204~10~0.051~10~0.0246~10~0.0306~10~0.0342~10~0.0306~10~0.0258~10~0.0402~10~0.0384~10~0.0342~10~0.021~10~0.021~10~0.0366~10~0.0204~10~0.0312~10~0.0372~10~0.0204~10~0.0282~10~0.0432~10~0.021~10~0.0216~10~0.048~10~0.0234~10~0.0192~10~0.0372~10~0.0192~10~0.018~10~0.0474~10~0.0414~10~0.0768~10~0.0588~10~0.0558~10~0.0594~10~0.0702~10~0.1446~10~0.1068~10~0.1548~10~0.1704~10~0.0684~10~0.0936~10~0.0834~10~0.1134~10~0.0666~10~0.039~10~0.048~10~0.0414~10~0.015~

      MEPMD01~20080519~03092015034100AM~2191900749~~~0012435649-01~OK~E~KWH~~000015~96~03082015121500AM~00~0.1134~00~0.9756~00~0.7554~00~0.1578~00~0.123~00~0.0966~00~0.129~00~0.141~~~~~~~~~10~0.1074~10~0.099~10~0.1338~10~0.2334~10~0.108~10~0.369~10~0.405~10~1.1484~10~0.0978~10~0.114~10~0.1182~10~0.1458~10~0.1116~10~0.1272~10~0.0996~10~0.1272~10~0.138~10~0.1146~10~0.1002~10~0.6792~10~0.2934~10~0.957~10~0.0924~10~0.0906~10~0.2772~10~0.2322~10~0.195~10~0.2172~10~0.2004~10~0.3876~10~0.5994~10~0.564~10~0.6258~10~1.3188~10~0.8532~10~1.7514~10~1.41~10~1.8036~10~1.3014~10~1.1052~10~0.6618~10~0.6342~10~0.6264~10~0.7986~10~0.648~10~0.657~10~0.684~10~1.1544~10~0.7764~10~1.8684~10~1.8576~10~1.1862~10~1.2612~10~1.4802~10~1.767~10~2.5086~10~2.2452~10~1.3728~10~0.6984~10~0.7044~10~1.956~10~1.9356~10~1.9284~10~1.7316~10~2.1882~10~1.6014~10~1.5906~10~1.8288~10~1.9512~10~1.0224~10~2.8374~10~2.6724~10~0.6828~10~0.9708~10~0.7284~10~0.8832~10~0.483~10~0.5736~10~1.191~10~0.294~10~0.3~10~0.1716~10~0.1836~10~0.1692~

      My split creates an array for each string, using the tilde as the delimiter. You can see that in both lines there is a string of tildes '~~~~~~~~~', but it starts at a different character offset in each line. Another problem is that, in rare cases, this string of tildes might appear again later in the line, and that later occurrence must not be removed.

      One direction I considered was to put each line's array in a while loop inside that foreach, step through each element of the array, and move it to the leftmost empty element, but the nested loop creates a lot of overhead for each line of text.

      I hope this clears up some of the details. Again, any help would be greatly appreciated.

      Pham
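The two worries above (the run of tildes starts at a different character offset in each line, and a similar run may appear later) both go away if the cut is made by field index rather than by pattern. The sketch below uses a hypothetical helper, cut_empty_fields, and made-up short records; it also refuses to cut if any of the targeted fields holds data, which is an extra safety check assumed here rather than something the thread asks for.

```perl
use strict;
use warnings;

# Hypothetical helper: remove $count fields starting at index $from,
# but only if every one of them is empty. Otherwise return the line unchanged.
sub cut_empty_fields {
    my ($line, $from, $count) = @_;
    my @field = split /~/, $line, -1;    # -1 keeps trailing empty fields
    return $line if $from + $count > @field;
    my @target = @field[ $from .. $from + $count - 1 ];
    return $line if grep { length $_ } @target;
    splice @field, $from, $count;
    return join '~', @field;
}

# Two made-up records whose leading fields have different widths, so the
# first empty run starts at a different character offset in each. Both
# also contain a second empty run later on that must survive.
my $short = join '~', ('a')      x 3, ('') x 2, 10, 0.1, '', '', 10, 0.2;
my $long  = join '~', ('abcdef') x 3, ('') x 2, 10, 0.1, '', '', 10, 0.2;

print cut_empty_fields($_, 3, 2), "\n" for $short, $long;
```

Each line is split once and spliced once, so there is no nested loop over the elements.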

        If I have read correctly you are scanning an input file and outputting changed lines only where column 12 is 000015, 000030 or 000060. Column 13 is changed to 92, 46 or 23 and a number of blank columns (8, 4 or 2) starting at 30 are removed.

        Since there looks to be a pattern to your changes I have coded them into a hash to avoid the multiple ifs.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $infile  = $ARGV[0];
        my $outfile = $ARGV[1] || 'resultfile.txt';
        unless (@ARGV) {
            die "Usage: ReplaceNewLine.pl <inputfile> [<outputfile>] \n";
        }

        open IN,  '<', $infile  or die "$infile not found in current location";
        open OUT, '>', $outfile or die "Cannot open $outfile";

        # Place header from source file into $header variable
        my $header = <IN>;
        print OUT $header;

        # changes
        # column 12 value => [new column 13 value, start index, number of columns to cut]
        my %change = (
            '000015' => [92,30,8],
            '000030' => [46,30,4],
            '000060' => [23,30,2],
        );

        # enhanced or standard files
        my @field = split "~",<IN>;
        my $filetype = $field[14] ? 'Enhanced' : 'Standard';

        if ($filetype eq 'Enhanced') {
            # process lines
            my $count=0;
            foreach my $line (<IN>) {
                ++$count;
                chomp($line);
                my @field = split "~", $line;
                if ( exists $change{$field[11]} ) {
                    $field[12] = $change{$field[11]}[0];
                    my $posn = $change{$field[11]}[1];
                    my $cut  = $change{$field[11]}[2];
                    my $discard = join '',splice(@field,$posn,$cut);
                    warn "Warn : $discard discarded at line $count" if ($discard);
                    print OUT join '~',@field;
                    print OUT "\n";
                }
            }
            print "$count lines processed from $infile\n";
        }
        close IN;
        close OUT;

        Note the first 2 lines of input are not processed

        poj
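One detail worth checking against real output: split without a limit drops trailing empty fields, so a line that ends in a tilde (as both sample lines do) would be written back without it. The sketch below wraps the same table-driven change in a hypothetical helper, rewrite_line, and passes -1 as the split limit to keep the trailing field. The helper name and the test record are assumptions for illustration.

```perl
use strict;
use warnings;

# Same table as in the script above:
# column 12 value => [new column 13 value, start index, number of columns to cut]
my %change = (
    '000015' => [ 92, 30, 8 ],
    '000030' => [ 46, 30, 4 ],
    '000060' => [ 23, 30, 2 ],
);

# Hypothetical helper: returns the rewritten line, or undef (in scalar
# context) when column 12 has no entry in %change.
sub rewrite_line {
    my ($line) = @_;
    my @field = split /~/, $line, -1;    # -1 keeps trailing empty fields
    return unless defined $field[11];
    my $rule = $change{ $field[11] } or return;
    my ($value, $posn, $cut) = @$rule;
    $field[12] = $value;
    splice @field, $posn, $cut;
    return join '~', @field;
}

# Made-up record: 30 leading fields, 4 empty ones, data, and a trailing tilde.
my $in  = join '~', ('x') x 11, '000030', 96, ('y') x 17, ('') x 4, 10, 0.5, '';
my $out = rewrite_line($in);
print "$out\n";
```

A helper like this can also be called on the second input line after it has been used to detect the file type, so that line is not skipped.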
