http://qs321.pair.com?node_id=1074768

GertMT has asked for the wisdom of the Perl Monks concerning the following question:

hi,
The file I'm working on is just one big line, and the pattern I'm looking for occurs in it many times. The regex I currently have prints only the last occurrence, 80 times over. I'm not seeing what I should change.

The pattern is a few letters [a-zA-Z] after "photo/", followed by a dot and three letters [a-z].

#!/usr/bin/perl
use strict;
use warnings;

my $count = 1;
open( FILE, $ARGV[0] );
while ( my $w = <FILE> ) {
    foreach ( $w =~ m/photo\/([a-zA-Z]+\.[a-zA-Z]{3})/g ) {
        print "Photo: $count $1\n";
        $count++;
    }
}
$count = $count - 1;
print "\nI counted $count image-files\n";

Re: Regex shows only last match multiple times?
by zeltus (Beadle) on Feb 13, 2014 at 10:17 UTC

    You can make your code rather simpler and hence easier to read:

    Initialise $count to zero

    my $count = 0;

    ...or, even more simply

    my $count;

    Then you don't need that perverse

    $count = $count - 1;

    statement. And even if you did want it, what's wrong with

    $count--;

    (always bearing in mind the old TIMTOWTDI acronym of course!)

    The next simplification, if the file really does consist of just one line, is to slurp it in, in one hit

    @array = <$fh>;         # reads all lines into an array
    $singleLine = <$fh>;    # reads one line only
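To make the difference between those two reads concrete, here is a minimal sketch. It assumes an in-memory filehandle on a made-up string instead of a real file, and the sub names are hypothetical. The third sub uses `local $/` to slurp a multi-line file into one scalar; that is an addition to what the post describes, not something the original code needs for a one-line file.

```perl
use strict;
use warnings;

# Scalar context: the readline operator returns a single line.
sub first_line {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open string: $!";
    my $single_line = <$fh>;
    return $single_line;
}

# List context: the readline operator returns all remaining lines.
sub all_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open string: $!";
    my @lines = <$fh>;
    return @lines;
}

# With the input record separator undefined, one read returns everything.
sub whole_file {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open string: $!";
    local $/;
    my $content = <$fh>;
    return $content;
}

# Hypothetical sample data standing in for the contents of a file.
my $data = "first\nsecond\nthird\n";
print "One line: ", first_line($data);
print "Line count: ", scalar( all_lines($data) ), "\n";
print "Slurped length: ", length( whole_file($data) ), "\n";
```

For a file that really is one big line, the plain scalar read already returns the entire content.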

    Finally, and this is where I think your question might get answered, you can capture regex hits to an array and then simply count the array...

    my @count = $w =~ /$x/g;
    my $total = @count;

    With no loops making it look more complicated than it is, you can then start to pin down what is going on here. Perhaps use some test data/test regex to give a simpler starting block. I would also be suspicious of perl being greedy in that there regex. But first, create some test data and make sure you can "prove" your code before hitting the real data.
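A small test case along those lines might look like the sketch below. The sample line and the sub name `photo_names` are made up for illustration; the regex is the one from the original question.

```perl
use strict;
use warnings;

# Return every "name.ext" that follows "photo/" in the given string.
sub photo_names {
    my ($text) = @_;
    # In list context, m//g returns all captured substrings at once.
    my @names = $text =~ m{photo/([a-zA-Z]+\.[a-zA-Z]{3})}g;
    return @names;
}

# Hypothetical test data: one long line with several matches.
my $line  = 'a photo/beach.jpg b photo/Cat.png c photo/tree.gif d';
my @names = photo_names($line);
my $total = @names;    # an array in scalar context yields its length

print "Photo: $_\n" for @names;
print "\nI counted $total image-files\n";
```

Because the captures end up in an ordinary array, there is no counter to initialise or correct afterwards.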

      thanks,
      changed a few more things indeed and it now works as expected
      How do I exit the loop after three complete rounds? That is, if you enter wrong input three times, the program should exit. Here is my code; please modify it:
      #!/user/bin/perl
      print"enter the rang of array:\n";
      chomp($c=<STDIN>);
      while($i=1){
          print"while loop round:$i\n";
          if($i>3){
              exit;
          }
          if($c =~ /^\[|[a-z,A-Z]+/){
              print "it is invalid please enter digits\n";
              #print"enter the rang of array:\n";
              #chomp($c=<STDIN>);
          }
          elsif($c =~ /^[1-9]+/){
              print "you are entered correct input\n";
              last;
          }
          $i++;
      }
      @a=();
      $b;
      $1;
      print"Enter the array:\n";
      for($i=0;$i<$c;$i++){
          chomp($b=<STDIN>);
          $a[$i]=$b;
      }
      #for($i=0;$i<@a;$i++){
      #print"the array $a[$i]\n";
      #}
      #print"the PSI ID's:";
      foreach $val(@a){
          #print"####$val###\n";
          if($val=~ /^([a-z]+)\s?PSI\-ID\-?(\d+)/){
              print"value=$2\n";
              push(@b,$1);
              push(@c,$2);
          }
      }
      print"the pax id's with PSI:\n";
      for($i=0;$i<=@b;$i++){
          for($j=0;$j<=@c;$j++){
              if($i==$j){
                  print"$b[$i] $c[$j]\n";
              }
          }
      }

        If you add use warnings; to the head of your script, Perl will tell you that the opening of the while loop:

        while ($i = 1) {

        contains a logic error: it re-initialises the variable to 1 on each loop iteration. In fact, this loop would benefit from a complete re-write:

        use strict;
        use warnings;

        my $c;

        for (my ($valid, $try) = (0, 1); !$valid; ++$try) {
            print "Enter the range of the array (try $try):\n";
            chomp($c = <STDIN>);

            if ($c =~ /^\d+$/) {
                print "You have entered a valid range: $c\n";
                $valid = 1;
            }
            elsif ($try < 3) {
                print "The input is invalid, please enter digits only\n";
            }
            else {
                print "No valid array range entered, exiting\n";
                exit;
            }
        }

        print "Continuing...\n";

        This is one of the unusual cases in which a C-style for loop is useful in Perl. Please note:

        • I have also added use strict; and declared all variables as lexicals with my. The amount of time this will save you down the track far outweighs the (very small) extra effort required.
        • In future, please put <code> ... </code> tags around your code, to make it readable.
        • When asking a question (as opposed to commenting on someone else’s answer), it’s usually better to open a new thread.
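The same "give up after three tries" logic can also be tried out without typing at a prompt. The following is only a sketch: it assumes the candidate inputs arrive as a list rather than from STDIN, and the sub name `first_valid_range` and its parameters are hypothetical.

```perl
use strict;
use warnings;

# Return the first all-digit entry among the first $max_tries inputs,
# or undef if none of them qualifies.
sub first_valid_range {
    my ($max_tries, @inputs) = @_;
    for my $try (1 .. $max_tries) {
        my $input = shift @inputs;
        last unless defined $input;            # ran out of input early
        return $input if $input =~ /^\d+$/;    # digits only: accept
        print "Try $try: '$input' is invalid, please enter digits only\n";
    }
    return;                                    # no valid entry within the limit
}

# Made-up inputs: two bad entries, then a good one on the third try.
my $range = first_valid_range(3, 'abc', '[5', '12');
print defined $range
    ? "You have entered a valid range: $range\n"
    : "No valid array range entered, exiting\n";
```

Here the loop bound itself caps the number of attempts, so no assignment-versus-comparison slip like `while ($i = 1)` can keep it running forever.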

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Regex shows only last match multiple times?
by Eily (Monsignor) on Feb 13, 2014 at 10:43 UTC

    A data sample would have been helpful. Not your whole data set, but just enough for us to try your code ourselves.

    The problem with your code is that foreach ( $w =~ /(capture)/g ) evaluates the match in list context: the regex runs through the whole string first, setting $1 on each match, and only then does the loop start over the resulting list. The elements are aliased to $_, not $1, which by then holds the last capture and won't change. Write print "Photo: $count $_\n" instead, and all shall be well.
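The difference can be reproduced on a tiny input. In this sketch the sample line and both sub names are invented for the demonstration; the regex is the one from the question.

```perl
use strict;
use warnings;

# Buggy variant: the list of captures is built before the loop body runs,
# so $1 already holds the capture from the last successful match.
sub via_dollar_one {
    my ($text) = @_;
    my @seen;
    foreach ( $text =~ m{photo/([a-zA-Z]+\.[a-zA-Z]{3})}g ) {
        push @seen, $1;
    }
    return @seen;
}

# Fixed variant: each list element is aliased to $_ in turn.
sub via_topic {
    my ($text) = @_;
    my @seen;
    foreach ( $text =~ m{photo/([a-zA-Z]+\.[a-zA-Z]{3})}g ) {
        push @seen, $_;
    }
    return @seen;
}

# Hypothetical sample line with three matches.
my $line = 'photo/one.jpg photo/two.png photo/six.gif';
print "With \$1: @{[ via_dollar_one($line) ]}\n";
print "With \$_: @{[ via_topic($line) ]}\n";
```

The first sub repeats the last file name once per match, which is the symptom described in the question; the second returns each name once.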

    Instead of opening $ARGV[0] yourself, you could try using the diamond operator's magic.

    my $count = 0;
    while (<>) {
        $count += () = m< photo/([a-z]+\.[a-z]{3}) >gxi;
    }

    This will loop through all the files given as parameters (which allows you to supply several filenames instead of one). The name of the file currently being processed is in $ARGV. If @ARGV is empty, your script will read from STDIN instead.
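The `() = ...` part of that snippet is the less obvious piece: assigning the match to an empty list forces list context, and the list assignment itself, evaluated in scalar context, yields the number of elements on its right-hand side. A sketch of just that counting step, using a plain string instead of `<>` so it runs on its own (the sub name `count_photos` and the sample text are assumptions):

```perl
use strict;
use warnings;

# Count the photo links in a string without keeping the captures.
sub count_photos {
    my ($text) = @_;
    # /x lets the pattern contain spaces for readability, /i ignores case.
    my $count = () = $text =~ m< photo/([a-z]+\.[a-z]{3}) >gxi;
    return $count;
}

# Made-up sample lines, summed the way the loop over <> would sum them.
my $total = 0;
$total += count_photos($_) for 'photo/a.jpg and PHOTO/b.PNG', 'photo/c.gif';
print "I counted $total image-files\n";
```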

Re: Regex shows only last match multiple times?
by GertMT (Hermit) on Feb 13, 2014 at 10:26 UTC
    aha,
    Answering my own question..
    Obviously I should use while as opposed to foreach as it otherwise keeps interpolating the variable $1
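A while loop works with $1 because its condition evaluates the match in scalar context: each evaluation finds only the next match, resuming where the previous one stopped, so $1 is fresh on every pass. A minimal sketch, with a made-up sample line and a hypothetical sub name:

```perl
use strict;
use warnings;

# Collect each capture as it is matched, one match per loop iteration.
sub via_while {
    my ($text) = @_;
    my @seen;
    # Scalar-context m//g remembers its position in $text between calls.
    while ( $text =~ m{photo/([a-zA-Z]+\.[a-zA-Z]{3})}g ) {
        push @seen, $1;
    }
    return @seen;
}

# Hypothetical sample line with three matches.
my $count = 0;
for my $name ( via_while('photo/one.jpg photo/two.png photo/six.gif') ) {
    $count++;
    print "Photo: $count $name\n";
}
print "\nI counted $count image-files\n";
```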