PerlMonks  

anonymous subroutines assigning regexes to hash keys

by donkeykong (Novice)
on Jul 30, 2009 at 05:03 UTC ( [id://784481] : perlquestion )

donkeykong has asked for the wisdom of the Perl Monks concerning the following question:

Sorry, the title is complex; I didn't know how to put it in simpler terms, but it should look simpler in code. I'm trying to read files and, as I read them, build a hash of hashes whose inner values are extracted by regular expressions applied to $_. The code looks like this:
use strict;
my $file = shift @ARGV;
open (INPUT, "<$file");
my @ids = qw(1234 2345 3456);
my %hash;
foreach my $id (@ids) {
    while (<INPUT>) {
        $hash{$id} = {
            headline => sub { return /.*headline: (.*)/ },
            byline => sub { return /.*byline: (.*)/},
        };
    }
    print $hash{$id}->{byline}->()."\n";
}
close INPUT;
I keep getting a '1' as my output. I tried a couple different things but nothing got me any good results, outside of writing a subroutine outside of the hash and calling it as a reference in each value in place of the anonymous subroutines. Any help you guys can provide would be great. Thanks

Replies are listed 'Best First'.
Re: anonymous subroutines assigning regexes to hash keys
by ikegami (Patriarch) on Jul 30, 2009 at 06:13 UTC

    Others have pointed out other problems with your code, but your actual question has gone unanswered. (Well, I answered it when you asked in the CB earlier, but I guess you missed it.)

    You're calling the anonymous sub in scalar context, which results in the match being evaluated in scalar context, and m// in scalar context returns whether a match occurred or not.

    You need to call m// in list context or use $1 to access what the match captured. Here are two ways to fix your anon subs to return the desired value:

    sub { ( /.*headline: (.*)/ )[0] },
    sub { /.*headline: (.*)/ && $1 },
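To see the context effect in isolation, here is a minimal sketch. The sample line and the variable names are made up for illustration; the three subs are the one from the question and the two fixes above. Concatenating with a string puts each call in scalar context, just as the `."\n"` in the original print does.

```perl
use strict;
use warnings;

# Hypothetical sample line; any "headline: ..." text would do.
$_ = "headline: this is the headline";

my $original = sub { return /.*headline: (.*)/ };   # as posted in the question
my $slice    = sub { ( /.*headline: (.*)/ )[0] };   # fix 1: slice the capture list
my $and_one  = sub { /.*headline: (.*)/ && $1 };    # fix 2: test the match, then use $1

# String concatenation calls each sub in scalar context.
my $scalar_original = "" . $original->();
my $scalar_slice    = "" . $slice->();
my $scalar_and_one  = "" . $and_one->();

# A list assignment calls the sub in list context, so even the
# original sub hands back the captured text here.
my ($list_original) = $original->();

print "original, scalar context: $scalar_original\n";
print "slice fix:                $scalar_slice\n";
print "&& \$1 fix:                $scalar_and_one\n";
print "original, list context:   $list_original\n";
```

The first line printed is the '1' from the question; the other three print the captured headline text.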
      Thanks so much for responding to the actual question. I just wrote an abstract piece of code, not the actual code that I was using. I tried both of the solutions though, and neither brought back a result. Any ideas? I appreciate it.

        It's impossible for an expression to return nothing in scalar context.

        If they return undef, then the pattern didn't match. Make sure $_ contains what you think it should.

        If they return a zero-length string, it's because that's what they successfully matched.
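One way to check which case applies is to print what each sub returns for a few inputs. This is only a diagnostic sketch: the input lines and the `describe` helper are hypothetical. One detail worth knowing: on a failed match the `&& $1` form hands back the match's own false value, which is a defined empty string, so only the slice form yields undef when nothing matched.

```perl
use strict;
use warnings;

my $slice   = sub { ( /.*headline: (.*)/ )[0] };
my $and_one = sub { /.*headline: (.*)/ && $1 };

# Hypothetical helper: render a value so undef and '' can be told apart.
sub describe {
    my ($value) = @_;
    return 'undef'        if !defined $value;
    return 'empty string' if $value eq '';
    return "'$value'";
}

# Hypothetical inputs: no match, a match with an empty capture, a normal match.
my %report;
for ( 'byline: someone', 'headline: ', 'headline: news' ) {
    my $from_slice = $slice->();     # called in scalar context
    my $from_and   = $and_one->();   # called in scalar context
    $report{$_} = [ describe($from_slice), describe($from_and) ];
    printf "%-18s slice=%-14s and=%s\n", "[$_]", @{ $report{$_} };
}
```

If every line reports undef or an empty string, the likely culprit is the content of $_ at the time of the call rather than the subs themselves.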

Re: anonymous subroutines assigning regexes to hash keys
by GrandFather (Saint) on Jul 30, 2009 at 05:41 UTC

    I really can't figure out what you think that code should do. However, if you add some data to the FILE block in the following code, then tell us what you expect to see as a result of running it, we may be able to get a handle on what you want to do.

    use strict;
    use warnings;

    my $file = <<FILE;
    FILE
    open my $inFile, '<', \$file;
    my @ids = qw(1234 2345 3456);
    my %hash;
    foreach my $id (@ids) {
        while (<$inFile>) {
            $hash{$id} = {
                headline => sub { return /.*headline: (.*)/ },
                byline => sub { return /.*byline: (.*)/},
            };
        }
        print $hash{$id}->{byline}->()."\n";
    }
    close $inFile;

    A few of the problems I see are:

    1/ You open a file once outside a for loop, read to the end of the file on the first iteration of the loop, then want to read more stuff on subsequent iterations of the loop.

    2/ Your anonymous subs use $_ for matching against, but $_ doesn't have a sensible value in the print where the sub might be called.

    3/ Although you use strict, execution fails with 'Can't use string ("") as a subroutine ref while "strict refs" in use at ...'. Why didn't you mention this problem?

    4/ Without strictures execution fails with 'Undefined subroutine &main:: called at ...'. Why didn't you tell us about this problem?

    5/ You should always use the three parameter version of open and check the result.
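Points 2 and 5 can be sketched together as follows. This is an illustration under assumptions, not the original poster's real code: the in-memory data, the `%extract` table and the `%record` hash are all made up. The subs take the line as an argument instead of depending on whatever $_ holds when they are called, and the open is the three-argument form with its result checked.

```perl
use strict;
use warnings;

# Hypothetical in-memory data standing in for a real file.
my $data = "headline: this is the headline\nbyline: this is the byline\n";

# Three-argument open with the result checked. A reference to a
# scalar opens an in-memory file handle.
open my $in, '<', \$data or die "Can't open in-memory file: $!";

# Each extractor receives the line explicitly rather than reading $_.
my %extract = (
    headline => sub { ( $_[0] =~ /headline: (.*)/ )[0] },
    byline   => sub { ( $_[0] =~ /byline: (.*)/ )[0] },
);

my %record;
while ( my $line = <$in> ) {
    chomp $line;
    for my $field ( keys %extract ) {
        # Call the extractor while the line is still available.
        my $value = $extract{$field}->($line);
        $record{$field} = $value if defined $value;
    }
}
close $in;

print "$record{byline}\n";
```

Calling the extractors inside the read loop means the captured text is stored in the hash, so nothing depends on $_ after the loop has finished.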


    True laziness is hard work
Re: anonymous subroutines assigning regexes to hash keys
by pubnoop (Acolyte) on Jul 30, 2009 at 05:50 UTC

    I don't understand what you're trying to do (it might help if you explain what you intend to do with the hash of hashes once you've got it), but part of your problem is that your loops are in a bit of a mess. You're opening the file, and then for each of the 3 id values you're working through each line of the file (which won't work because the while (<INPUT>) eats the whole file on the first go when $id is 1234, leaving no more input for the other 2 id values). If you want to work through the file 3 times then you have to either close and open it again each time or reset the position with seek().
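A minimal sketch of the seek() option, assuming a two-line in-memory file (the data and the `%lines_seen` counter are hypothetical): rewinding at the top of each pass lets every id see the whole file.

```perl
use strict;
use warnings;

# Hypothetical file contents, held in memory for the example.
my $data = "headline: first\nbyline: second\n";
open my $in, '<', \$data or die "Can't open in-memory file: $!";

my @ids = qw(1234 2345 3456);
my %lines_seen;

for my $id (@ids) {
    # Rewind to the start of the file. Without this, only the first
    # id would read any lines.
    seek( $in, 0, 0 ) or die "Can't seek: $!";
    while ( my $line = <$in> ) {
        $lines_seen{$id}++;
    }
}
close $in;

print "$_: $lines_seen{$_} lines\n" for @ids;
```

Removing the seek() call leaves the second and third ids with no lines at all, which is the behaviour described above.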

    Each time around the while (<INPUT>) loop, you assign your hash of subs to the same thing, $hash{$id}. With a 100 line file, you'll assign to $hash{$id} 100 times for id 1234, and all but the last one will be lost.

      I was just making a very abstract version of what I intend on doing. There are going to be several files, and those files come with ID numbers, and the array is going to get automatically populated with the id numbers. So when I go through the file per ID number, it's because it's supposed to go through that entire file per ID number. I can see how this would look wrong based on the abstract version I put up. I was just trying to show what it would do in terms of the actual anonymous subs in the hash of hashes. This is what the inside of a file will look like:
      headline: this is the headline
      byline: this is the byline
      (and so on)
      I want to grab all the stuff after the (text):. So the inner hash has the key, which is supposed to have the value of the text after the colon, and was thinking of using an anonymous subroutine to perform a regex and capture just what I wanted. Does this make sense?
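For what it's worth, if the goal is only to store the text after each label, one possible approach skips the anonymous subs and fills the inner hash while reading. This is a sketch under assumptions: the `%content_for` data stands in for the per-id files, and the pattern assumes each line has the shape "label: text".

```perl
use strict;
use warnings;

# Hypothetical per-id contents; in practice each id would map to a file.
my %content_for = (
    1234 => "headline: this is the headline\nbyline: this is the byline\n",
    2345 => "headline: another headline\nbyline: another byline\n",
);

my %hash;
for my $id ( sort keys %content_for ) {
    open my $in, '<', \$content_for{$id} or die "Can't open data for $id: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        # Capture the label before the first colon and the text after it.
        # The list assignment is false when the line does not match.
        if ( my ( $key, $value ) = $line =~ /^(\w+):\s*(.*)$/ ) {
            $hash{$id}{$key} = $value;
        }
    }
    close $in;
}

print "$hash{1234}{byline}\n";
```

Each inner hash ends up holding plain strings, so `$hash{$id}{byline}` can be printed directly without calling anything.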