PerlMonks  

RegExp help

by karmas (Sexton)
on Nov 01, 2001 at 20:32 UTC ( [id://122582]=perlquestion )

karmas has asked for the wisdom of the Perl Monks concerning the following question:

I'm parsing a text file in the following format:
some text ()\xA1 file.txt
and generating HTML. $1 (below) sometimes gets trailing spaces, so I thought I'd strip 'em off:
(1) if ($line =~ /(.+)\s*\(\)\xA1\s*(.+\.txt)/) {
(2)     $file = $1;
(3)     $file =~ s/\s*$//;
(4)     print OUT "<a href=\"$2\">$file</a><br>\n"
    }
But I'm getting an error: "Use of uninitialized value in concatenation (.) or string at F:\downloads\lib.pl line 23, <IN> line 4." If I comment out (3), everything works fine (except for the trailing spaces, of course). It looks like $2 is somehow invalidated in line (3). TIA

Replies are listed 'Best First'.
Re: RegExp help
by Fastolfe (Vicar) on Nov 01, 2001 at 20:39 UTC
    You're right; $2 is getting clobbered when your second regex succeeds. You need to save it away in another variable someplace before you do your next regex. A simple test:
    my $string = "one two three four five";
    my $tmp = "this doesn't matter";
    $string =~ /(\S+) (\S+) (\S+) (\S+) (\S+)/;   # $1 through $5 are set
    print "\$3=$3\n";
    $tmp =~ s/\S*$//;                             # random regex
    print "\$3=$3\n";
    The output (with warnings):
    $3=three
    Use of uninitialized value in concatenation (.) or string at test line 10.
    $3=
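Applied to the original question, the "save it away" advice might look like the sketch below. The sample line and the names `$file` and `$href` are assumptions for illustration, and the output goes to STDOUT rather than the poster's OUT filehandle.

```perl
use strict;
use warnings;

# Hypothetical sample line; the extra spaces before "()" are deliberate.
my $line = "some text  ()\xA1 file.txt";
my ($file, $href);

if ($line =~ /(.+)\s*\(\)\xA1\s*(.+\.txt)/) {
    # Copy both captures right away, before any other regex runs.
    ($file, $href) = ($1, $2);
    $file =~ s/\s+$//;    # safe now: $href no longer depends on $2
    print qq{<a href="$href">$file</a><br>\n};
}
```

Once the captures live in ordinary variables, later matches and substitutions can no longer affect them.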
Re: RegExp help
by japhy (Canon) on Nov 01, 2001 at 20:48 UTC
    The reason is that $file =~ s/\s*$// is clearing the $DIGIT variables and setting its own (which happen to be undef, since they're not defined by the regex). Perhaps you want to take this approach:
    if (my ($f, $l) = $line =~ /(.+)\s*\(\)\xA1\s*(.+\.txt)/) {
        $f =~ s/\s+$//;   # notice \s+ and not \s*, too
        print OUT qq{<a href="$l">$f</a><br>\n};
    }
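The \s+ versus \s* remark matters because only a successful match resets the capture variables; a failed one leaves them alone. A small sketch of that difference, using made-up strings and hypothetical variable names:

```perl
use strict;
use warnings;

my $text = "no-trailing-space";    # hypothetical value with nothing to strip

"alpha beta" =~ /(\w+) (\w+)/;     # sets $1 = "alpha", $2 = "beta"

(my $copy1 = $text) =~ s/\s+$//;   # \s+ finds no whitespace, so the match fails
my $after_failed = $2;             # $2 still holds the earlier capture

(my $copy2 = $text) =~ s/\s*$//;   # \s* matches the empty string, so it succeeds
my $after_success = $2;            # undef: this pattern has no second group

print "after failed s///:     ",
    (defined $after_failed ? $after_failed : 'undef'), "\n";
print "after successful s///: ",
    (defined $after_success ? $after_success : 'undef'), "\n";
```

So s/\s+$// would clobber $2 only on lines that really do end in whitespace, which makes the bug intermittent rather than fixing it. Copying the captures first is still the safer habit.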

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: RegExp help
by tfrayner (Curate) on Nov 01, 2001 at 20:50 UTC
    So near :-)

    The other comments above are of course correct, but the insertion of a critical '?' (to prevent greedy matching) also seems to work:

    $line="some text \(\)\xA1 file.txt";
    if ($line =~ /(.+?)\s*\(\)\xA1\s*(.+\.txt)/) {
        $file = $1;
        # $file =~ s/\s*$//;
        print ("<a href=\"$2\">$file</a><br>\n");
    }
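To see why the '?' helps, it may be useful to compare what the first group captures with and without it. This is a sketch on a made-up line with three spaces before the marker; `$greedy` and `$lazy` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical input with three spaces before the "()" marker.
my $line = "some text   ()\xA1 file.txt";

# Greedy: (.+) takes as much as it can, so it swallows the spaces
# and leaves \s* with nothing to match.
my ($greedy) = $line =~ /(.+)\s*\(\)\xA1/;

# Non-greedy: (.+?) stops as early as it can, so \s* consumes the spaces.
my ($lazy) = $line =~ /(.+?)\s*\(\)\xA1/;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
```

With the non-greedy group there is nothing left to strip, so the second regex (and the clobbering of $2) goes away entirely.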
    Hope this helps,

    Tim

(jeffa) Re: RegExp help
by jeffa (Bishop) on Nov 01, 2001 at 20:42 UTC
    If that weird thingy "()\xA1" is always going to be the same, then use split instead of a regex:
    use strict;
    # Single quotes: \xA1 here is the literal four characters \, x, A, 1.
    my $str = 'some text ()\xA1 file.txt';
    # \\xA1 in the pattern matches that literal backslash sequence.
    my ($desc,$file) = split (/\s*\(\)\\xA1\s*/,$str);
    print "<a href=\"$file\">$desc</a><br>\n";
    I changed your $file var from $1 to $2, simply because I think it makes more sense that way. Padding the delimiter with optional whitespace will take care of trailing space, but this only works if that weird thingy is always going to be the same.
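Note that the snippet above uses a single-quoted string, so its \xA1 is literal text. If the marker in the real file is the single byte 0xA1, as the pattern in the question suggests, the same split idea might be sketched as follows. The sample line is an assumption, and the limit of 2 is an added precaution in case the marker ever appears inside the file name.

```perl
use strict;
use warnings;

# Hypothetical input; double quotes turn \xA1 into the single byte 0xA1.
my $line = "some text   ()\xA1 file.txt";

# Split on the marker plus any surrounding whitespace, into at most two fields.
my ($desc, $file) = split /\s*\(\)\xA1\s*/, $line, 2;

print qq{<a href="$file">$desc</a><br>\n};
```

Because the whitespace is part of the delimiter, neither field needs trimming afterwards and no capture variables are involved.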

    jeffa

Re: RegExp help
by karmas (Sexton) on Nov 01, 2001 at 21:02 UTC
    Thank you for the help! There are many things left to learn :)
