Re^2: dumb regex question

in reply to Re: dumb regex question
in thread dumb regex question

I just noticed that this regex fails for the following input:

/gnomes more data here

I expect it to match only `/gnomes`, but it matches everything up to the end of the line. Any idea how to fix this?

thanks


Re^3: dumb regex question by ikegami (Patriarch) on Apr 07, 2009 at 01:15 UTC
`if (m{"(/[^"]+)"|(/\S+)}) { my $match = defined $1 ? $1 : $2; ... }`
Or use whatever's appropriate instead of `\S`.
Update: Fixed slashes.
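As a quick check of the alternation idea above, here is a minimal sketch that runs it against a few of the sample lines from this thread. The helper name `extract_path` is hypothetical (it is not from the original post), and the sketch assumes the two alternatives are meant to be separated by a plain `|`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original reply: returns the quoted
# path if the match starts at a double quote, otherwise the first run of
# non-space characters that begins with a slash.
sub extract_path {
    my ($line) = @_;
    if ($line =~ m{"(/[^"]+)"|(/\S+)}) {
        # Exactly one of the two capture groups is defined after a match.
        return defined $1 ? $1 : $2;
    }
    return;    # no slash-led token found
}

for my $line (
    '"/bootMe any text here"',
    '/gnomes more data here',
    '/fewIter',
    'monkeys',
) {
    my $path = extract_path($line);
    print defined $path
        ? "Captured ($path) from $line\n"
        : "No match in $line\n";
}
```

Note that the pattern is unanchored, so it would also pick up a slash in the middle of a token such as `foo/bar`. Anchoring with `^` or a leading `(?<!\S)` may be appropriate, depending on the input.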
Re^4: dumb regex question by Nkuvu (Priest) on Apr 07, 2009 at 01:25 UTC
...yeah. Or that. Although the regex as given needs a tweak, with embedded slashes in there. If it wasn't late in the day on a Monday, I might have come up with a regex that would work. Maybe. But at least the Text::CSV_XS solution is not totally wrong.
Re^3: dumb regex question by Nkuvu (Priest) on Apr 07, 2009 at 01:01 UTC
With that additional qualification, it will get a bit more tricky. My first thought was to add a space to the character class: `m,"?(/[^" ]*)"?,` But that doesn't work, because the regex doesn't care whether the space it finds is inside or outside a quoted string; it stops at the first space either way. That means it would capture just "/bootMe" from the line "/bootMe any text here".

I'd suggest looking into a module like Text::xSV or Text::CSV_XS and setting the delimiter to a space. Then reject any entry that doesn't have a leading slash. This means dropping the regex entirely. Something like:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new ({sep_char => ' '});

    while (my $line = <DATA>) {
        chomp $line;
        # See perldoc Text::CSV_XS for warnings
        # about this approach with possible embedded
        # newlines:
        my $status = $csv->parse($line);
        my @fields;
        if ($status) {
            @fields = $csv->fields();
        }
        else {
            warn "Problem parsing $line\n";
        }
        for my $field (@fields) {
            print "Captured ($field) from $line\n" if $field =~ m!^/!;
        }
    }

    __DATA__
    "/moreIters 10"
    "/bootMe any text here"
    /fewIter
    /some stuff here
    "/albatross" foo bar baz
    monkeys
    leprechauns /not monkeys
    /gnomes "not leprechauns though"
    /gnomes more data here

Which gives the output:

    Captured (/moreIters 10) from "/moreIters 10"
    Captured (/bootMe any text here) from "/bootMe any text here"
    Captured (/fewIter) from /fewIter
    Captured (/some) from /some stuff here
    Captured (/albatross) from "/albatross" foo bar baz
    Captured (/not) from leprechauns /not monkeys
    Captured (/gnomes) from /gnomes "not leprechauns though"
    Captured (/gnomes) from /gnomes more data here
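To make the limitation of the character-class attempt concrete, here is a small sketch that applies that regex to one quoted and one unquoted line. The helper name `naive_capture` is made up for illustration. The point is only that the capture stops at the first space whether or not it is inside quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper wrapping the character-class attempt: the capture
# ends at the first double quote or space, whichever comes first.
sub naive_capture {
    my ($line) = @_;
    if ($line =~ m,"?(/[^" ]*)"?,) {
        return $1;
    }
    return;    # no slash found at all
}

for my $line ('"/bootMe any text here"', '/gnomes more data here') {
    my $got = naive_capture($line);
    print "Captured ($got) from $line\n" if defined $got;
}
```

The unquoted line gives the wanted `/gnomes`, but the quoted line is cut down to `/bootMe`, which is why a quote-aware approach is needed.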
