Re: Regular Expression: search two times for the same number of any signs
by haukex (Archbishop) on Nov 29, 2016 at 10:45 UTC
|
Hi,
What's unclear is whether this string is allowed to be embedded inside a longer string, although your example regexes seem to suggest that it's ok. Second, it would be good to know if multiple of these x.x.x sequences are allowed to be present in the source string? Should "ax1x2xbx34x56xc" return two strings, "x1x2x" and "x34x56x", or the single string "x1x2xbx34x56x"?
Here's my TIMTOWTDI solution:
print "$_ => ", extract($_)//'invalid', "\n"
for qw/ xxx x.x.x x12x..x x123x...x x1x2x...x
x123x.x.x x12x1x ax1x2xbx34x56xc /;
sub extract {
my ($x) = shift=~/(x.*x.*x)/;
return unless length($x)%2
&& substr($x,(length($x)-1)/2,1) eq 'x';
return $x;
}
__END__
xxx => xxx
x.x.x => x.x.x
x12x..x => x12x..x
x123x...x => x123x...x
x1x2x...x => x1x2x...x
x123x.x.x => x123x.x.x
x12x1x => invalid
ax1x2xbx34x56xc => x1x2xbx34x56x
Hope this helps, -- Hauke D
|
Yes, the pattern can be embedded in a larger string.
ax.x.x # is valid
ax.x.xaa # is valid
Yes, the pattern x.x.x may occur multiple times in the string, but it is enough to find it once.
|
Hi Hauke,
It took me a while to understand your code!
I can learn a lot from it. Thanks!
But one question:
I have never seen // before. What is //?
You use "extract($_)//'invalid'" to print 'invalid' if the sub returns nothing.
Can I do more with //?
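For readers with the same question: // is Perl's defined-or operator (Perl 5.10 and later). It returns its left operand unless that operand is undef, in which case it returns the right one. The sketch below is only an illustration; the hash %opt and all variable names are made up for the example and do not come from the thread.

```perl
use strict;
use warnings;

# Made-up sample data for the illustration.
my %opt = (verbose => 0, name => undef);

# || falls back on any false value (0, '', undef) ...
my $v_or      = $opt{verbose} || 'default';    # 'default', because 0 is false
# ... while // falls back only on undef.
my $v_defined = $opt{verbose} // 'default';    # 0, because 0 is defined
my $name      = $opt{name}    // 'anonymous';  # 'anonymous'

# //= assigns only if the variable is currently undef.
my $count;
$count //= 10;   # $count is now 10
$count //= 20;   # unchanged, still 10

print "$v_or $v_defined $name $count\n";   # prints "default 0 anonymous 10"
```

That is why extract($_)//'invalid' works even for a sub that could legitimately return an empty string or 0: only an undefined return value triggers the fallback.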
Re: Regular Expression: search two times for the same number of any signs
by Eily (Monsignor) on Nov 29, 2016 at 11:03 UTC
|
use v5.20;
while (<DATA>)
{
chomp;
say "$_ => '$1' x '$2'" if /x(.*)x((??{ ".{".length($1)."}" }))x/;
}
__DATA__
xaxxax
xxx
x.x.x
x12x..x
x123x...x
x123x.x.x
Output:
xxx => '' x ''
x.x.x => '.' x '.'
x12x..x => '12' x '..'
x123x...x => '123' x '...'
x123x.x.x => '123' x '.x.'
Re: Regular Expression: search two times for the same number of any signs (updated)
by haukex (Archbishop) on Nov 29, 2016 at 11:02 UTC
|
Hi,
Disclaimer: I am not a regex wizard, so I'm not sure whether the following has any pitfalls, but it does appear to be possible with a single regex:
print $_, /(x(.*)x(??{ '.{'.length($2).'}' })x)/
? " matches, \$1 = $1\n" : " doesn't match\n"
for qw/ xxx x.x.x x12x..x x123x...x x1x2x...x
x123x.x.x x12x1x ax1x2xbx34x56xc /;
__END__
xxx matches, $1 = xxx
x.x.x matches, $1 = x.x.x
x12x..x matches, $1 = x12x..x
x123x...x matches, $1 = x123x...x
x1x2x...x matches, $1 = x1x2x...x
x123x.x.x matches, $1 = x123x.x.x
x12x1x doesn't match
ax1x2xbx34x56xc matches, $1 = x1x2xbx34x56x
Update: Changing the first part of the regex to x(.*?)x (non-greedy) will allow you to match all the substrings in that last example above (and the rest of the examples above will continue to work the same):
my $re = qr/(x(.*?)x(??{ '.{'.length($2).'}' })x)/;
my $str = "ax1x2xbx34x56xc";
while ($str=~/$re/g) {
print "found \"$1\"\n";
}
__END__
found "x1x2x"
found "x34x56x"
Hope this helps, -- Hauke D
|
print "$_ =>\n"
for qw/ 1 12 123/;
works fine. I like this style. But I cannot combine it with an if or with multiple lines.
{print "$_ =>\n" if $_=/1/ }
for qw/ 1 12 123/;
gives me the following error message:
"Missing $ on loop variable at ./test3.pl line 2."
and I cannot understand what is meant by this error message.
|
for my $n (qw/1 12 123 234/) {
print "$n =>\n" if $n=~/1/;
}
(Ok, there is a way to do what you want, but legibility begins to suffer if it gets longer: /1/ and print "$_ =>\n" for qw/1 12 123 234/;)
Update: I should add that I was golfing a little bit in my example code, and that compressed style is not necessarily something one should strive to use in production code ;-)
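As far as I can tell, the "Missing $ on loop variable" message arises because a bare { ... } block is already a complete statement, so the following "for qw/.../" is parsed as the start of a new for loop, where Perl expects either a loop variable or a parenthesised list. A small sketch of three forms that do work; the array @out and the labels are only there for the illustration:

```perl
use strict;
use warnings;

my @out;

# 1. Statement modifier: one single statement, then "for LIST".
/1/ and push @out, "and:$_" for qw/1 12 123 234/;

# 2. do BLOCK is an expression, so several statements can share one modifier.
do {
    my $len = length;                      # length of $_
    push @out, "do:$_($len)" if /1/;
} for qw/1 12 123 234/;

# 3. The ordinary loop, usually the most readable choice.
for my $n (qw/1 12 123 234/) {
    push @out, "loop:$n" if $n =~ /1/;
}

print join(' ', @out), "\n";
```

Each form visits the same four items and keeps only those containing a 1; which one to use is mostly a question of readability.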
Regarding the other question about (??{ }), that's documented along with (?{ }) in perlre. The oversimplified explanation is that the code inside (??{...}) is evaluated and its return value embedded as part of the regular expression (but make sure to read the docs). So in my regex, the code '.{'.length($2).'}' takes the length of the string matched in between the first set of x's (x(.*)x), and then generates an expression like .{N} (where N is the length), so if the input were x12345x67890x, the regular expression it is matched against is x.*x.{5}x.
Hope this helps, -- Hauke D
Update: Reworded a little.
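To make the (??{ }) explanation concrete, here is a minimal sketch that reports which fragment the code block generates for the x12345x67890x example. The helper name generated_fragment is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fragment that the (??{ }) block ends up
# contributing for a successful match, or undef if the string does not match.
sub generated_fragment {
    my ($str) = @_;
    return unless $str =~ /x(.*)x(??{ '.{' . length($1) . '}' })x/;
    return '.{' . length($1) . '}';
}

my $str  = 'x12345x67890x';
my $frag = generated_fragment($str);
print "fragment for '$str' is $frag\n";    # fragment is .{5}

# The equivalent hand-written pattern for this one input:
print "static pattern matches too\n" if $str =~ /x.*x.{5}x/;
```

Note that the greedy .* first tries longer captures and backtracks; the code block is re-run for each attempt, so the fragment shown is the one from the attempt that finally succeeds.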
|
|
Hi Hauke,
Perfect solution. Exactly what I wanted.
But I do not understand the (??{ }) part.
Can you explain it a little or give me a link where I can read more?
Many thanks!
Re: Regular Expression: search two times for the same number of any signs
by Ratazong (Monsignor) on Nov 29, 2016 at 10:17 UTC
|
Hi
I would recommend creating the regular expression dynamically, based on the length of the input string (or on the position of the last "x"). It could look something like this:
my $s = "x12345x23x56x";
my $len = (length($s)-3)/2;
my $re = "x" . "."x$len . "x" . "."x$len . "x"; # this is the RegEx you want
if ($s =~ /$re/) { print "ok\n"; } else {print "nok\n";}
HTH, Rata
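A variation on the length-based idea, as a sketch only: it assumes the whole string (not a substring) must have the form x, N characters, x, N characters, x, so it anchors the pattern and rejects even lengths up front. The helper name whole_string_ok is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the WHOLE string has the form x<N>x<N>x.
sub whole_string_ok {
    my ($s) = @_;
    my $len = length $s;
    return 0 if $len < 3 or $len % 2 == 0;   # 3 + 2*N is always odd
    my $n  = ($len - 3) / 2;                 # N, the length of each run
    my $re = qr/\Ax.{$n}x.{$n}x\z/;          # anchored at both ends
    return $s =~ $re ? 1 : 0;
}

print "$_ => ", (whole_string_ok($_) ? "ok" : "nok"), "\n"
    for qw/ x12345x23x56x x1x2x...x x12x1x /;
```

Because the pattern is anchored, this version does not find the sequence inside a longer string such as ax.x.xaa; for that case the (??{ }) solutions in this thread are a better fit.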
Re: Regular Expression: search two times for the same number of any signs
by Discipulus (Canon) on Nov 29, 2016 at 10:28 UTC
|
Hello,
you can play with length and non-greedy matching, as in the following example (no doubt you will have received better answers while I was writing this):
use strict;
use warnings;
while (<DATA>){
chomp;
$_=~/x(.*?)x(.*)x$/;
if (length $1 == length $2){
print "OK $_\t [$1]",length $1," [$2]",length $2,"\n";
}
else{print "$_ NOT OK\t[$1]",length $1," [$2]",length $2,"\n";}
}
__DATA__
xxx
x.x.x
x12x..x
x123x...x
x123x.x.x
x12x1x
# out
OK xxx []0 []0
OK x.x.x [.]1 [.]1
OK x12x..x [12]2 [..]2
OK x123x...x [123]3 [...]3
OK x123x.x.x [123]3 [.x.]3
x12x1x NOT OK [12]2 [1]1
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
|
Very nice approach!
But it does not work when the first run of arbitrary characters contains an "x":
"x1x2x...x" # should be valid, since it is x.{3}x.{3}x, but is reported as NOT OK
|
use strict;
use warnings;
while (<DATA>){
chomp;
# note the string IS always odd
my $inter = int ((length $_) / 2)-1;
my @char = $_=~/./g;
if (scalar @char % 2 < 1){
print "Not OK $_ (unbalanced)\n";
next;
}
if (
$char[0] eq $char[$inter+1] and
$char[0] eq $char[-1] and
$char[0] eq 'x'
){
print "$_\t\tOK\n";
}
else {
print "NOT OK $_\t[$char[0] $char[$inter+1] $char[-1]]\n";
}
}
__DATA__
xxxxx
x1x2x...x
xxx
x.x.x
x12x..x
x123x...x
x123x.x.x
x12x1x
# out
xxxxx OK
x1x2x...x OK
xxx OK
x.x.x OK
x12x..x OK
x123x...x OK
x123x.x.x OK
Not OK x12x1x (unbalanced)
L*
UPDATE: it can be simplified, or golfed, a lot using 5.010
use strict;
use warnings;
use 5.010;
while (<DATA>){
chomp;
# note the string IS always odd
if ((length $_) % 2 < 1){
print "Not OK $_ (unbalanced)\n";
next;
}
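# the middle index is int(length/2); the slice is wrapped in [] so that ~~
# compares two arrays (note: smartmatch is experimental since Perl 5.18)
if ([($_=~/./g)[0,int((length $_)/2),-1]]~~[qw(x x x)]){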
print "$_\t\tOK\n";
}
else {
print "NOT OK $_\n";
}
}
Re: Regular Expression: search two times for the same number of any signs
by hippo (Bishop) on Nov 29, 2016 at 10:06 UTC
|
If I understand you correctly (and maybe not, see How to ask better questions using Test::More and sample data) you want this:
- Find the position of the first 'x'.
- Find the position of the second 'x'.
- Take the difference in these positions, add it to the second position and look here for the third 'x'.
- If all that succeeds you have a match.
In which case, just code this up with a loop and judicious use of index and substr.
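The steps above could be sketched roughly as follows. This is one possible reading, not hippo's code: it assumes that any later 'x' may serve as the second one (so that inputs like x1x2x...x are handled), and the sub name find_xnxnx is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first substring of the form
# x<N chars>x<N chars>x, or undef if there is none.
sub find_xnxnx {
    my ($s) = @_;
    my $first = -1;
    while (($first = index($s, 'x', $first + 1)) >= 0) {
        my $second = $first;
        while (($second = index($s, 'x', $second + 1)) >= 0) {
            # third 'x' must be as far from the second as the second is from the first
            my $third = $second + ($second - $first);
            last if $third >= length $s;   # later candidates only move further right
            return substr($s, $first, $third - $first + 1)
                if substr($s, $third, 1) eq 'x';
        }
    }
    return;
}

for my $s (qw/ xxx x.x.x x12x1x ax1x2xbx34x56xc /) {
    my $hit = find_xnxnx($s);
    print "$s => ", $hit // 'invalid', "\n";
}
```

Unlike the greedy regex solutions, this returns the leftmost, shortest match, so for ax1x2xbx34x56xc it reports x1x2x rather than the long x1x2xbx34x56x.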
However, this does sound rather like an XY Problem. Perhaps if you explained why you want to do this in the first place a much better solution might become apparent.
|
If that's really the case (and you are probably right) then maybe change the order: find the first and last occurrence of 'x', calculate where the middle one should be and look there?
However, the spec is a little woolly and the whole thing is still shouting "XY!" at me.
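A minimal sketch of that reversed order, using index and rindex. It assumes the outermost pair of x's is the one that matters, which is only one way to read the spec; the sub name middle_x is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the first and last 'x', checks that an 'x'
# sits exactly halfway between them, and returns the enclosed substring.
sub middle_x {
    my ($s) = @_;
    my $first = index($s, 'x');
    my $last  = rindex($s, 'x');
    return unless $first >= 0 && $last > $first;
    my $span = $last - $first;
    return if $span % 2;                    # halves cannot be equal in length
    my $mid = $first + $span / 2;
    return unless substr($s, $mid, 1) eq 'x';
    return substr($s, $first, $span + 1);
}

print "$_ => ", middle_x($_) // 'invalid', "\n"
    for qw/ xxx x12x1x ax1x2xbx34x56xc /;
```

This does no searching at all beyond two index calls, but it will miss a valid sequence if an unrelated x appears before or after it in the surrounding text.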
|
Yes. You understand my problem perfectly.
That is also an option: searching for the pattern with a small program (multiple searches). My question was whether I can also do it with a single regular expression, using predefined variables like $1.
Re: Regular Expression: search two times for the same number of any signs
by tybalt89 (Monsignor) on Nov 29, 2016 at 15:31 UTC
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1176775
use strict;
use warnings;
print /x(.*)x(??{$1 =~ tr##.#cr})x/ ? 'pass' : 'fail', ' ', $_ while <DATA>;
__DATA__
xxxxx
x1x2x...x
xxx
x.x.x
x12x..x
x123x...x
x123x.x.x
x12x1x
outputs:
pass xxxxx
pass x1x2x...x
pass xxx
pass x.x.x
pass x12x..x
pass x123x...x
pass x123x.x.x
fail x12x1x
Re: Regular Expression: search two times for the same number of any signs
by AnomalousMonk (Archbishop) on Nov 29, 2016 at 17:18 UTC
|
Here's another single-regex approach, although as others have said, I don't necessarily think such an approach is best. (Requires Perl version 5.10+.)
Code:
Output:
Give a man a fish: <%-{-{-{-<