http://qs321.pair.com?node_id=389762

bronto has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. This should be a simple problem, but it's 6PM here, and I think that my brain just turned itself off. But I'd like to go home with the problem solved.

I'd like to have a regex to match against a UN*X aliases file; the regex should match if a line begins with root: but the string that follows it is different from, say, myself@domain.of-my.own.

In short:

I know it is simple, damn; I surely already wrote something like that... but that regex doesn't want to leave my brain for my keyboard...

Any help?

Thanks a lot in advance

Update: I have to feed the regexp into an editfiles section of cfengine; I need to do the job with a single, standard regex

Update: Unfortunately cfengine has no support for PCREs, so if the problem has no solution with plain-old regexes (does it?), I'll have to change the way I work with the problem (by, for example, spawning an external perl -i.bak one-liner). Thanks to every one of you who took the time to suggest a solution!

Update: Actually, the cfengine people have already considered linking against libpcre instead of libre, but nobody has come up with a working patch for both the source and the compiler's options. If any monk is able to, that would be great for both cfengine (a great improvement IMHO) and Perl (another conquest, Captain! :-)

Ciao!
--bronto


The very nature of Perl to be like natural language--inconsistent and full of dwim and special cases--makes it impossible to know it all without simply memorizing the documentation (which is not complete or totally correct anyway).
--John M. Dlugosz

Re: Regex matching a string beginning with "root: " but not containing another string
by ikegami (Patriarch) on Sep 09, 2004 at 16:22 UTC

    Use a negative lookahead assertion:

    @data = split(/\n/, <<'__EOI__');
    root: myself@domain.of-my.own
    anotherone: myself@domain.of-my.own
    root: yetanother1@domain.of-my.own
    __EOI__

    $email = 'myself@domain.of-my.own';

    foreach (@data) {
        /^root: (?!\Q$email\E)(\S+)/ && print("$1\n");
    }

    __END__
    output
    ======
    yetanother1@domain.of-my.own
•Re: Regex matching a string beginning with "root: " but not containing another string
by merlyn (Sage) on Sep 09, 2004 at 16:22 UTC
      That would not work correctly given notmyself@domain.of-my.own.doing. You need to put whatever characters (or \z) delimit the email address around the \Q \E.
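A minimal sketch of that delimiting, assuming the address always ends at whitespace or at the end of the string. The helper name `other_root_target` and the sample lines are made up for illustration:

```perl
use strict;
use warnings;

my $email = 'myself@domain.of-my.own';

# Returns the target of a "root: " line unless that target is exactly $email.
# Assumption: an address ends at whitespace or at the end of the string.
sub other_root_target {
    my ($line) = @_;
    return $line =~ /^root: (?!\Q$email\E(?:\s|\z))(\S+)/ ? $1 : undef;
}

for my $line (
    'root: myself@domain.of-my.own',
    'root: myself@domain.of-my.own.doing',
    'root: notmyself@domain.of-my.own',
) {
    my $target = other_root_target($line);
    print defined $target ? "match: $target\n" : "no match\n";
}
```

Only the first sample line is rejected. Without the `(?:\s|\z)` part, the second one would be rejected too, because it merely starts with the excluded address.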
Re: Regex matching a string beginning with "root: " but not containing another string
by Roy Johnson (Monsignor) on Sep 09, 2004 at 16:29 UTC
    You can do it with negative lookahead:
    /^root: (?!myself\@domain\.of-my\.own)/;
    You could also do it with two regexps:
    /^root: /g and !/\Gmyself\@domain\.of-my\.own/g;
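A small sketch of how the two-regexp variant behaves, assuming the address is held in a variable. The sub name `root_points_elsewhere` and the sample lines are hypothetical:

```perl
use strict;
use warnings;

my $email = 'myself@domain.of-my.own';

# True when $line starts with "root: " and the text after it
# does not start with $email.
sub root_points_elsewhere {
    my ($line) = @_;
    # A successful scalar-context /g match leaves pos($line) just past "root: ".
    return 0 unless $line =~ /^root: /g;
    # \G anchors the second match at that saved position.
    my $is_mine = $line =~ /\G\Q$email\E/g;
    return $is_mine ? 0 : 1;
}

for my $line (
    'root: someone@example.org',
    'root: myself@domain.of-my.own',
    'anotherone: myself@domain.of-my.own',
) {
    my $verdict = root_points_elsewhere($line) ? 'match' : 'no match';
    print "$line => $verdict\n";
}
```

Like the lookahead version, this only checks that the text after `root: ` does not begin with the address; it does not check where the address ends.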

    Caution: Contents may have been coded under pressure.
Re: Regex matching a string beginning with "root: " but not containing another string
by Eimi Metamorphoumai (Deacon) on Sep 09, 2004 at 18:48 UTC
    It can be done with just a standard regexp, but it's monumentally ugly. If you want to match "root:" not followed by "bar", you get
    /root:                  # need root
     ( $                    # it could end there
     | [^b]                 # if not, first letter isn't b
     | b ( $ | [^a]         # or it can be b if it's not
         | a ( $ | [^r] )   # followed by ar
         )
     )
    /x
    (Untested, I may have gotten it a little wrong.) Suffice to say, it's ugly, but it's doable. You just have to make sure that you always allow either the end of the string ($), any single character but the next, or the next character, as long as it itself isn't followed by (the rest of the string). Writing that for something long and complicated is a nightmare (unless someone has written a script to do it, which I don't know of).

      Here's a different way to do it. IMHO, the regex it produces will be much easier and make much more sense.

      use strict;

      my $email  = "myself\@domain.of-my.own";
      my $string = "root: myself\@domain.of-my.own";

      my $re = inv_str_seq($email);
      $string =~ /^root: $re/ and print 1;

      sub inv_str_seq {
          my($string) = @_;
          my @c = map { quotemeta($_) } split '', $string;
          my $re = "\n [^$c[0]] \n";
          $re .= " | " . join('',@c[0..$_-1]) . " [^$c[$_]]\n" for (1 .. $#c);
          $re = qr/$re/x;
          return qr/$re+ (?:[^$c[0]]|$)/x;
      }

      Update: Moved the quotemeta up.

      Here's a little sub to generate such a regex from the string you want not to match:
      sub not_string_regex {
          my $str = shift;
          my $reg;
          for(0..length($str)-1) {
              my $let = substr($str,$_,1);
              $reg .= '($|[^'.$let.']|'.$let;
          }
          substr($reg, -2) = '';
          $reg .= ')' x length($str);
      }
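A usage sketch for the generator, with the sub repeated so the snippet runs on its own (the explicit `return` and the sample lines are additions for illustration). Letters are not escaped, so this assumes a string of plain characters such as "bar":

```perl
use strict;
use warnings;

# Same construction as the sub above, repeated so this runs standalone.
sub not_string_regex {
    my $str = shift;
    my $reg = '';
    for my $i (0 .. length($str) - 1) {
        my $let = substr($str, $i, 1);
        $reg .= '($|[^' . $let . ']|' . $let;
    }
    substr($reg, -2) = '';          # drop the trailing "|<last letter>"
    $reg .= ')' x length($str);     # close every group that was opened
    return $reg;
}

my $not_bar = not_string_regex('bar');
print "$not_bar\n";                 # ($|[^b]|b($|[^a]|a($|[^r])))

for my $line ('root: bar', 'root: baz', 'root: b', 'root: ') {
    my $verdict = $line =~ /^root: $not_bar/ ? 'match' : 'no match';
    print "'$line' => $verdict\n";
}
```

Only `root: bar` is rejected; the other three lines match because they end early or diverge from "bar" at some letter.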

      Caution: Contents may have been coded under pressure.