PerlMonks  

Re: Regex help

by BrowserUk (Patriarch)
on Jun 23, 2007 at 02:46 UTC


in reply to Regex help

Update: It does work when I get the syntax of the negative lookahead assertion correct. (?!...) not (?!=...)!
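To see why that one stray character matters, here is a minimal sketch (the sub names are made up for illustration). (?!=\1) is still a valid negative lookahead, but it looks for a literal '=' followed by \1, so it almost never rejects anything:

```perl
use strict;
use warnings;

# (?!\1)  : negative lookahead, "the next char is not what group 1 captured".
# (?!=\1) : also a negative lookahead, but for a literal '=' followed by \1,
#           so in practice it lets every ordinary word through.
# Both sub names are hypothetical, chosen for this illustration only.
sub two_distinct_right { return $_[0] =~ /^(.)(?!\1).$/  ? 1 : 0 }
sub two_distinct_wrong { return $_[0] =~ /^(.)(?!=\1).$/ ? 1 : 0 }

for my $word (qw( ab aa )) {
    printf "%s: right=%d wrong=%d\n",
        $word, two_distinct_right($word), two_distinct_wrong($word);
}
```

The mistyped version accepts 'aa', which is exactly the kind of word the assertion was supposed to exclude.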

Try this:

perl -nle"/^(.)(?!\1)(.)(?!\1|\2)(.)(?!\1|\2|\3)(.)(?!\1|\2|\3|\4)(.)\3(?!\1|\2|\3|\4|\5)(.)(?!\1|\2|\3|\4|\5|\6).$/ and print" <words

Expanded out, that regex is:

/^                            ## With the $, exactly 8 chars only
    (.)                       ## Any char as \1
    (?!\1)(.)                 ## Any char as \2 except \1
    (?!\1|\2)(.)              ## Any char as \3 except \1 or \2
    (?!\1|\2|\3)(.)           ## Any char as \4 except ...
    (?!\1|\2|\3|\4)(.)        ## Any char as \5 except ...
    \3                        ## Only whatever is in \3
    (?!\1|\2|\3|\4|\5)(.)     ## Any char as \6 except ...
    (?!\1|\2|\3|\4|\5|\6).    ## Any char except any char we've already seen
$/x

That begs to be generated from some kind of shorthand spec, and actually, you used such a spec in your question.

'ABCDECFG' is a perfect spec once you think of those letters as placeholders rather than literal characters.
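To make the placeholder idea concrete, here is a small sketch that goes the other way and reduces a word to its letter pattern. Note that this is a different technique from the regex (a plain hash lookup, no lookaheads), and letter_pattern is a made-up name. It assumes no more than 26 distinct characters per word:

```perl
use strict;
use warnings;

# Map a word to its placeholder pattern: the first distinct character
# becomes 'A', the next new one 'B', and so on. A repeated character
# reuses the letter it was given earlier.
# (letter_pattern is a hypothetical helper, not from the original post.)
sub letter_pattern {
    my ($word) = @_;
    my %seen;
    my $next    = 'A';
    my $pattern = '';
    for my $c ( split //, $word ) {
        # Perl's string increment turns 'A' into 'B', 'B' into 'C', ...
        $seen{$c} = $next++ unless exists $seen{$c};
        $pattern .= $seen{$c};
    }
    return $pattern;
}

print letter_pattern($_), " $_\n" for qw( abednego writhing );
```

Both words reduce to ABCDECFG, which is why the spec describes them: distinct letters everywhere except that the 6th repeats the 3rd.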

Update: And here is a generator:

#! perl -slw
use strict;

my $spec = shift or die 'No spec supplied!';

my $re = '';
my %tally;
my $i = 1;
for my $c ( split '', $spec ) {
    if( exists $tally{ $c } ) {
        $re .= '\\' . $tally{ $c };
    }
    else {
        if( $i == 1 ) {
            $re .= '(.)';
        }
        else {
            $re .= '(?!' . join( '|', map{ '\\' . $_ } values %tally ) . ')(.)';
        }
        $tally{ $c } = $i++;
    }
}

$re = qr[^$re$];
print $re;

m[$re] and printf $_ while <>;

__END__
C:\test>wordSolver ABCDECFG words
(?-xism:^(.)(?!\1)(.)(?!\1|\2)(.)(?!\1|\3|\2)(.)(?!\1|\4|\3|\2)(.)\3(?!\1|\4|\3|\5|\2)(.)(?!\6|\1|\4|\3|\5|\2)(.)$)
abednego
abscised
airborne
...
whisking
worker's
writhing
ziegfeld
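For anyone who wants to call this from other code, here is a sketch of the same idea wrapped in a sub. The name spec_to_regex is my own invention, and it differs from the script above in one respect: the backreferences inside each lookahead are emitted in numeric order (1 .. $n) rather than hash order, so the generated regex reads the same on every run:

```perl
use strict;
use warnings;

# Build an anchored regex from a placeholder spec such as 'ABCDECFG'.
# (spec_to_regex is a hypothetical name, not part of the script above.)
sub spec_to_regex {
    my ($spec) = @_;
    my %group_of;          # placeholder letter => capture group number
    my $n  = 0;            # number of capture groups emitted so far
    my $re = '';
    for my $c ( split //, $spec ) {
        if ( exists $group_of{$c} ) {
            # Repeated placeholder: must equal the earlier capture.
            $re .= '\\' . $group_of{$c};
        }
        else {
            # New placeholder: must differ from every capture so far.
            $re .= '(?!' . join( '|', map { "\\$_" } 1 .. $n ) . ')' if $n;
            $re .= '(.)';
            $group_of{$c} = ++$n;
        }
    }
    return qr/^$re$/;
}

my $re = spec_to_regex('ABCDECFG');
print "$_\n" for grep { $_ =~ $re } qw( abednego aardvark writhing sleeping );
```

Of the four sample words, only abednego and writhing fit the spec, so only those are printed.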

I wonder if the golfers could reduce that to a one-liner?


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Regex help
by Anonymous Monk on Jun 24, 2007 at 03:01 UTC
    Tried the very first one. Worked beautifully!

    Thank you :)

Re^2: Regex help
by graff (Chancellor) on Jun 29, 2007 at 03:53 UTC
    I don't know about golfing it down, but in terms of presenting a human-readable regex, it would somehow seem more intelligible to me if the whitespace were arranged differently:
    /^
     (.)(?!\1)                   # any character not followed by itself
     (.)(?!\1|\2)                # any character not followed by itself or previous
     (.)(?!\1|\2|\3)             # likewise unique...
     (.)(?!\1|\2|\3|\4)
     (.) \3 (?!\1|\2|\3|\4|\5)   # 6th character must be same as 3rd
     (.)(?!\1|\2|\3|\4|\5|\6)    # 7th and 8th must be unique
     .
     $/x
    (Admittedly, it's still a bit of a mind-bender.)

    Of course, having that handy regex generator makes the regex formatting and commentary a moot point -- so much nicer to allow people to use the simple "rhyme-scheme" alphabetics for specifying the target pattern, and keep the actual regex syntax a purely "script-internal" detail, hidden from human eyes.

      ... , but in terms of presenting a human-readable regex, ...

      I had 3 attempts at documenting it. The one I presented was the least bad to me, as the programmer that constructed it.

      Once you see the pattern, it's pretty clear how it works and how to extend it to other situations. What I attempted to do was make the pattern clear.

      But that relates back to a long-held belief that the guy who writes the code is the worst possible person to document it. I was fortunate enough to work with a s***-hot technical author (actually three over the years, one guy and two women) for a long period. His skill was as much in being able to 'forget' his (self-described, limited) programming skills, and so ask questions from the perspective of someone with no knowledge, as it was in his writing. His very capable writing skills, and ability to phrase things clearly and concisely, were just the icing on the cake. His innate ability to tease out the detail that mattered and ignore what I (as programmer, designer or architect) thought was important (today, this minute, because I just solved the problem) was far more valuable.

      I highly recommend adding a competent technical author to any team of more than 5 programmers, if you want your documentation to be produced on time, on budget and in a usable and useful manner. The salary cost of an English Lit. major with a CS minor and a couple of years of exposure to development environments and technical documentation is approximately the same as that of a CS grad with one year of post-graduation experience--but the time they will save you, and the quality they will add to your development processes, are worth several times that.

      Pick the right person, with the right mix of 'people skills' and (metaphoric) balls to not take s*** from developers and managers who think that their part of the process is more important than the TA's, and endow them with sufficient authority from the get-go to allow them to 'pull rank' on deadlines when the nicely-nicely reminders approach fails--and they will be a valuable asset far exceeding their cost.

      In a small team that might struggle to find budget for a dedicated TA, you can often find one who will also use their usually strong organisational and documentary skills to organise and perform a lot of the day-to-day housekeeping chores--scheduling, timesheet keeping, minute taking, meeting organisation, checkpoint noting, chasing and documenting; even cardboard programmer(ing?:) when the need arises. In that way, they can allow developers to spend more of their time developing, and less time doing non-programming chores they hate doing and so usually put off until absolutely forced to do them--and then do them badly. Overall, they can be a huge time and money saver. As always with personnel issues, getting the right man or woman for the job is essential.


