Re: Regex question once-only use of chars in a charset

in reply to Regex question once-only use of chars in a charset

I think it can be as simple as this:

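use Modern::Perl;

open my $WORDLIST, '<', './wordlist.txt' or die $!;

my $available = 'AABCDEF';

# Sort the letters and make each one optional: 'A?A?B?C?D?E?F?'
$available = join '?', sort split '', $available;
$available .= '?';

while (<$WORDLIST>) {
    chomp;
    # Sort the word's letters too, so they line up with the pattern
    my $sorted = join '', sort split '';
    # /i ignores case; /o compiles the interpolated pattern only once
    say if $sorted =~ /^$available$/io;
}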

Running this script with your 'AABCDEF' gives me the following results:

abe
ace
aced
baa
bad
bade
be
bead
bed
cab
cad
cade
cafe
dab
dace
deaf
deb
decaf
fab
facade
face
faced
fad
fade
fed

I used a wordlist of 58,000 entries, and the script needed no more than a few seconds to generate this result.
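For anyone who wants to try the idea without a wordlist file, here is a minimal sketch of the same technique run against a few in-memory words. The helper names `build_pattern` and `can_spell` are hypothetical (they are not in the script above), and the sketch makes three small changes that are my own assumptions rather than part of the original: it lowercases both sides instead of using `/i`, it uses `qr//` instead of `/o`, and it rejects the empty string, which the bare pattern would otherwise accept.

```perl
use strict;
use warnings;
use feature 'say';

# Build the "each letter at most once" pattern from the available letters.
# build_pattern is a hypothetical helper name, not part of the original post.
sub build_pattern {
    my ($letters) = @_;
    # Lowercase, sort, and make every letter optional:
    # 'AABCDEF' becomes a?a?b?c?d?e?f?
    my $body = join '', map { quotemeta($_) . '?' } sort split //, lc $letters;
    return qr/\A$body\z/;
}

# True if $word can be spelled using each available letter at most once.
sub can_spell {
    my ($pattern, $word) = @_;
    return 0 unless length $word;    # the pattern alone would accept ''
    my $sorted = join '', sort split //, lc $word;
    return $sorted =~ $pattern ? 1 : 0;
}

my $pattern = build_pattern('AABCDEF');
for my $word (qw(facade decaf cabbage feed)) {
    say "$word: ", can_spell($pattern, $word) ? 'ok' : 'no';
}
```

With 'AABCDEF' this should accept "facade" and "decaf" and reject "cabbage" (two b's needed) and "feed" (two e's needed).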

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James


Replies are listed 'Best First'.
Re^2: Regex question once-only use of chars in a charset by John M. Dlugosz (Monsignor) on May 15, 2011 at 13:41 UTC
Cool! Sorting the word as well is a game changer! Then you just check off the letters in order.
Re^3: Regex question once-only use of chars in a charset by CountZero (Bishop) on May 15, 2011 at 18:51 UTC
It is an old trick. Transform both sides of the comparison to a canonical form and the whole problem becomes much easier to solve.
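To make the "check off the letters in order" reading concrete, here is a rough sketch that does the same comparison without a regex, by walking the two sorted letter lists in step. This is a swapped-in illustration, not the method from the post above; `fits_available` is a hypothetical name, and lowercasing both sides is my own assumption.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: walk two sorted letter lists in step and
# "check off" each letter of the word against the available letters.
sub fits_available {
    my ($available, $word) = @_;
    my @have = sort split //, lc $available;
    my @need = sort split //, lc $word;
    my $i = 0;    # current position in @have
    for my $letter (@need) {
        # Skip available letters that sort before the one we need.
        $i++ while $i < @have && $have[$i] lt $letter;
        # Fail if we ran out of letters or the next one is not a match.
        return 0 if $i >= @have || $have[$i] ne $letter;
        $i++;     # this available letter is now used up
    }
    return 1;
}

for my $word (qw(facade feed)) {
    my $verdict = fits_available('AABCDEF', $word) ? 'fits' : 'does not fit';
    say "$word $verdict";
}
```

Because both lists are in the same canonical (sorted) order, a single forward pass is enough; neither list ever has to be rescanned. Note that this sketch treats an empty word as fitting.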
Re^2: Regex question once-only use of chars in a charset by elef (Friar) on May 16, 2011 at 10:52 UTC
What's the o in `say if $sorted =~ /^$available$/io;`? I know i is for case-insensitive matching, but I can't find an o modifier in the documentation.
Re^3: Regex question once-only use of chars in a charset by AnomalousMonk (Archbishop) on May 16, 2011 at 13:05 UTC
The `/o` modifier means "compile once" for a regex it modifies. Consider these examples:

>perl -wMstrict -le
"my $s = '1a2b3c';
 ;;
 print qq{no /o};
 for my $i (qw(3 2 1)) {
   print qq{matched '$1'} if $s =~ m{ ($i.) }xms;
 }
 ;;
 print qq{with /o};
 for my $i (qw(3 2 1)) {
   print qq{matched '$1'} if $s =~ m{ ($i.) }xmso;
 }
"
no /o
matched '3c'
matched '2b'
matched '1a'
with /o
matched '3c'
matched '3c'
matched '3c'

The function of the `/o` modifier has been generally replaced by the `qr//` regex object builder (see perlop). I was a bit surprised not to see anything about `/o` in perlre, but it is (briefly and obliquely) discussed in qr/STRING/msixpodual (5.14), and the following remains in perlretut (at least through 5.12):

Optimizing pattern evaluation

We pointed out earlier that variables in regexps are substituted before the regexp is evaluated:

$pattern = 'Seuss';
while (<>) {
    print if /$pattern/;
}

This will print any lines containing the word "Seuss". It is not as efficient as it could be, however, because Perl has to re-evaluate (or compile) $pattern each time through the loop. If $pattern won't be changing over the lifetime of the script, we can add the "//o" modifier, which directs Perl to only perform variable substitutions once:

#!/usr/bin/perl
# Improved simple_grep
$regexp = shift;
while (<>) {
    print if /$regexp/o;  # a good deal faster
}
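As a rough sketch of the `qr//` alternative mentioned above: a `qr//` object is compiled where the `qr` expression is evaluated, so building it inside a loop picks up the current value of the variable, while building it once outside the loop gives the compile-once behaviour without the `/o` surprise. The variable names here are illustrative only.

```perl
use strict;
use warnings;
use feature 'say';

my $s = '1a2b3c';

# Built inside the loop: each qr// reflects the current value of $i,
# unlike the m{...}o version, which stays stuck on the first value.
my @seen;
for my $i (qw(3 2 1)) {
    my $re = qr{ ($i.) }xms;
    push @seen, $1 if $s =~ $re;
}
say "matched: @seen";

# Built once outside the loop: the pattern is fixed, so it is compiled
# a single time and reused for every candidate string.
my $fixed = qr/\d[ab]/;
my @hits = grep { $_ =~ $fixed } qw(1a 2b 3c);
say "fixed pattern hits: @hits";
```

The first loop should report '3c', '2b' and '1a' in turn, matching the "no /o" output shown above.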
