
Re: Problem with regex wildcard operator (.)...?

by Marshall (Canon)
on Sep 06, 2021 at 07:25 UTC ( [id://11136497] )


in reply to Problem with regex wildcard operator (.)...?

I don't really know the requirements for your application,
so I have made an imperfect attempt at it below.

I did come close to your output for "s." but not exactly.
My output for "a." is wildly off from what you show.

I downloaded your DB file to a local file to make my testing faster.
I don't think that makes any difference.

I would like a fuller description of what you want to happen.

#!/usr/bin/perl
use strict;
use warnings;

my @words;
open my $fh, '<', 'EuroScrabbleWordList.txt' or die "can't open word list $!";
foreach (<$fh>)
{
    tr/\r\n//d;
    tr/A-Z/a-z/;
    next if /\s/;    # dirty way of removing comments
    push @words, $_;
}
close $fh;

############################################

my $tiles = 'a.';    # print all words with 2 letters that contain "a"
print "\nMATCHES FOR: $tiles\n";
my @m = find_matches($tiles);
print "\n@m\n";

$tiles = 's.';       # print all words with 2 letters that contain "s"
print "\nMATCHES FOR: $tiles\n";
@m = find_matches($tiles);
print "\n@m\n";

$tiles = 'a.a';      # print all words with letters that contain 2 a's
print "\nMATCHES FOR: $tiles\n";
@m = find_matches($tiles);
print "\n@m\n";

############################################

sub find_matches
{
    my $pattern = shift;
    my $max_chars = length $pattern;
    my @matches;
    $pattern =~ s/\W+//g;    # delete the dots
    my $raw_letters = $pattern;
    my $regex = "";
    $regex .= "[$raw_letters].?" for (1..length $raw_letters);
    print "$regex\n";
    foreach my $word (@words)
    {
        next if (length ($word) > $max_chars);
        push (@matches, $word) if $word =~ /$regex/;
    }
    return @matches;
}
__END__

MATCHES FOR: a.
[a].?

aa ab ad ae ag ah ai al am an ar as at aw ax ay ba da ea fa ha ja ka la ma na pa ta ya za

MATCHES FOR: s.
[s].?

as es is os sh si so st us

MATCHES FOR: a.a
[aa].?[aa].?

aa aah aal aas aba aga aha aia aka ala ama ana aua ava awa baa caa faa maa

UPDATE:
I don't claim that my code is a general solution to your problem.
In fact, I think that many enhancements are necessary! This was just minimal code to answer some simple questions.

In terms of algorithms, let's start with something simple:
for "s.":

You say: as es is os sh si so
I say:   as es is os sh si so st us
I am completely unable to understand why "st" and "us" should be missing from the output. Please explain.
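
To make the "s." case concrete, here is a minimal sketch of how the generated pattern behaves on a handful of words. The names build_regex and match_offset and the tiny @sample list are hypothetical, introduced only for illustration; the real run reads the full word list from the file. The pattern is built the same way as in find_matches, and because it is not anchored, it matches any word that contains an "s" anywhere, which is why "st" and "us" show up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the pattern the same way find_matches() does:
# strip the dots, then emit one "[letters].?" group per remaining letter.
sub build_regex {
    my $tiles = shift;
    (my $letters = $tiles) =~ s/\W+//g;
    return "[$letters].?" x length($letters);
}

# Return the offset where the pattern first matches, or undef if it does not.
sub match_offset {
    my ($word, $regex) = @_;
    return $word =~ /$regex/ ? $-[0] : undef;
}

my $regex  = build_regex('s.');          # the string "[s].?"
my @sample = qw(as st us so at no);      # hypothetical mini word list

for my $word (@sample) {
    my $offset = match_offset($word, $regex);
    printf "%-2s %s\n", $word,
        defined $offset ? "matches at offset $offset" : "no match";
}
```

If "st" and "us" really should be excluded, the rule that excludes them has to be something other than "two letters, one of which is s", and that rule is what I am asking about.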
