Re: Puzzle Regex: Letter Frequency Arithmetic Sequence
by choroba (Cardinal) on Oct 17, 2017 at 15:31 UTC
|
Not sure about regexes, but counting frequencies means hashes.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $dict = '/var/lib/dict/words'; # YMMV
sub each_char_has_different_freq {
my ($r, $f) = @_;
keys %$r == keys %$f
}
sub frequencies_form_a_sequence {
my ($r) = @_;
keys %$r == grep $r->{$_}, 1 .. keys %$r
}
my @linenlessnesses;
open my $IN, '<', $dict or die $!;
while (<$IN>) {
chomp;
next if /\W/;
my %freq;
$freq{$_}++ for split //;
my %r = reverse %freq;
push @linenlessnesses, $_
if each_char_has_different_freq(\%r, \%freq)
&& frequencies_form_a_sequence(\%r);
}
say for sort { length $a <=> length $b } @linenlessnesses;
Interesting output (the longest words) on my machine:
deadheaded
evennesses
keennesses
peppertree
rememberer
restresses
sanenesses
sereneness
sleeveless
sussararas
susurruses
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord
}map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
| [reply] [d/l] [select] |
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1201500
use strict;
use warnings;
my $file = '/usr/share/dict/words'; # YMMV
open my $fh, '<', $file or die "$! opening $file";
while(<$fh>)
{
/^[a-z]{6,}$/i or next;
my $word = $_;
my @counts;
$counts[ s/$&//gi ] .= $& while s/.//;
grep(!$_ || /../, @counts) or print $word;
}
| [reply] [d/l] |
|
I played a lot with your solution to understand it well, so first of all - thank you, choroba.
Just my two cents: your function frequencies_form_a_sequence assumes that the frequencies start with one. The words with letter sequencies 2,3,4 etc. will not be selected then.
Should it not be in the following form (min, max and all are from List::Util and List::MoreUtils):
sub frequencies_form_a_sequence
{
my ($r) = @_;
my $min = min keys %$r;
return 0 if $min == 1; # Just for test.
my $max = max keys %$r;
return 0 if $min == $max; # Probably not a sequence.
my $bool = all { defined $r->{$_} } $min .. $max;
}
I find the following words then (cannot pretend to know what they mean though :) ):
addda
ajaja
alala
anana
arara
cocco
essee
esses
igigi
nanna
pappa
peepe
reree
susus
taata
tatta
ululu
xxiii
deedeed
sasararas
sassarara
| [reply] [d/l] [select] |
|
| [reply] [d/l] [select] |
Re: Puzzle Regex: Letter Frequency Arithmetic Sequence
by LanX (Saint) on Oct 17, 2017 at 15:42 UTC
|
The link you gave shows an example but doesn't phrase criteria, which is unfortunate for a puzzle.
Please see
How do I post a question effectively?
Supposing the frequency of letters has to be an ascending sequence. ..
I'm not aware of a possibility to sum the count of matches in pure ² m// regexes so probably it's possible in tr// but those can't backtrack.
Probably if you code 1..n matches for a fixed n into your regex and combine n look-aheads you can built an and condition. *
This wouldn't work for arbitrary big n though.
*) that is find one letter exactly once and find one letter exactly twice and so on.
already the first term is too tricky for me...
²) no Perl embedded
update
First step: What is a regex to find one letter exactly once?
update
Waiting for tybalt89 ... tick tock tick tock ... ;) | [reply] [d/l] [select] |
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1201500
use strict;
use warnings;
my @words = glob '{i,x}' x 6;
print "@words\n";
my $n = 1; # number of times a letter occurs in a word
my @oneletter = grep
/^(?|
(?=.*?(.)(?!.*\1))
(?:
(?:(?!\1).)*
\1
){$n}
(?:(?!\1).)*
|
(?=.*(.))
(?:
(?:(?!\1).)*
\1
){$n}
(?:(?!\1).)*
)$
/x, @words;
print "\n@oneletter\n\n", scalar @oneletter, "\n";
Update: fails for some test cases
| [reply] [d/l] |
|
Can you share some interesting test cases?
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] |
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1201500
use strict;
use warnings;
my $file = '/usr/share/dict/words';
open my $fh, '<', $file or die "$! opening $file";
chomp( my @words = grep /^[a-z]{5,}$/, <$fh>);
my @good = grep frequencysequence($_), @words;
print "@good\n\n" . @good . "\n";
sub frequencysequence
{
for my $n ( 1 .. ((-1 + sqrt 1 + 8 * length) / 2) =~ s/^(\d+)\..*/$1
+ + 1/er )
{
"@_\n@_" =~ /
(.).*
\n
(?: (?:(?!\1).)* \1 ){$n}
(?:(?!\1).)*
$ /x or return 0;
}
return 1;
}
On my system outputs:
acacia allele assays banana baobab bedded bonobo bowwow cocoon deadhea
+ded deemed doodad eddied eerier effete fesses heeded horror hubbub in
+ning lessee lollop mammal manana messes needed papaya peeped peeper p
+epped pepper pippin powwow reefer revere rococo salsas seeded senses
+sereneness settee sleeveless tattoo teeter teethe wedded weeded xxvii
+i xxxvii
49
| [reply] [d/l] [select] |
|
Nice solution of the "variable length lookbehind" not allowed problem!
But why \n instead of $ ? *
edit
I suppose it's necessary because the following assertion wouldn't be executed?
update
*) well, I just noticed the "@_\n@_" part where you are doubling the input, which explains the "\n" and some other confusion about pos.
I agree, "too much cheating!" ;-p
| [reply] [d/l] |
|
...sum the count of matches...
There's always the brute force method of generating every possible regex, assuming arithmetic sequence starting at 1. For a max frequency of 6, this covers 21 letter words. It makes me wonder what the breaking point is of my machine.
Note that in (3), only lengths that are triangular will do.
I'll have to find some time to chase this -- perhaps tomorrow.
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] |
|
I just remembered that Perl supports recursive regexes and relative back-references, so you might not be limited to a hard limit.
Unfortunately I don't have the time to play with it.
FWIW if using s///g was allowed you just needed a pattern which repeatedly deleted one character of each group, iff exactly one of them is unique.
Your criterion was met, if the string is empty afterwards.
| [reply] |
Re: Puzzle Regex: Letter Frequency Arithmetic Sequence
by kcott (Archbishop) on Oct 18, 2017 at 06:06 UTC
|
#!/usr/bin/env perl
use strict;
use warnings;
use constant MIN_LENGTH => 10; # 1+2+3+4
RECORD:
while (<>) {
next RECORD unless MIN_LENGTH < length;
next RECORD unless /^([a-z]+)$/;
my %char_count_of;
$char_count_of{substr $1, $_, 1}++ for 0 .. length($1) - 1;
my $check = 1;
for (sort { $a <=> $b } values %char_count_of) {
next RECORD unless $_ eq $check++;
}
print;
}
Looking for words with ten characters or more:
$ pm_1201500_dict_char_ordered_count.pl /usr/share/dict/words
beerbibber
chachalaca
isoosmosis
kotukutuku
rememberer
sereneness
sleeveless
Changing MIN_LENGTH:
use constant MIN_LENGTH => 6; # 1+2+3
I get all of those ten-character words, interspersed amongst over a hundred six-character words, starting with
$ pm_1201500_dict_char_ordered_count.pl /usr/share/dict/words | head -
+5
allele
amamau
ananas
ananda
annona
and ending with
$ pm_1201500_dict_char_ordered_count.pl /usr/share/dict/words | tail -
+5
ubussu
venene
wedded
weeded
weewow
| [reply] [d/l] [select] |
Re: Puzzle Regex: Letter Frequency Arithmetic Sequence -- oneliner
by Discipulus (Canon) on Oct 18, 2017 at 08:33 UTC
|
ThereIsMoreThanOne.. oneliner to do this!
Me too I'd use hash instead of regexes.
The problem is that you'll end with deadheaded rococo messes or with at least with an ugly fruit salad of ananas patata papaya batata acacia pepper
perl -MList::Util="all" -lne "next unless/.../;my(%r,$c);$r{$_}++for(/
+./g);print if all{++$c==$_}sort values %r" linux.words
# or minimum 10 char words only
perl -MList::Util="all" -lne "next unless/.{10,}/;my(%r,$c);$r{$_}++fo
+r(/./g);print if all{++$c==$_}sort values %r" linux.words
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
| [reply] [d/l] [select] |
Re: Puzzle Regex: Letter Frequency Arithmetic Sequence
by tybalt89 (Monsignor) on Oct 20, 2017 at 14:26 UTC
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1201500
use strict;
use warnings;
$| = 1;
my $file = '/usr/share/dict/words';
open my $fh, '<', $file or die "$! opening $file";
while(<$fh>)
{
chomp;
/[^a-z]/ and next;
"$_\n$_" =~ /^
(?= .* (.) .* \n (?: (?:(?!\1).)* \1 ){1} (?:(?!\1).)
+* $ )
(?= .* (.) .* \n (?: (?:(?!\2).)* \2 ){2} (?:(?!\2).)
+* $ )
(?= .{3,3} \n | .* (.) .* \n (?: (?:(?!\3).)* \3 ){3} (?:(?!\3).)
+* $ )
(?= .{3,6} \n | .* (.) .* \n (?: (?:(?!\4).)* \4 ){4} (?:(?!\4).)
+* $ )
(?= .{3,10} \n | .* (.) .* \n (?: (?:(?!\5).)* \5 ){5} (?:(?!\5).)
+* $ )
(?= .{3,15} \n | .* (.) .* \n (?: (?:(?!\6).)* \6 ){6} (?:(?!\6).)
+* $ )
(?= .{3,21} \n | .* (.) .* \n (?: (?:(?!\7).)* \7 ){7} (?:(?!\7).)
+* $ )
(?= .{3,28} \n | .* (.) .* \n (?: (?:(?!\8).)* \8 ){8} (?:(?!\8).)
+* $ )
(?= .{3,35} \n )
/x and print "$_ ";
}
print "\n";
| [reply] [d/l] |
|
You missed the one letter words like "a".
| [reply] |
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1201500
use strict;
use warnings;
$| = 1;
my $file = '/usr/share/dict/words';
open my $fh, '<', $file or die "$! opening $file";
while(<$fh>)
{
chomp;
/[^a-z]/ and next;
"$_\n$_" =~ /^
(?= .* (.) .* \n (?: (?:(?!\1).)* \1 ){1} (?:(?!\1).)
+* $ )
(?= .{1,1} \n | .* (.) .* \n (?: (?:(?!\2).)* \2 ){2} (?:(?!\2).)
+* $ )
(?= .{1,3} \n | .* (.) .* \n (?: (?:(?!\3).)* \3 ){3} (?:(?!\3).)
+* $ )
(?= .{1,6} \n | .* (.) .* \n (?: (?:(?!\4).)* \4 ){4} (?:(?!\4).)
+* $ )
(?= .{1,10} \n | .* (.) .* \n (?: (?:(?!\5).)* \5 ){5} (?:(?!\5).)
+* $ )
(?= .{1,15} \n | .* (.) .* \n (?: (?:(?!\6).)* \6 ){6} (?:(?!\6).)
+* $ )
(?= .{1,21} \n | .* (.) .* \n (?: (?:(?!\7).)* \7 ){7} (?:(?!\7).)
+* $ )
(?= .{1,28} \n | .* (.) .* \n (?: (?:(?!\8).)* \8 ){8} (?:(?!\8).)
+* $ )
(?= .{1,35} \n )
/x and print "$_ ";
}
print "\n";
On my machine, this outputs:
a aah acacia add aha aka all allele app arr ass assays b baa banana ba
+obab bbl bedded bee bib bob bonobo boo bowwow brr bub c cocoon coo d
+dad dds deadheaded deemed did doodad dud e ebb eddied eek eel eerier
+eff effete egg eke ell ere err eve ewe eye f fee fesses foo g gag gee
+ gig goo h heeded hmm horror hubbub huh i ill inn inning j k l lee le
+ssee lii lollop loo m mam mammal manana messes mom moo mum n nee need
+ed non nun o odd off oho ooh opp p pap papaya pee peeped peeper pep p
+epped pepper pip pippin poo pop powwow ppm ppr pup q r reefer revere
+rococo s salsas see seeded senses sereneness settee shh sis sleeveles
+s sqq ssh t tat tattoo tee teeter teethe tit too tot tut u usu v vii
+w wedded wee weeded woo wow x xii xix xxi xxv xxviii xxxvii y z zoo
| [reply] [d/l] [select] |
|
Re: Puzzle Regex: Letter Frequency Arithmetic Sequence
by LanX (Saint) on Oct 20, 2017 at 14:38 UTC
|
Newer Perls have some fancy features in the regex engine (apart from embedding Perl), so probably it could be solved.
But be warned, I wouldn't be surprised if a solution to this "puzzle" would proof the engine to be Turing complete.
| [reply] |