Combinatorics in Perl

wouldbewarrior has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Combinatorics in Perl by saintmike (Vicar) on Apr 30, 2004 at 08:06 UTC
If you're willing to give up the blanks, here's a rarely known feature of Perl's globbing permutation functionality: `for(<{ch}{i,}{ts}{a}{k}{u,}{s}{i,}#>) { print "$_\n"; }` [download] It prints: `chitsakusi# chitsakus# chitsaksi# chitsaks# chtsakusi# chtsakus# chtsaksi# chtsaks#` [download]	[reply] [d/l] [select]
Re: Combinatorics in Perl by tkil (Monk) on Apr 30, 2004 at 06:19 UTC
Here's one regex (plus while loop) for it: `while ( s{ ^ (.?) (\s \S+)_DEL (.) $ } {$1$3\n$1$2$3}xmg ) {}` [download] I am trying to cook up one using (?{ ... }) constructs, but not having as much luck. Update: Oops, left off the empty loop body... Update* (at 2004-04-29 23:49 -0700): Here is a more procedural solution that uses the same basic idea as above. It might be a bit more useful, as it returns a list of variants instead of one big string. (Would be interesting to benchmark them...) `sub del_selective { my $n = shift; if ( $n =~ m/(.?)(\s\S+)_DEL(.*)/s ) { my $s1 = $1.$3; my $s2 = $1.$2.$3; return ( del_selective( $s1 ), del_selective( $s2 ) ); } else { return ( $n ); } }` [download]	[reply] [d/l] [select]
Re: Re: Combinatorics in Perl by wouldbewarrior (Acolyte) on Apr 30, 2004 at 06:40 UTC
Wow, I like it! That was very well done. I have this great big Japanese transcription database that some other engineers and I have been trying to use to cook up a speech recognizer for the language, but the transcription database has a bunch of crap in it. One way to fix this, for the recognizer is just to give it all available options, wherever there is ambiguous 'crap'. We were having a lot of trouble with this silly problem today though. So thanks a lot!	[reply]
Re: Re: Combinatorics in Perl by pizza_milkshake (Monk) on Apr 30, 2004 at 07:10 UTC
argh, i worked for 2 hours and came up with nothing particularly useful. very nice job. perl -e'$_="nwdd\x7F^n\x7Flm{{llql0}qs\x14";s/./chr(ord$&^30)/ge;print'	[reply]
Re: Combinatorics in Perl by TimToady (Parson) on Apr 30, 2004 at 22:20 UTC
Well, you can get all fancy with recursive solutions, but if I were your teacher, I'd give the most credit to a solution that merely counts from 0 to 7 and uses the bit pattern to decide which things to include and which things to leave out. That's an algorithm you could probably get someone else to understand and maintain.	[reply]
Re^2: Combinatorics in Perl by tkil (Monk) on May 01, 2004 at 09:02 UTC
I find your lack of faith (in recursion) ...disturbing. With a bit more documentation, recursion or regular expressions are perfectly maintainable. I'll plead guilty to under-documenting my original response; one of the reasons I tucked it into a `/x` construct was to be able to add comments later. I'll note that you didn't post any code, though. I'd be curious to see your solution. Here are various of mine; which do you yourself think is most maintainable? #!/usr/bin/perl use strict; use warnings; sub int_1 ( $ ) { my $template = shift; # split template on the characters to switch in and out my @chunks = split /(\s\S+)_DEL/, $template; # make sure the odd elements are the switched chars. unshift @chunks, '' if $template =~ /^\S+_DEL\b/; # check for boundary conditions my $n_toggles = int( @chunks / 2 ); if ( !$n_toggles ) { return $template; } elsif ( $n_toggles > 10 ) { die "max of 10 toggles (found $n_toggles) in '$template'"; } my @rv; my $n_bits = ( 1 << $n_toggles ) - 1; foreach my $bits ( 0 .. $n_bits ) { my $s = ''; for ( my $i = 0; $i < @chunks; $i += 2 ) { $s .= $chunks[$i]; $s .= $chunks[$i+1] if $bits & 1; $bits >>= 1; } push @rv, $s; } return @rv; } my $MAX_BIT = 0x10000; my $ALL_BITS = $MAX_BIT - 1; sub int_2 ( $ ) { my $template = shift; # split template on the phonemes to switch in and out my @chunks = split /(\s\S+_DEL)/, $template; # mark which are toggles my @toggles; my $next_bit = 1; foreach ( @chunks ) { if ( s/_DEL$// ) { push @toggles, $next_bit; $next_bit <<= 1; die "too many toggles!" if $next_bit > $MAX_BIT; } else { push @toggles, $ALL_BITS; } } # anything to do? return $template if $next_bit == 1; my @rv; foreach my $bits ( 1 .. $next_bit ) { my $s = ''; for my $i ( 0 .. $#chunks ) { $s .= $chunks[$i] if $bits & $toggles[$i]; } push @rv, $s; } return @rv; } sub int_3 ( $ ) { my $template = shift; # split template on the phonemes to switch in and out my @chunks = split /(\s\S+_DEL)/, $template; # mark which are toggles my $next_bit = 1; foreach ( @chunks ) { if ( s/_DEL$// ) { $_ = [ $next_bit, $_ ]; $next_bit <<= 1; die "too many toggles!" if $next_bit > $MAX_BIT; } else { $_ = [ $ALL_BITS, $_ ]; } } # anything to do? return $template if $next_bit == 1; my @rv; foreach my $bits ( 1 .. $next_bit ) { push @rv, join '', map { $_->[0] & $bits ? $_->[1] : '' } @chu +nks; } return @rv; } sub re_1 ( $ ) { my @templates = @_; # yet another variant on the repeat-and-double-until-done while # loop. while (1) { my @new_templates; foreach ( @templates ) { if ( m/(.?)(\s\S)_DEL(.)/ ) { push @new_templates, $1.$2.$3, $1.$3; } } return @templates unless @new_templates; @templates = @new_templates; } } sub re_2 ( $ ) { my $template = shift; # make sure it ends with exactly one newline $template =~ s/\s$/\n/; # make all the breaks unique: { my $i = 0; $template =~ s/(_DEL\b)/$1 . ++$i/ge; } # for each break, replace template with two copies of itself, one # with the replacement, one without: while ( $template =~ /((\s\S+)_DEL(\d+)\b)/ ) { my ( $break, $rep ) = ( $1, $2 ); my ( $with, $without ) = ( $template ) x 2; $with =~ s/\Q$break/$rep/g; $without =~ s/\Q$break//g; $template = $with . $without; } # break back into individual chunks. (trailing \n should be # handled gracefully.) return split /\n/, $template; } sub re_3 ( $ ) { my $template = shift; # basic idea is to find every line with a blah_DEL in it, then # replace that line with two lines: one with "blah" inserted, one # without. Since there can be multiple _DEL directives per line, # we need to iterate until they're all gone. do {} while $template =~ s{ ^ (.?) (\s \S+)_DEL\b (.*) $ } {$1$3\n$1$2$3}xmg; # break back into individual chunks. (trailing \n should be # handled gracefully.) return split /\n/, $template; } while ( my $template = <DATA> ) { $template =~ s/\s+$//; print "=== template:\n$template\n"; print "--- int_1:\n", map "$_\n", int_1 $template; print "--- int_2:\n", map "$_\n", int_2 $template; print "--- int_3:\n", map "$_\n", int_3 $template; print "--- re_1:\n", map "$_\n", re_1 $template; print "--- re_2:\n", map "$_\n", re_2 $template; print "--- re_3:\n", map "$_\n", re_3 $template; } exit 0; __DATA__ ch i_DEL ts a k u_DEL s i_DEL # [download]	[reply] [d/l] [select]
Re: Combinatorics in Perl by wouldbewarrior (Acolyte) on Apr 30, 2004 at 08:20 UTC
wow, that too is an extremely elegant little solution. I like it. Thanks!	[reply]
Re: Combinatorics in Perl by flyingmoose (Priest) on Apr 30, 2004 at 14:07 UTC
This problem seems rather arbitrary ('i_DEL', etc) -- any chance this is a homework problem? Either way, the damage is done --- but if it's homework, it is always a good practice to announce that you are asking for help with homework when posting the question. If I'm wrong, accept my apologies.	[reply]
Re: Re: Combinatorics in Perl by wouldbewarrior (Acolyte) on May 20, 2004 at 04:09 UTC
While it does appear rather arbitrary, it wasn't a homework problem. I work at the Panasonic Speech Technology Research Lab in Santa Barbara (now PDNL for a number of asinine and annoying reasons) and we were in the process of developing a japanese recognizer (hence the Japanese exaMPLE), however, there was a problem with the input lexicon we were using to train the recognizer. The transcription lexicon did not reflect several important phonological idiosyncracies of japanese, which happen not to be predictable. The best solution for us was to tag the phonemes that were susceptible to the idiosyncracies and then produce all possible variations, as this ensured that the correct model would get into the recognizer. Unfortunately nobody had the perl expertise to do this, but once we realized this we thought we'd like to find out what clever solution might be waiting out there...	[reply]


Think about Loose Coupling
	PerlMonks