http://qs321.pair.com?node_id=1161620


in reply to Re^2: Comparing Lines within a Word List
in thread Comparing Lines within a Word List

Yes, you've described the approach correctly. It uses the shift function to take the word that is currently at the beginning of the array; then, if that word contains "r" or "s", it builds a regex and uses the grep function to look for matches among all the remaining words in the array.

One thing you didn't specify yet is what to do with sets like "cases / carer / caser / cares": Should the first one match all of the other three? Should the second one match both of the last two? Should the last two match each other? If the answer is "yes" on all points, then you'll want to create a different regex, which can be done using the split and map functions, and (my favorite from C) the "ternary" conditional operator:

    my $model = shift @words;
    my $regex = join( "", map{ ( /[rs]/ ) ? "[rs]" : $_ } split( /([rs])/, $model ));
    next if ( $regex eq $model );  # skip if model has no "r" or "s"
    my @hits = grep /^$regex$/, @words;
    ...

(BTW, maybe you already know, but /$regex/i (adding the "i" modifier at the end) does case-insensitive matches.)
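
To make the "cases / carer / caser / cares" question concrete, here is a minimal sketch that wraps the snippet in a loop and runs it on that set. It assumes the answer is "yes" on all three points. The sub names rs_pattern and rs_matches, the %hits_for hash, and the extra word "cabin" are made up for illustration and are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: build a pattern in which every "r" or "s" in the
# model word is replaced by the character class [rs].
sub rs_pattern {
    my ($model) = @_;
    # The capturing parens make split return the separators too,
    # so each "r" or "s" shows up as its own list element.
    return join "", map { /[rs]/ ? "[rs]" : $_ } split /([rs])/, $model;
}

# Hypothetical helper: for each word, collect the later words it matches.
sub rs_matches {
    my @words = @_;
    my %hits_for;
    while ( defined( my $model = shift @words ) ) {
        my $regex = rs_pattern($model);
        next if $regex eq $model;    # skip if model has no "r" or "s"
        my @hits = grep { /^$regex$/ } @words;
        $hits_for{$model} = \@hits if @hits;
    }
    return \%hits_for;
}

my $result = rs_matches(qw(cases carer caser cares cabin));
for my $model ( sort keys %$result ) {
    print "$model: @{ $result->{$model} }\n";
}
```

Because each model word is shifted off before the grep, every matching pair is reported only once: "cases" picks up the other three, "carer" picks up the last two, and "caser" picks up "cares". Words containing regex metacharacters would need quotemeta on the non-[rs] pieces; the sketch assumes plain alphabetic words.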

(updated to add a missing paren at the end of the second line in the snippet -- also added the anchors around $regex in the grep call)