I'm not entirely sure what you are aiming at here. If you want to match a string that has the same 4-character substring at both the beginning and the end, this should do the trick:
/^(.{4}).*\1$/ # ^ and $ match beginning and end of string respectively
However, it's not clear what you mean by saying you want to match "similar circumstances in the middle". If that means that the same substring also appears somewhere in the middle, you could use:
/^(.{4}).*\1.*\1$/
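Here is a small sketch of how the two regexps behave. The sample strings and the sub names same_ends and repeated_in_middle are made up for illustration; they are not from your question. Note that the captured text and its repeats cannot overlap, so the first regexp needs at least 8 characters and the second at least 12.

```perl
use strict;
use warnings;

# True if the first four characters recur at the very end of the string.
sub same_ends {
    my $s = shift;
    return $s =~ /^(.{4}).*\1$/ ? 1 : 0;
}

# True if the first four characters recur in the middle and at the end.
sub repeated_in_middle {
    my $s = shift;
    return $s =~ /^(.{4}).*\1.*\1$/ ? 1 : 0;
}

# Hypothetical sample strings, chosen only to exercise both regexps.
for my $s ( "ABCDxxxxABCD", "ABCDxxABCDxxABCD", "ABCDxxxxABCE", "ABCD" ) {
    printf "%-18s same ends: %d, also in the middle: %d\n",
        $s, same_ends($s), repeated_in_middle($s);
}
```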
As for the rest, it would have been much easier if you had posted sample input data so we'd know what you meant by allowing for a miss. I'll assume that you have a string of four characters and want three or more of the characters to match, and that length and order matter. Like this:
ABCD # pattern
ABCD # match
ACDB # no match
ABED # match
BBCD # match
ABC # no match
The most obvious way to do it I could think of is to use a regexp like this:
/(.BCD|A.CD|AB.D|ABC.)/. The following sub builds this pattern for you:
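sub makepattern
{
    my $str = shift;
    my @res;
    # replace each position in turn with "." (any single character)
    for my $i ( 0 .. length( $str ) - 1 ) {
        # quotemeta keeps any regexp metacharacters in $str literal
        push @res, quotemeta( substr( $str, 0, $i ) ) . "." . quotemeta( substr( $str, $i + 1 ) );
    }
    return join( "|", @res );
}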
# example
my $imprecisematch = "ABCD";
my $pattern = makepattern( $imprecisematch );
# match $_ against three occurrences; note that \1 repeats the exact text the first group matched
/^($pattern).*\1.*\1$/;
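To check the idea against the sample table above, here is a self-contained sketch. It assumes the interpretation I guessed at (one miss allowed, length and order matter); the quotemeta calls, the ^...$ anchoring and the variable names $same_text and $each_own are additions for this example. It also shows the difference between repeating \1 (the same text three times) and repeating the alternation (each occurrence may have its own miss), in case the latter is what you are after.

```perl
use strict;
use warnings;

# Build an alternation in which each position in turn is replaced by ".".
sub makepattern {
    my $str = shift;
    my @res;
    for my $i ( 0 .. length($str) - 1 ) {
        push @res,
            quotemeta( substr( $str, 0, $i ) ) . "."
          . quotemeta( substr( $str, $i + 1 ) );
    }
    return join( "|", @res );
}

my $pattern = makepattern("ABCD");    # .BCD|A.CD|AB.D|ABC.

# Anchor the alternation so that the length has to fit as well.
for my $candidate (qw( ABCD ACDB ABED BBCD ABC )) {
    my $verdict = $candidate =~ /^(?:$pattern)$/ ? "match" : "no match";
    print "$candidate # $verdict\n";
}

# \1 demands the very same text three times ...
my $same_text = qr/^($pattern).*\1.*\1$/;
# ... while repeating the alternation lets each occurrence miss differently.
my $each_own = qr/^(?:$pattern).*(?:$pattern).*(?:$pattern)$/;

for my $s ( "ABEDxxABEDxxABED", "ABCDxxABEDxxBBCD" ) {
    printf "%s same text: %s, each its own miss: %s\n", $s,
        ( $s =~ $same_text ? "yes" : "no" ),
        ( $s =~ $each_own  ? "yes" : "no" );
}
```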
Cheers,
--Moodster