http://qs321.pair.com?node_id=376805

december has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Could someone please explain the following output?

#!/usr/bin/perl -w
use strict;
use Encode;
use Data::Dumper;

my $string1 = "blëh";
my $string2 = "blëhh";
my $string3 = "blëh.txt";

$string1 = Encode::decode(utf8 => $string1);
$string2 = Encode::decode(utf8 => $string2);
$string3 = Encode::decode(utf8 => $string3);

$Data::Dumper::Useqq = 1;
print Dumper $string1, $string2, $string3;

print "matches1\n" if ($string1 =~ /^[\w\s.]+$/);
print "matches2\n" if ($string2 =~ /^[\p{Word}]+$/);
print "matches3\n" if ($string3 =~ /^[\p{L}\p{M}\p{N}.]+$/);

##### output #####
$VAR1 = "bl";
$VAR2 = "bl\x{fffd}hh";
$VAR3 = "bl\x{fffd}h.txt";
matches1

(Perl version is 5.8.4)

As far as I can see, $string1 should not match but does (look at the weird Dumper output), while $string2 and $string3 should match but don't.