If you use properly decoded strings (which you do for string literals, since use utf8; is in effect) and no locales, then \w, \d, etc. follow Unicode semantics, which means they match more than just the basic Latin letters and ASCII digits.
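As a rough illustration of the point above, here is a minimal sketch. The sample strings and variable names are my own, and it assumes Perl 5.14 or newer for the /a modifier and a source file saved as UTF-8:

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are decoded character strings

# Illustrative sample data (not from the original question).
my $digits = "٣٤٥";       # ARABIC-INDIC DIGITS THREE, FOUR, FIVE
my $word   = "привет";    # Cyrillic letters

# Default matching on decoded strings follows Unicode semantics.
my $unicode_digit = $digits =~ /^\d+$/ ? 1 : 0;
my $unicode_word  = $word   =~ /^\w+$/ ? 1 : 0;

# The /a modifier (Perl 5.14+) restricts \d and \w to ASCII.
my $ascii_digit = $digits =~ /^\d+$/a ? 1 : 0;
my $ascii_word  = $word   =~ /^\w+$/a ? 1 : 0;

print "Unicode \\d: $unicode_digit, ASCII-only \\d: $ascii_digit\n";
print "Unicode \\w: $unicode_word, ASCII-only \\w: $ascii_word\n";
```

If you only want to accept 0-9, either use /a as in the sketch or spell out [0-9] explicitly.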
I'm not very familiar with locales, but my guess is that locale-aware matching expects the strings to be non-decoded binary strings in the encoding specified by the locale (here: UTF-8), so it might work without the utf8 pragma.
In general I recommend against locales if you can avoid them. In my experience they are always a source of trouble, and they don't deliver the promised "do what I mean" effect.
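One locale-free alternative is to decode input explicitly at the program's boundaries. The following is only a sketch under my own assumptions: the byte string is made-up sample data standing in for something read from a file or socket, and it uses the core Encode module:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes for "é" (U+00E9); hypothetical input, e.g. from a file.
my $bytes = "\xC3\xA9";

# Decode once on input, so the rest of the program sees characters.
my $chars = decode('UTF-8', $bytes);

my $byte_len   = length $bytes;    # counts bytes
my $char_len   = length $chars;    # counts characters
my $chars_word = $chars =~ /^\w+$/ ? 1 : 0;    # Unicode semantics apply

# Encode again on output; this round-trips to the original bytes.
my $round_trip = encode('UTF-8', $chars);

print "bytes: $byte_len, characters: $char_len, \\w matches: $chars_word\n";
```

With this approach the regex behaviour depends only on the decoded string, not on the environment the script happens to run in.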