http://qs321.pair.com?node_id=630448


in reply to To find Cyrillic characters - unicode

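The regex Unicode block property '\p{InCyrillic}' will get you what you want. Note the lowercase 'p': the uppercase form, '\P{InCyrillic}', is the negation and matches every character outside the Cyrillic block. You may need to open the file in ':utf8' mode (or the stricter ':encoding(UTF-8)') so that perl decodes the bytes into characters before the regex sees them.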
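Here is a minimal sketch of that match. The helper name cyrillic_runs and the sample strings are made up for illustration; with real data you would read lines from a handle opened with something like open my $fh, '<:encoding(UTF-8)', $path.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of consecutive Cyrillic-block
# characters (U+0400..U+04FF) found in a decoded character string.
sub cyrillic_runs {
    my ($text) = @_;
    my @runs = $text =~ /(\p{InCyrillic}+)/g;
    return @runs;
}

# Sample data written with \x{...} escapes so the script itself stays ASCII.
my @lines = (
    'plain ascii',
    "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}, world",
);

# Encode output as UTF-8 to avoid "Wide character" warnings.
binmode STDOUT, ':encoding(UTF-8)';

for my $line (@lines) {
    my @runs = cyrillic_runs($line);
    print "$line => @runs\n" if @runs;
}
```

Only the second sample line should be reported. If you want the script rather than the block (which also covers the Cyrillic supplement ranges), '\p{Cyrillic}' is the property to try instead.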

Isolating your match to particular XML elements will require one of the XML modules. The parser ought to hand you the text as utf8 character strings by default, but old perls may be idiosyncratic about that.
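As a rough sketch of the XML side, assuming XML::LibXML from CPAN is installed (any of the other XML modules would do as well): the element name 'title', the sample document, and the helper cyrillic_elements are all invented for the example.

```perl
use strict;
use warnings;
use XML::LibXML;    # CPAN module, not in core; an assumption of this sketch

# Sample document; the Cyrillic text is given as numeric character
# references so the source stays ASCII.
my $xml = '<doc>'
        . '<title>&#x41F;&#x440;&#x438;&#x432;&#x435;&#x442;</title>'
        . '<title>Hello</title>'
        . '<note>&#x434;&#x430;</note>'
        . '</doc>';

# Hypothetical helper: text of every element selected by $xpath whose
# content contains at least one Cyrillic-block character.
sub cyrillic_elements {
    my ($xml_string, $xpath) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    my @texts = map { $_->textContent } $doc->findnodes($xpath);
    return grep { /\p{InCyrillic}/ } @texts;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for cyrillic_elements($xml, '//title');
```

The XPath restricts the search to the elements you care about, so the Cyrillic text inside 'note' is ignored here.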

After Compline,
Zaxo