Are you sure the string is actually Unicode? I think every byte of your input was interpreted as a separate character, instead of the input being decoded as a Unicode string. I'm not sure how to convert from one to the other.
Maybe it depends on some OS settings?
I doubt it. Try the following code.
# $str = Approximation of "Eric" in Katakana.
$str = "\x{30A8}\x{30FC}\x{30EA}\x{30AF}";
print(length($str), "\n"); # 4
$ch = substr($str, 0, 1);
print(($ch eq "\x{30A8}" ? "equal" : "not equal"), "\n"); # equal
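
On the earlier question of converting from one to the other: here is a minimal sketch, assuming the input arrived as raw UTF-8 bytes (that encoding is my assumption; the original poster never said what the input was). It uses the core Encode module. The names $bytes, $chars and $round_trip are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The same four Katakana characters, but as raw UTF-8 bytes (3 bytes each).
my $bytes = "\xE3\x82\xA8\xE3\x83\xBC\xE3\x83\xAA\xE3\x82\xAF";
print(length($bytes), "\n"); # 12, every byte is counted as a character

# decode() turns the byte string into a character string.
my $chars = decode('UTF-8', $bytes);
print(length($chars), "\n"); # 4
print((substr($chars, 0, 1) eq "\x{30A8}" ? "equal" : "not equal"), "\n"); # equal

# encode() goes the other way, from characters back to bytes.
my $round_trip = encode('UTF-8', $chars);
print(($round_trip eq $bytes ? "same bytes" : "different bytes"), "\n"); # same bytes
```

If length() reports three times what you expect for text like this, the string is probably still undecoded bytes, and decoding it at the point where it is read in should be enough.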