if i actually type a micron into the string using Alt-0181 then i get the following output...
Apparently, your editor is operating in ISO-Latin1 mode and is
entering the micron as a single byte (181 decimal = B5 hex).
You're then telling Perl that this string is UTF-8 (i.e. the
decode("utf8",$clob) statement from oshalla's code), which is incorrect.
For this reason, the conversion (silently) fails: B5 is a UTF-8
continuation byte, so it cannot start a valid UTF-8 sequence, and the
offending byte is replaced by the Unicode replacement character U+FFFD,
which when encoded as UTF-8 produces the three-byte sequence EF BF BD.
When you interpret/display those three bytes as ISO-Latin1 characters they appear
as "ï¿½", i.e. ï = EF, ¿ = BF, ½ = BD. This is how I (and I suppose
everyone else, too) see them in your post, because the PM site isn't
Unicode-aware. If your terminal displays those same three characters,
this just means it isn't Unicode-aware either...
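
If you want to see this for yourself, here is a minimal sketch using the core Encode module. It assumes the micron arrived as the single Latin-1 byte B5, as described above; the helper hex_bytes and the variable names are just made up for illustration and are not taken from oshalla's code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return uc join ' ', unpack '(H2)*', $bytes;
}

# The micron as a single ISO-Latin1 byte, which is what the editor stored.
my $latin1_bytes = "\xB5";

# Wrong: claim the byte is UTF-8. The lone B5 is malformed, so by default
# Encode substitutes U+FFFD for it instead of dying.
my $wrong     = decode("utf8", $latin1_bytes);
my $wrong_hex = hex_bytes(encode("UTF-8", $wrong));

# Right: decode with the encoding the byte is actually in.
my $right     = decode("iso-8859-1", $latin1_bytes);
my $right_hex = hex_bytes(encode("UTF-8", $right));

print "decoded as UTF-8,   re-encoded: $wrong_hex\n";   # EF BF BD
print "decoded as Latin-1, re-encoded: $right_hex\n";   # C2 B5
```

So decoding as iso-8859-1 (or saving the source file as UTF-8 in the first place) should give you the proper two-byte UTF-8 micron, C2 B5, rather than the replacement character.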
IOW, everything behaves as expected. :)