comment on

soonix wrote the following, in reply to BillKSmith,

> if you use utf8 only for strings.... it's your decision

Actually, 1nickt wrote the code that BillKSmith was trying to run. It was not a decision on BillKSmith's part at all; he just downloaded code from perlmonks, expecting code from a longstanding monk to run without edits. And because perlmonks sends ISO-8859-1 encoding, not UTF-8, then code that is served as ISO-8859-1 will Save As a file encoded with ISO-8859-1. And then because there was a use utf8; in the code that perlmonks serves as ISO-8859-1, the perl executable gives the "Malformed UTF-8 character" message because of the mismatch between the file encoding and the pragma.

The best would be if perlmonks would serve posts and [download]s as UTF-8, or at least give us an option for it to do so. The next best is for the monk who [download]s the code to convert the file (whether by iconv or a perl oneliner¤ or by a text editor that can change a file's encoding) before running. The suggestion that requires the most effort so far would be for the monk who [download]s the code from perlmonks to have to search through every piece of code they download from perlmonks that has use utf8; and check to make sure that the code isn't actually relying on it, and either commenting out that pragma if it's not actually needed (as I hinted at earlier) or changing every non-ASCII character in a quote from the actual character to a named character.

¤: oneliner = perl -pi -MEncode=encode,decode -e "$_ = encode('utf-8', decode('iso-8859-1', $_));" save-as.pl

In reply to Re^4: Malformed UTF-8 character by pryrt
in thread Malformed UTF-8 character by BillKSmith

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


P is for Practical
	PerlMonks