comment on

ASCII strings may follow different paths through the code depending on whether the SVf_UTF8 flag is set, but the end results should be exactly the same. That makes it hard to maintain discipline as to whether the flag should be on or off, and in practice, you can't count on it being one way or the other.

If you have an all-Unicode application or subsystem, sometimes it makes sense to convert the string to an internal UTF8 representation at the boundary as it enters the subsystem, so that you don't have to continually run UTF-8 byte sequence validity checks to see whether the scalar is pure ASCII or contains high 8-byte code points. The easy way to do this is to turn the SVf_UTF8 flag on even if it's an ASCII string. One of my XS distros does this.

In reply to Re: Why is utf8 flag set after Encode::decode of pure ASCII? by creamygoodness
in thread Why is utf8 flag set after Encode::decode of pure ASCII? by brycen

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Welcome to the Monastery
	PerlMonks