Good job tracking that down to the root cause!

When I wrote my previous response, I failed to check the version history of MIME::Lite::TT::HTML. Otherwise I would not have made the assumption that the module does the right thing. It does not, as you found out. The current release is from 2007 (Perl 5.10-ish), when Unicode support in Perl was still rather new and sometimes bumpy, module authors didn't have much experience with it, and not all CPAN modules supported it.

After having looked into the module's source code: The module works with all input in byte-encoded form. Today this is considered bad practice since it breaks a lot of Perl's string processing features, including those available from Template Toolkit. The module also assumes that the subject is encoded in the same encoding as the template files, which is even more questionable. So yes, patching (or subclassing) the module's methods encode_subject and encode_body would be the way to go. Filing an issue for the module would also be fine, but according to the current list of open issues it doesn't look like the author is still actively maintaining the module.

There is no keyword for Perl's internal encoding (because, by definition, these strings are decoded). So you could either invent one like *internal* or even use an undefined value as an indicator that your input should not be decoded. Your fix should do the trick if you want to go that path.
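To illustrate the "undefined means already decoded" idea, here is a minimal sketch. The helper name to_perl_string is made up for this example and is not part of MIME::Lite::TT::HTML; it only uses Encode::decode from core.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of MIME::Lite::TT::HTML):
# an undefined $charset_input means "this is already a Perl character string".
sub to_perl_string {
    my ($text, $charset_input) = @_;
    return $text unless defined $charset_input;   # nothing to decode
    return decode($charset_input, $text);         # octets -> characters
}

# The same text, once as UTF-8 octets and once as decoded characters
my $from_bytes = to_perl_string("Gr\xC3\xBC\xC3\x9Fe", 'UTF-8');
my $from_chars = to_perl_string("Gr\x{FC}\x{DF}e",     undef);

print $from_bytes eq $from_chars ? "same string\n" : "different strings\n";
```

Either way the caller ends up with a character string, so everything downstream can treat both inputs alike.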

remove_utf8_flag is indeed scary and another example of an attempt to achieve cancellation of errors. I am pretty sure that TT processing could result in this flag being set, even if the TT results are pure ASCII. Instead of re-evaluating his assumptions, the author just killed the flag to make the string fit his expectations. With current Perl you wouldn't get rid of the flag like that, and Encode::decode will happily decode strings which already have the flag set.
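A small sketch of why the flag is the wrong thing to look at: the same UTF-8 octets, with and without the flag, compare equal and decode to the same character. This uses only utf8::upgrade, utf8::is_utf8 and Encode::decode; the variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $octets   = "\xC3\xA4";       # UTF-8 octets for U+00E4
my $upgraded = $octets;
utf8::upgrade($upgraded);        # same two characters, but the flag is now set

my $flag_plain    = utf8::is_utf8($octets)   ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;

# The flag is an internal storage detail: both strings are equal ...
my $equal = $octets eq $upgraded ? 1 : 0;

# ... and both decode to the same single character.
my $decoded_plain    = decode('UTF-8', $octets);
my $decoded_upgraded = decode('UTF-8', $upgraded);

print "flags: $flag_plain/$flag_upgraded, equal: $equal, lengths: ",
      length($decoded_plain), "/", length($decoded_upgraded), "\n";
```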

Another alternative, with more coding but better alignment with current practice, would be to get rid of $charset_input and expect that the subject and the template parameters are Perl strings. You'd still need TT's ENCODING config because UTF-8 text in files needs decoding, and $charset_output is also still required because MIME::Lite explicitly says that it expects encoded strings.
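Roughly, that alternative boils down to "characters inside, octets only at the boundary". The following is a sketch under assumptions, not the module's actual code: the variable names are invented, the strings stand in for a decoded subject and a TT result, and Encode's MIME-Header encoding is used for the subject, which always produces UTF-8 encoded-words regardless of $charset_output.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $charset_output = 'UTF-8';                 # what the mail should be sent as

# Perl character strings, e.g. a subject and the result of TT processing
my $subject = "Gr\x{FC}\x{DF}e";
my $body    = "Sch\x{F6}ne Gr\x{FC}\x{DF}e\n";

# Encode only when handing the data over to the mail layer
my $subject_header = encode('MIME-Header', $subject);    # ASCII-only header value
my $body_octets    = encode($charset_output, $body);     # octets for the body part

print "Subject: $subject_header\n";
print "Body is ", length($body_octets), " octets\n";
```

The results would then go to MIME::Lite as already-encoded data, together with a matching charset declaration for the body part.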

In reply to Re^3: Lost in encodings by haj
in thread Lost in encodings by Skeeve

    As of 2021-01-24 01:05 GMT