PerlMonks
Hi haj,

Thanks once more for your long and helpful reply.

I have now tested a bit more, adopting your tip to use decoded_content. When I look at what LWP reads, I really do get the correct umlaut, which I can also see when I set binmode on the debugger's IO.

The problem seems to lie in the output of MIME::Lite::TT::HTML. Looking at the code, one can provide an input and an output charset. When you don't, MIME::Lite::TT::HTML assumes the data you provide is already in the correct charset :( So what I would need to do is provide the charset of Perl's internal strings, which I assume doesn't exist. I think I'll have to patch MIME::Lite::TT::HTML…

As you wrote:

"Now when you write the data, you need to encode it to UTF-8. I suppose (but didn't test right now) that MIME::Lite::TT::HTML does the right thing and encodes for you if you provide the Charset attribute on the constructor. =FC is QP-encoding for an ISO-8859-1 'ü' and indeed wrong here. So if you did provide Charset => 'utf8', then shout up, I'll write some tests."

So here is my shout-out. ;)

I assume the relevant part which needs to be patched is this: https://metacpan.org/release/MIME-Lite-TT-HTML/source/lib/MIME/Lite/TT/HTML.pm lines 115-117:
Here I would provide "something" for the internal Perl encoding. Maybe '*internal*'?

Starting at line 156, the code looks dubious: "remove_utf8_flag" does not seem correct after what I learned from you and others in my threads. And then the from_to encoding should be changed, I guess, to:
What do you think?

Update: I've created a patch which allows one to tell MIME::Lite::TT::HTML that the text provided ($charset_input) is in Perl's internal representation. With this in place, my script works as expected. Unfortunately, it seems the module is abandoned, as the issues opened for it are 12 years old :(

s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{% +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

In reply to Re^2: Lost in encodings
by Skeeve
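
For anyone following along, the =FC observation quoted above can be reproduced without MIME::Lite::TT::HTML at all. The following is only a minimal sketch using the core modules Encode and MIME::QuotedPrint; the sample string and variable names are made up for illustration, and it does not show what the module does internally. It takes a decoded Perl character string, encodes it once as ISO-8859-1 and once as UTF-8, and quoted-printable encodes both results.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::QuotedPrint qw(encode_qp);

# A decoded Perl character string containing 'ü' (U+00FC).
# Written with an escape so the example does not depend on the
# encoding of the source file. The sample text is hypothetical.
my $text = "f\x{FC}r";

# ISO-8859-1 represents 'ü' as the single byte 0xFC.
# The empty second argument to encode_qp suppresses soft line breaks.
my $qp_latin1 = encode_qp(encode('ISO-8859-1', $text), '');

# UTF-8 represents 'ü' as the two bytes 0xC3 0xBC.
my $qp_utf8 = encode_qp(encode('UTF-8', $text), '');

print "$qp_latin1\n";   # f=FCr
print "$qp_utf8\n";     # f=C3=BCr
```

If a mail is declared as UTF-8 but the body shows =FC, the string was most likely turned into bytes as Latin-1 somewhere along the way, or never explicitly encoded at all.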