PerlMonks  

Re^3: utf8 in perl

by hippo (Bishop)
on Jul 05, 2018 at 14:00 UTC [id://1217951]


in reply to Re^2: utf8 in perl
in thread utf8 in perl

I'm not quite sure how you are getting that output. Here's an SSCCE which you might be able to tailor to your requirements. I've removed all the CGI so you can just run this from the command line.

#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode decode);

my $subject = "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)";
my $decoded = decode ("MIME-Header", $subject);
print encode ("UTF-8", $decoded) . "\n";

Running this gives:

Room Rush â Enjoy 25% off on your stay. (raw: Room Rush – Enjoy 25% off on your stay.)

which demonstrates that we have successfully decoded the raw part of the header (and encoded it to UTF-8 for output). HTH.
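To make the decode/encode distinction concrete, here is a minimal sketch that is not part of the original post. It takes only the encoded-word from the subject above and compares the decoded character string with the UTF-8 octets produced for output. The variable names `$raw`, `$chars` and `$octets` are illustrative assumptions.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The encoded-word taken from the "raw" part of the subject above.
my $raw = "=?utf-8?b?4oCT?=";

# decode() turns the MIME header into a Perl character string.
my $chars = decode("MIME-Header", $raw);

# encode() turns that character string into UTF-8 octets for output.
my $octets = encode("UTF-8", $chars);

printf "characters: %d (U+%04X)\n", length($chars), ord($chars);
printf "octets:     %d\n", length($octets);
```

The decoded string holds a single character, while the encoded output is several bytes long, which is why the explicit encode step (or an output layer) is needed before printing.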

Re^4: utf8 in perl
by theravadamonk (Scribe) on Jul 05, 2018 at 16:57 UTC

    Thanks a lot for your wonderful code. I can now display some subjects containing non-ASCII characters, but there are still some that I cannot display.

    For example, I added the two subjects below to your code. It will not work. I can't think why. Can you try?

    my $subject = "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)";
    my $subject = "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)";

    I also tried the following exercise.

    I created a file, /tmp/test, with these contents:

    Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
    Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

    I ran the command below. It does not work either.

    # cat /tmp/test | perl -MEncode=decode -ne 'print (decode("MIME-Header", "$_"))'
    Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
    Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

    Any idea?

      It will not work. I can't think why.

      Because the MIME encoded-words are not properly terminated. GIGO. Each one should end with ?=, but the first has no terminator at all and the second ends with a bare ? instead. Once they are properly terminated, they decode fine.

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use Encode qw(encode decode);

      my @subject = ("Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
          "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards?=)",
          "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?=)");

      for my $subj (@subject) {
          my $decoded = decode ("MIME-Header", $subj);
          print encode ("UTF-8", $decoded) . "\n";
      }
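Since the failure comes from a missing ?= terminator, it may help to detect such input before decoding. The following is a rough sketch, not part of the original reply: the helper name `has_unterminated_encoded_word` is hypothetical, and the regular expressions are a deliberate simplification of the encoded-word syntax rather than a full RFC 2047 parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $header contains an encoded-word opener
# ("=?charset?B?" or "=?charset?Q?") that is never closed with "?=".
sub has_unterminated_encoded_word {
    my ($header) = @_;

    # Strip every well-formed encoded-word first ...
    (my $rest = $header) =~ s/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=//g;

    # ... then anything that still looks like an opener was never terminated.
    return $rest =~ /=\?[^?\s]+\?[BbQq]\?/ ? 1 : 0;
}

for my $subj ('(raw: =?utf-8?Q?Pegasus=20Reef=20Hotel=21?)',
              '(raw: =?utf-8?Q?Pegasus=20Reef=20Hotel=21?=)') {
    printf "%s  %s\n",
        has_unterminated_encoded_word($subj) ? "BROKEN" : "ok    ", $subj;
}
```

A check like this only flags the problem. Whether to repair, skip or log such headers depends on where the data comes from.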

        Two days ago, I got an email with the subject below:

        my $subject = "=?GB18030?B?XXXXXXXXXX?=";

        Then, when I ran the code, I got the error below:

        Unknown encoding "GB18030" at /usr/lib64/perl5/Encode.pm line 174

        I searched for it and learned that GB 18030 is a Chinese government standard.

        I found the two URLs below:

        https://perldoc.perl.org/Encode/CN.html

        https://stackoverflow.com/questions/6105316/how-to-convert-from-gbk-encoding-to-utf-8-encoding-in-perl

        They talk about Encode::HanExtra.

        So I added the line below to my code, and it started working.

        use Encode::HanExtra;
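For cases where the extra module is not installed, the following sketch (an addition, not from the original post) shows one possible way to check whether a charset is known and to avoid dying on an unknown one. The helper names `charset_available` and `safe_decode_header` are hypothetical; `find_encoding` and `decode` are the standard Encode functions. Falling back to the raw header text is an assumption about what the caller wants.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical helper: true if the installed Encode knows this charset.
sub charset_available {
    my ($name) = @_;
    return defined(find_encoding($name)) ? 1 : 0;
}

# Hypothetical helper: decode a MIME header, returning the raw text
# unchanged if decoding dies (for example on an unknown charset).
sub safe_decode_header {
    my ($raw) = @_;
    my $decoded = eval { decode("MIME-Header", $raw) };
    return defined $decoded ? $decoded : $raw;
}

for my $name ("UTF-8", "GB18030") {
    printf "%-8s %s\n", $name,
        charset_available($name) ? "available" : "not available";
}
print safe_decode_header("How are you"), "\n";
```

Whether GB18030 is reported as available depends on which Encode modules are installed and loaded on the machine running the script.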

        So here's my updated code. I think it's worth sharing.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Encode qw(encode decode);
        use Encode::HanExtra;
        no warnings 'utf8';

        my @subject = ("Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
            "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards?=)",
            "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?=)",
            "RE: Weekly Report (CommercialLegal) -\303\202\302\240 2 July 2018 to 5 July 2018 (raw: =?UTF-8?Q?RE:_Weekly_Report_=28CommercialLeg?=\t=?UTF-8?Q?al=29_-=C2=A0_2_July_2018_to_5_July_201?=\t=)",
            "How are you",
            "=?GB18030?B?XXXXXXXXXX?=");

        for my $subj (@subject) {
            my $subject_decoded = decode("MIME-Header", $subj);
            print $subject_decoded;
            print "\n";
        }
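The code above prints decoded character strings directly and uses `no warnings 'utf8'` to silence the resulting warning. As an alternative sketch (not the poster's code), the output handle can be given a UTF-8 encoding layer so that the characters are encoded on the way out. The helper name `print_decoded` is hypothetical. It takes the filehandle as an argument only so that it can be pointed at something other than STDOUT.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode each MIME header and print it to $fh,
# relying on the handle's encoding layer to produce UTF-8 octets.
sub print_decoded {
    my ($fh, @subjects) = @_;
    for my $subj (@subjects) {
        print {$fh} decode("MIME-Header", $subj), "\n";
    }
}

# Encode on output instead of suppressing the warning.
binmode(STDOUT, ':encoding(UTF-8)');
print_decoded(\*STDOUT, "How are you", "Room Rush =?utf-8?b?4oCT?= Enjoy 25% off");
```

With the layer in place, no explicit encode call is needed before each print.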

        Have a nice day, all Perl Monks.

        Thanks for enlightening me. I was busy, so it took me a long time to reply. Sorry about that. Thanks, everyone, for your wonderful efforts.
