PerlMonks  

Re^2: utf8 in perl

by theravadamonk (Scribe)
on Jul 05, 2018 at 10:40 UTC [id://1217941]


in reply to Re: utf8 in perl
in thread utf8 in perl

Hi, thanks for directing me to a good source. I wrote a simple Perl script; here is my code.

#!/usr/bin/perl
use CGI ':standard';
use strict;
use warnings;
use CGI::Carp 'fatalsToBrowser'; # use only for testing
use Encode qw(encode decode);
no warnings 'utf8';

print "Content-Type: text/html; charset=utf-8\n\n";

my $subject = "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)";

$subject =~ s/[^[:ascii:]]+//g; # get rid of non-ASCII characters

my $subject_decoded = decode("MIME-Header", $subject);
#my $subject_decoded = decode("MIME-B", $subject);
#my $subject_decoded = decode("MIME-Q", $subject);

print "\n";
print "<br/>";
print "subject: $subject \n\n";
print "<br/>";
print "subject_decoded: $subject_decoded \n\n";

Here is what I get in the web browser:

subject: Room Rush Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)
subject_decoded: Room Rush Enjoy 25% off on your stay. (raw: Room Rush – Enjoy 25% off on your stay.)

But the text "raw:" still appears in the output. Why?

Is this code good? How can I improve it? I spent many hours writing it, since I am still learning Perl.

Replies are listed 'Best First'.
Re^3: utf8 in perl
by hippo (Bishop) on Jul 05, 2018 at 14:00 UTC

    I'm not quite sure how you are getting that output. Here's an SSCCE which you might be able to tailor to your requirements. I've removed all the CGI so you can just run this from the command line.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    my $subject = "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)";
    my $decoded = decode ("MIME-Header", $subject);
    print encode ("UTF-8", $decoded) . "\n";

    Running this gives:

    Room Rush â Enjoy 25% off on your stay. (raw: Room Rush – Enjoy 25% off on your stay.)
    

    which demonstrates that we have successfully decoded the raw part of the header (and encoded it to UTF-8 for output). HTH.
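To make the decode and encode steps more concrete, here is a minimal sketch that works on the raw part of the subject alone. It assumes a reasonably recent Encode; the variable names are chosen for illustration and do not come from the thread. It lists the non-ASCII characters that come out of the MIME decoding and compares the character count with the byte count after encoding to UTF-8.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode decode);

# Only the raw part of the subject: one encoded word (charset utf-8,
# base64 payload) surrounded by plain ASCII text.
my $raw = "Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.";

# decode() returns a string of Perl characters, not bytes.
my $decoded = decode("MIME-Header", $raw);

# Collect every non-ASCII character as a code point so that the result
# can be inspected without depending on the terminal's encoding.
my @codepoints = map { sprintf "U+%04X", ord($_) }
                 grep { ord($_) > 127 }
                 split //, $decoded;
print "non-ASCII characters: @codepoints\n";

# encode() turns the characters back into bytes for output.
my $bytes = encode("UTF-8", $decoded);
printf "%d characters, %d bytes\n", length($decoded), length($bytes);
```

For this input the list should contain U+2013 (EN DASH), and the byte count should be two higher than the character count, because that one character takes three bytes in UTF-8.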

      Thanks a lot for your wonderful code. I can now display some subjects that contain non-ASCII characters, but there are others that I still cannot display.

      For example, I tried the two subjects below by adding them to your code. It will NOT work. I can't think Why? Can you try?

      my $subject = "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)";
      my $subject = "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)";

      I also tried the following exercise.

      I created a file /tmp/test with these contents:

      Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
      Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

      I ran the command below. It does not work either; the output is identical to the input.

      # cat /tmp/test | perl -MEncode=decode -ne 'print (decode("MIME-Header", "$_"))'
      Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)
      Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)

      Any idea?

        It will NOT work. I can't think Why?

        Because the MIME-encoded parts are not properly terminated. GIGO. They should end with ?=. By changing them to ensure that they are properly terminated they work fine.

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use Encode qw(encode decode);

        my @subject = (
            "Room Rush \303\242\302\200\302\223 Enjoy 25% off on your stay. (raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
            "Last Day To Enjoy Extra 15% OFF On Everything For NDB Credit Cards (raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards?=)",
            "Sing Along and Dance with Desmond De Silva at Pegasus Reef Hotel! (raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?=)",
        );

        for my $subj (@subject) {
            my $decoded = decode ("MIME-Header", $subj);
            print encode ("UTF-8", $decoded) . "\n";
        }
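If such malformed subjects keep turning up, it may help to flag them before decoding. The following is a rough sketch only: `unterminated_encoded_words` is a hypothetical helper written for this illustration, not part of Encode, and its regex is a simplification of the RFC 2047 encoded-word grammar. It returns every `=?charset?B?...` or `=?charset?Q?...` opener whose payload is not closed by `?=`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode): list the encoded words in a
# header that start correctly but never reach the closing "?=".
sub unterminated_encoded_words {
    my ($header) = @_;
    my @bad;
    # $1 is the opener plus payload; the payload stops at the first "?"
    # or whitespace. $2 is the terminator, if one follows immediately.
    while ($header =~ /(=\?[^?\s]+\?[bBqQ]\?[^?\s]*)(\?=)?/g) {
        push @bad, $1 unless defined $2;
    }
    return @bad;
}

# The raw parts of the three subjects from this thread, as originally posted.
my @subjects = (
    "(raw: Room Rush =?utf-8?b?4oCT?= Enjoy 25% off on your stay.)",
    "(raw: =?utf-8?Q?Last=20Day=20To=20Enjoy=20Extra=2015%=20OFF=20On=20Everything=20For=20NDB=20Credit=20Cards)",
    "(raw: =?utf-8?Q?Sing=20Along=20and=20Dance=20with=20Desmond=20De=20Silva=20at=20Pegasus=20Reef=20Hotel=21?)",
);

for my $subj (@subjects) {
    my @bad = unterminated_encoded_words($subj);
    print @bad ? "NOT TERMINATED: $bad[0]\n" : "ok: $subj\n";
}
```

With the subjects as originally posted, the first one passes and the other two are flagged, which matches the behaviour seen with decode. The check only reports the problem; how to repair or reject such input is a separate decision.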
Re^3: utf8 in perl
by Corion (Patriarch) on Jul 05, 2018 at 10:44 UTC

    Maybe I misunderstand you, but raw: is in the original input string too.
