PerlMonks |
This thread is getting long, and I have a couple screenfuls of comments, output, and source, which I'll put between readmore tags to save scrollfingers...
"This is a pretty straightforward way to deal with Unicode and UTF-8."
With respect, I don't think the solution you give is gonna do the trick. Let me add that I'm no Windows guru, but I find myself unable to replicate your result on the platform OP has. The reason I looked at this thread is that I'm breaking in my new Windows laptop with Strawberry Perl, and I wanted to see if I could do the basic things that OP seeks. In my experience, when working with Russian there is a marsh of mojibake to wade through before results obtain.

"Obviously, You need to save your source code UTF-8 encoded."

Is this a thing? To my understanding, it is the opinion of the software which opens the file as to what its encoding is. In the file properties of the script I post here, there is no such option.

This is output from a version of the script that shows the data in different formats. I'm gonna try pre tags here:

    C:\Users\tblaz\Documents\evelyn>perl 2.cyr.pl
    -------
    {
      name => "\x{411}\x{418}\x{411}\x{41B}\x{418}\x{41E}\x{422}\x{415}\x{41A}\x{410}\x{420}\x{42C}",
      recentactions => 38,
      userid => 1686692,
    },
    {
      name => "\x{411}\x{430}\x{431}\x{43A}\x{438}\x{43D}\x{44A} \x{41C}\x{438}\x{445}\x{430}\x{438}\x{43B}\x{44A}",
      recentactions => 144,
      userid => 2208294,
    },
    {
      name => "\x{411}\x{430}\x{434}\x{43C}\x{430} \x{425}\x{430}\x{440}\x{43B}\x{443}\x{435}\x{432}\x{430}",
      recentactions => 4,
      userid => 2587115,
    },
    -------
    $VAR1 = {
      'recentactions' => 38,
      'userid' => 1686692,
      'name' => "\x{411}\x{418}\x{411}\x{41b}\x{418}\x{41e}\x{422}\x{415}\x{41a}\x{410}\x{420}\x{42c}"
    },
    {
      'name' => "\x{411}\x{430}\x{431}\x{43a}\x{438}\x{43d}\x{44a} \x{41c}\x{438}\x{445}\x{430}\x{438}\x{43b}\x{44a}",
      'recentactions' => 144,
      'userid' => 2208294
    },
    {
      'name' => "\x{411}\x{430}\x{434}\x{43c}\x{430} \x{425}\x{430}\x{440}\x{43b}\x{443}\x{435}\x{432}\x{430}",
      'recentactions' => 4,
      'userid' => 2587115
    }
    ;
    -------
    Content-Type: text/html; charset=utf-8

    <!DOCTYPE html>
    <html>
    <head>
    <meta charset="UTF-8">
    <title>Мой тест</title>
    </head>
    <body>
    Бабкинъ Михаилъ
    </body>
    </html>

Source that produced this:

"You must check whether the JSON data might, in some circumstances, contain characters which have a special meaning in HTML, in particular < and &."

I have tried this script both with and without your changes to the HTML display, yet the test still does not render. Meanwhile, I can read it fine in Notepad and Notepad++. Telling for me is what happened when I asked for a listing on STDOUT. I'll try this abbreviated and with code tags:

Edit: Showing source listing from haj's subroutine.
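On the quoted point about < and &: a minimal sketch of what such escaping could look like. This is an illustration only, not haj's subroutine and not the script discussed in this post. The name `escape_html` and the sample string are hypothetical, and the sketch assumes the source file itself is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;    # assumption: this file is saved as UTF-8, so literals are decoded

# Hypothetical helper: replace the characters that are special in HTML.
# The ampersand is handled first so that the entities produced by the
# later substitutions are not escaped a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Encode character strings as UTF-8 on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print escape_html('Бабкинъ <Михаилъ> & Co'), "\n";
```

Escaping leaves the Cyrillic characters alone, so it would not by itself explain text that fails to render; it only guards against data that happens to contain markup characters.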
What I see is that "My test" does not even render here. To my eye, he has all of the Russian on the hook with his data queries; it's just not getting represented correctly on the terminal that Strawberry Perl gives you. His install might be as fresh out of the box as mine. To illustrate what I think is going on, I created a smaller script:
Source listing:
This one line might best be represented with a p tag:

say "Привет";

Anyways, it seems like there's some wonky IO layer going on here...

Thanks all for interesting comments,

In reply to Re^2: Proper Unicode handling in Perl by Aldebaran
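As a footnote to the "wonky IO layer" remark, here is a minimal sketch for separating the two things that can go wrong: what Perl holds in the string, and what reaches the terminal. It assumes the file is saved as UTF-8; `utf8_bytes_hex` is a hypothetical helper name. Whether the Windows console then displays those bytes correctly is a separate matter that may depend on its code page (for example, running `chcp 65001` first) and its font, which I have not verified on OP's setup.

```perl
use strict;
use warnings;
use utf8;                  # assumption: this file is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: the UTF-8 bytes a character string becomes on output,
# shown as hex so they can be inspected even on a terminal that garbles them.
sub utf8_bytes_hex {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $greeting = "Привет";

# Without an encoding layer on STDOUT, printing these characters
# triggers a "Wide character in print" warning.
binmode STDOUT, ':encoding(UTF-8)';

print length($greeting), "\n";            # number of characters, not bytes
print utf8_bytes_hex($greeting), "\n";    # D0 9F D1 80 D0 B8 D0 B2 D0 B5 D1 82
print "$greeting\n";
```

If the hex line is right but the last line is mojibake, the string and the encoding layer are fine and the problem sits between Perl's output and the console.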