Re^2: Safe string handling

by tdlewis77 (Sexton)
on Aug 26, 2017 at 00:44 UTC


in reply to Re: Safe string handling
in thread Safe string handling

This tool has been evolving over the course of several years. Every time I've encountered some weirdness that breaks it, I've enhanced it. I recently rewrote it from scratch to incorporate everything I learned along the way. Offhand, I can't point you to a single site that has all the weirdness in my "broken on purpose" example; however, I can tell you that I've encountered websites that mix things up in ways that were never intended. At this point, I think my tool handles everything I've ever encountered and is ready for anything I haven't yet encountered. Even if you've only encountered well-behaved websites, there is still no way to tell Perl to give you the sixth UTF-8 character from a string, as in the "$snowman" example.

Replies are listed 'Best First'.
Re^3: Safe string handling
by RonW (Parson) on Aug 28, 2017 at 22:08 UTC

    Can you give us URLs to some example websites?

      Betcha the OP is decoding entity references without first decoding UTF-8. That would produce the "mixed" encoding he's claiming to see.
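
      To illustrate the ordering problem described above, here is a minimal sketch. It is not the OP's tool: the input string is made up, and expand_entities is a hypothetical stand-in that only handles decimal numeric entities (a real program would more likely use something like HTML::Entities). The only library call used is decode from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the raw UTF-8 bytes for a snowman (U+2603),
# a space, and an HTML numeric entity naming the same character.
my $bytes = "\xE2\x98\x83 &#9731;";

# Hypothetical helper: replace decimal numeric entities with the
# characters they name. Stands in for a real entity decoder.
sub expand_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

# Wrong order: entities are expanded while the string is still raw
# bytes, so one string ends up holding three undecoded UTF-8 bytes
# alongside a real U+2603 character.
my $mixed = expand_entities($bytes);

# Right order: decode the bytes into characters first, then expand.
my $clean = expand_entities(decode('UTF-8', $bytes));

printf "mixed: %d chars, clean: %d chars\n", length($mixed), length($clean);
```

      In the first case the snowman that arrived as bytes occupies three positions and the one that arrived as an entity occupies one, which is the kind of "mixed" string that character indexing then trips over. In the second case both snowmen are single characters.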
