PerlMonks  

Re: Control Characters (\xNN) in HTML

by blackmateria (Chaplain)
on Oct 18, 2001 at 20:52 UTC


in reply to Control Characters (\xNN) in HTML

As tommyw and scain said, you have to escape the backslash. Also, \d only matches decimal digits, so it won't match the hex digits 'A'-'F'. If you need to match those (looks like you do from the sprintf), you can use [[:xdigit:]] instead.
s/\\x([[:xdigit:]]+)/'&#'.hex($1).';'/eg
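To see the difference between \d and [[:xdigit:]], here is a minimal sketch. The sample string is made up for illustration, and it assumes the input holds the literal text \xA9 (a backslash, an x, and two hex digits) rather than the byte itself.

```perl
use strict;
use warnings;

# Hypothetical input: single quotes keep the backslash literal,
# so this is the six characters a \ x A 9 b.
my $text = 'a\xA9b';

# \d+ needs a decimal digit right after the x; 'A' is not one, so nothing matches.
(my $with_d = $text) =~ s/\\x(\d+)/'&#'.hex($1).';'/eg;

# [[:xdigit:]]+ accepts 0-9, A-F and a-f, so "A9" is captured and converted.
(my $with_xdigit = $text) =~ s/\\x([[:xdigit:]]+)/'&#'.hex($1).';'/eg;

print "$with_d\n";        # a\xA9b   (unchanged)
print "$with_xdigit\n";   # a&#169;b
```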
Btw, if you only want 2 digits (so that stuff like "\x92Efficiency" doesn't confuse the regex, since the + would also swallow the "Ef"), use {2} instead of +. If you want to match one or two, use {1,2}. If you can, though, you're probably better off matching an exact number rather than a range.
s/\\x([[:xdigit:]]{2})/'&#'.hex($1).';'/eg
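A quick sketch of why the quantifier matters, using the "\x92Efficiency" example from above (again assuming the text contains a literal backslash and x, not the byte 0x92):

```perl
use strict;
use warnings;

# Single quotes: a literal backslash, x, 9, 2, followed by "Efficiency".
my $text = '\x92Efficiency';

# With +, the match runs on through "Ef" because E and f are hex digits too.
(my $greedy = $text) =~ s/\\x([[:xdigit:]]+)/'&#'.hex($1).';'/eg;

# With {2}, only the two digits after \x are taken.
(my $exact = $text) =~ s/\\x([[:xdigit:]]{2})/'&#'.hex($1).';'/eg;

print "$greedy\n";   # &#37615;iciency   (hex("92Ef") is 37615)
print "$exact\n";    # &#146;Efficiency
```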
Hope this helps!

Update: Oops, I just read your reply to tommyw above. Since the regex has to match the actual characters rather than literal \xNN text, all you need is a character range, combined with ord (not hex).

s/([\x80-\xFF])/'&#'.ord($1).';'/eg
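Here is a small sketch of that substitution in action. The sample string is invented; the double quotes mean \x92 and \xB0 are real single bytes in the data, which is the case the range is meant for.

```perl
use strict;
use warnings;

# Hypothetical input: double quotes turn \x92 and \xB0 into the actual bytes.
my $text = "It\x92s 5\xB0 outside";

# Replace every byte from 0x80 to 0xFF with its decimal character reference.
# ord() gives the numeric value of the matched character itself.
(my $html = $text) =~ s/([\x80-\xFF])/'&#'.ord($1).';'/eg;

print "$html\n";   # It&#146;s 5&#176; outside
```

ord is the right function here because $1 holds the character itself; hex would only make sense if $1 held hex digits as text.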

Re: Re: Control Characters (\xNN) in HTML
by tommyw (Hermit) on Oct 18, 2001 at 22:24 UTC

    Pah! Updating your answer based on a reply to my message. Are there no depths to the plagiarism people will stoop to? :-)

    In an attempt to retaliate, allow me to offer:

    s/([^[:print:]])/'&#'.ord($1).';'/eg
    in return.

Re: Re: Control Characters (\xNN) in HTML
by garliqua (Novice) on Oct 21, 2001 at 02:51 UTC

    Thanks to everyone who replied. I appreciate it.

    I ended up going with blackmateria's solution:

    s/([\x80-\xFF])/'&#'.ord($1).';'/eg

    I went with it because it kept the scope of the substitutions neatly to just the things I wanted to replace. tommyw's followup using [:print:], as I understand it from page 80 of the owl book at least, would also have performed replacements on tabs and other whitespace characters besides the plain space.
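A rough side-by-side of the two substitutions, on a made-up string containing a tab and the byte 0x92, shows the difference in scope:

```perl
use strict;
use warnings;

# Hypothetical input: a real tab and a real 0x92 byte.
my $text = "a\tb\x92c";

# The byte range leaves the tab alone.
(my $range_only = $text) =~ s/([\x80-\xFF])/'&#'.ord($1).';'/eg;

# [^[:print:]] also treats the tab as something to replace.
(my $non_print = $text) =~ s/([^[:print:]])/'&#'.ord($1).';'/eg;

print "$range_only\n";   # a<TAB>b&#146;c  (the tab is still a real tab)
print "$non_print\n";    # a&#9;b&#146;c
```

Newlines would get the same treatment as the tab under [^[:print:]], which matters if the text spans several lines.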

    Thanks again to all y'all!
