PerlMonks  

Regexp to convert high-bit (?) characters to character entities

by hadleyw (Initiate)
on Jul 10, 2002 at 04:34 UTC

hadleyw has asked for the wisdom of the Perl Monks concerning the following question:

I need a regexp that will convert high-bit characters (e.g. > \0377 ?) to the appropriate character entity (e.g. &#x123;). I'm having a bit of a mental blank and wondered if someone could give me some ideas to get started.

Thanks

Hadley

Replies are listed 'Best First'.
Re: Regexp to convert high-bit (?) characters to character entities
by epoptai (Curate) on Jul 10, 2002 at 04:51 UTC
Re: Regexp to convert high-bit (?) characters to character entities
by Juerd (Abbot) on Jul 10, 2002 at 06:48 UTC

    I need a regexp that will convert high-bit characters (e.g. > \0377 ?) to the appropriate character entity (e.g. &#x123;).

    No, you do not need a regexp that will do that. There's a very nice module that does HTML entities: HTML::Entities.
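As a rough illustration of the module route, the sketch below calls `encode_entities` and `encode_entities_numeric` from HTML::Entities (a CPAN module, not part of core Perl). The sample string and the choice of "unsafe" character set are assumptions made for this example; the set shown encodes everything except newline and printable ASCII, so it leaves `&`, `<` and `>` alone.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities encode_entities_numeric);

# Sample character string (an assumption for this sketch):
# e-acute is U+00E9, the euro sign is U+20AC.
my $text = "caf\x{E9} costs \x{20AC}5";

# The second argument is the body of a character class. This one means
# "anything that is not a newline or printable ASCII".
my $unsafe = '^\n\x20-\x7e';

# Named entities where the module knows a name for the character.
my $named = encode_entities($text, $unsafe);

# Hexadecimal numeric references for every encoded character.
my $numeric = encode_entities_numeric($text, $unsafe);

print "$named\n";
print "$numeric\n";
```

If only numeric references are wanted, as in the original question, `encode_entities_numeric` is probably the closer fit of the two.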

    When I'm in a hurry, I often use s/(\W)/'&#' . ord($1) . ';'/ge for dumping data, because it is easy to convert back to the original, and encoding printable \W characters does no harm.
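A minimal sketch of that quick substitution and its reverse is shown here. The helper names `quick_encode` and `quick_decode` are made up for this example. The `/e` modifier is what makes Perl evaluate the replacement as an expression; without it the replacement text is inserted literally. The sample string is ASCII only, because whether `\W` matches an accented letter depends on how Perl is treating the string.

```perl
use strict;
use warnings;

# Hypothetical helper: encode every non-word character as a decimal
# numeric character reference.
sub quick_encode {
    my ($s) = @_;
    $s =~ s/(\W)/'&#' . ord($1) . ';'/ge;
    return $s;
}

# Hypothetical helper: turn each decimal reference back into its character.
# This is safe on quick_encode output because '&', '#' and ';' are all
# non-word characters and were themselves encoded.
sub quick_decode {
    my ($s) = @_;
    $s =~ s/&#(\d+);/chr($1)/ge;
    return $s;
}

my $original = "a < b & c\n";
my $encoded  = quick_encode($original);
my $decoded  = quick_decode($encoded);

print "$encoded\n";
print $decoded eq $original ? "round trip ok\n" : "round trip failed\n";
```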

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

Re: Regexp to convert high-bit (?) characters to character entities
by dakkar (Hermit) on Jul 10, 2002 at 16:12 UTC

    I have written a little script that does something similar. I started from HTML::Entities, but since I store my documents in UTF-8 and that module assumes ISO-8859-1, it didn't work.

    So I converted the module's hash (characters to entity names) to UTF-8 using iconv, added a switch, and it now works. You can find it served from my PC (it's a DynDNS name, so it may sometimes be offline).
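The encoding mismatch described above can be sketched with core modules only. This is not dakkar's script, which is not reproduced in the thread; it is an alternative that decodes the UTF-8 bytes into characters with `Encode::decode` before substituting. The helper name `to_numeric_refs` and the sample bytes are assumptions for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for "cafe" with e-acute, a space, and the euro sign,
# as they might arrive from a file opened without a decoding layer.
my $bytes = "caf\xC3\xA9 \xE2\x82\xAC";

# Hypothetical helper: replace every non-ASCII character with a
# hexadecimal numeric character reference.
sub to_numeric_refs {
    my ($s) = @_;
    $s =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord $1)/ge;
    return $s;
}

# Applied to undecoded bytes, each byte of a multi-byte sequence
# becomes its own (wrong) reference.
my $wrong = to_numeric_refs($bytes);

# Decoding first yields one reference per character.
my $right = to_numeric_refs(decode('UTF-8', $bytes));

print "$wrong\n";
print "$right\n";
```

The same substitution, applied to a decoded string, is one way to answer the original question without a lookup table, at the cost of emitting numeric references rather than entity names.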

Node Type: perlquestion [id://180672]
Approved by epoptai