PerlMonks  

evaling unicode perl source

by gildir (Pilgrim)
on Oct 08, 2001 at 17:53 UTC ( [id://117448]=perlquestion )

gildir has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I want to write a simple embedded Perl processor. No problem here: I'll write a pattern to separate the Perl code from the rest and call eval() on it. Now the problem: the file I evaluate is encoded in Unicode, either UTF-8 or UTF-16.

How do I evaluate UTF-16 Perl source? In the 'normal' case, "print 'foo'" will be encoded as "\0p\0r\0i\0n\0t\0 \0'\0f\0o\0o\0'", and that won't eval because every "\0" character will effectively end a string. Another problem is how to run a pattern on a UTF-8 or UTF-16 string.

Recoding the source to any 8-bit charset prior to evaling will not work. Some national characters could be lost during this conversion.

Replies are listed 'Best First'.
Re: evaling unicode perl source
by John M. Dlugosz (Monsignor) on Oct 08, 2001 at 20:20 UTC
    If you are running NT/2000, there is a Win32 API that will convert UTF-16 to UTF-8. It's not present in Win9x, though, and it has problems with its handling of illegal codes, so I have my own C++ function UCS2_to_UTF8 written in assembly language.

    UTF-8 is Perl's native mode. Use "use utf8" before the RE is parsed, and it will work just fine.
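
To make the idea concrete, here is a minimal sketch of decoding UTF-16 source into a Perl character string before handing it to eval. It assumes a Perl that ships the core Encode module (5.8 or later, which postdates this thread); the sample source text and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Build a UTF-16 sample in memory (a stand-in for a file read in binary mode).
# The source text has a non-ASCII character (U+00E9) inside a string literal.
my $source_text = "my \$greeting = 'caf\x{e9}'; length(\$greeting)";
my $utf16_bytes = encode('UTF-16', $source_text);   # BOM + big-endian octets

# Decode the raw octets into a Perl character string.
# The 'UTF-16' decoder uses the BOM to work out the byte order.
my $code = decode('UTF-16', $utf16_bytes);

# eval now sees characters rather than NUL-padded bytes.
my $result = eval $code;
die "eval failed: $@" if $@;

print "length = $result\n";   # prints "length = 4"
```

No 8-bit recoding is involved, so national characters in the source survive the round trip.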

    —John
    The Win32 Saint

      -=- MamuT -=-

      Is it the same on Unix systems like Solaris or Linux?

        Is "it" the same? If you mean will Perl swallow UTF-8 and handle UTF-8 sequences as single characters in REs, then yes.
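
A small sketch of what "single characters in REs" means in practice. It assumes the core Encode module; in current Perls, `use utf8` only declares that the source file itself is UTF-8, so data that arrives as bytes is decoded explicitly here. The sample word is just an illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for "naive" with a diaeresis on the i:
# U+00EF takes two bytes in UTF-8, 0xC3 0xAF.
my $bytes = "na\xc3\xafve";

# On the undecoded bytes, "." matches each byte separately.
my $byte_count = () = $bytes =~ /./g;

# After decoding, "." matches one whole character at a time.
my $text       = decode('UTF-8', $bytes);
my $char_count = () = $text =~ /./g;

print "bytes matched: $byte_count, characters matched: $char_count\n";
# prints "bytes matched: 6, characters matched: 5"

# The pattern can now refer to the character by its code point.
print "found i-diaeresis\n" if $text =~ /na\x{ef}ve/;
```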

        Is there a function in the OS to convert UCS-2 or UTF-16 into UTF-8? I don't know.

        Will my function work? Only on x86 machines.

        However, the reference implementation in the Unicode book is written in portable C and runs on anything.
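
For comparison, a portable conversion can also be sketched in Perl itself. This is not the reference C implementation mentioned above, only an illustration that assumes the core Encode module (Perl 5.8 or later); the helper name `utf16le_to_utf8` is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert UTF-16LE octets (no BOM) to UTF-8 octets.
sub utf16le_to_utf8 {
    my ($utf16_octets) = @_;

    # FB_CROAK asks Encode to die on malformed input instead of silently
    # substituting a replacement character. Work on a copy, because Encode
    # may modify its input when a CHECK argument is given.
    my $copy  = $utf16_octets;
    my $chars = decode('UTF-16LE', $copy, Encode::FB_CROAK);

    return encode('UTF-8', $chars);
}

# "A" (U+0041) followed by the euro sign (U+20AC), little-endian.
my $utf8 = utf16le_to_utf8("\x41\x00\xAC\x20");

# Show the resulting octets in hex.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
# prints "41 E2 82 AC"
```

Dying on bad input rather than passing it through is a design choice here, made with the "illegal codes" concern from earlier in the thread in mind.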

        —John
