
Re^2: Problems with unicode properties in regular expressions under chroot (install)

by sgifford (Prior)
on May 10, 2013 at 14:38 UTC ( [id://1032968]=note )

in reply to Re: Problems with unicode properties in regular expressions under chroot (install)
in thread Problems with unicode properties in regular expressions under chroot

Thanks for your thoughts!

The problem with keeping a separate Perl installation for every chrooted program on your system is that maintenance becomes difficult. In particular, instead of relying on your distribution to tell you when Perl-related security updates are available, you now need to track all of those installations for security updates yourself. In my experience, the likelihood of getting that wrong outweighs the security advantages of using chroot in the first place.

At any rate, most chroot programs don't require large installations of software systems and libraries to work. For example, many programs chroot into /var/empty, so they have access to nothing at all. They just make sure to load up everything they need beforehand.

One of the reasons I like to use Perl is that generally I can follow this strategy: load all the resources up front, chroot into a minimal environment, then be confident that my security risks are minimized. This particular program has run that way for several years without any issues.

Really, what I would like is a way to load all of that Unicode stuff up front, or else to disable it for this program.
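To make the "load everything, then chroot" idea concrete, here is a minimal sketch rather than a tested recipe. The property list, the helper names preload_properties and enter_jail, and the sample string are all assumptions made up for illustration; the real list would be whatever properties the program's regexes actually use. Whether compiling and running a pattern is enough to pull in every table file depends on the Perl version.

```perl
use strict;
use warnings;

# Hypothetical list: the properties this particular program relies on.
my @PROPERTIES = qw(Print Alpha Digit Space);

# Compile and exercise one regex per property while the real filesystem
# is still visible, so any lazily loaded table files are read now.
sub preload_properties {
    my @names = @_;
    # The sample includes a character above U+00FF so the match has to
    # consult real Unicode data rather than a Latin-1 shortcut.
    my $sample = "a1 \x{263A}";
    my %compiled;
    for my $name (@names) {
        my $re = eval {
            my $r   = qr/\p{$name}/;
            my $hit = $sample =~ $r;    # force the lookup; result unused
            $r;
        };
        $compiled{$name} = $re if $re;
    }
    return \%compiled;
}

# Hypothetical helper: needs root, and should be followed by dropping
# privileges in a real program.
sub enter_jail {
    my ($dir) = @_;
    chroot($dir) or die "chroot $dir: $!";
    chdir('/')   or die "chdir /: $!";
    return 1;
}

my $loaded = preload_properties(@PROPERTIES);
print "preloaded: $_\n" for sort keys %$loaded;
# enter_jail('/var/empty');    # only meaningful when running as root
```

The chroot call is left commented out so the sketch can be run unprivileged to check which properties compile.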

Re^3: Problems with unicode properties in regular expressions under chroot (install)
by Anonymous Monk on May 10, 2013 at 23:51 UTC

    OK, let's see here :) Looking through my stuff I found "expand unicode property (eg \p{Print}) to regex character class range", and going by that, this seems to work:

    $ perl -Mutf8 -le " utf8->SWASHNEW(q/Print/) ; print for %INC"

    A cleaner (no warnings) version seems to be

    $ perl -le " qr{\P{Print}}; print for %INC; "

    So you might grab perluniprops and qr// up a storm or File::Find and require up a storm

    Any way you look at it, it's all kludges; there really needs to be an official API for this.

    preload_unicore();
    print for keys %INC;

    sub preload_unicore {
        use File::Find::Rule;
        use Config ();
        my $privlib = $Config::Config{installprivlib} . '/';
        my @files = File::Find::Rule->file->name(qr/\.pl$/)->in( $privlib . 'unicore/' );
        tr{\\}{/} for $privlib, @files;
        s{^\Q$privlib\E}{} for @files;
        eval { require "$_" } for @files;
    }

      Thanks! Actually, my quick hack is holding up pretty well so far: the program has been running inside its chroot for a few days without errors, so maybe my first take managed to track down all the properties that were needed.

      I thought that maybe the -C0 switch would turn off Unicode altogether, which would be OK for this particular script, but for some reason it still doesn't. In the debugger I tracked it down to some data from the network being encoded as ASCII with Encode, which for some reason set the utf8 flag.
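      One way to check for and undo that flag, as a rough sketch only: decode_without_flag is a hypothetical helper name, and the request line is made-up sample data. Encode::decode, utf8::downgrade and utf8::is_utf8 are the real functions. Whether clearing the flag is enough to keep the regex engine away from the Unicode tables depends on the Perl version and the patterns involved.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode incoming bytes, then try to clear the
# internal UTF8 flag on the result.
sub decode_without_flag {
    my ($bytes, $encoding) = @_;
    my $text = Encode::decode($encoding, $bytes);
    # With a true second argument, utf8::downgrade returns false instead
    # of dying when the string holds characters above 0xFF; in that case
    # the string is left unchanged.
    utf8::downgrade($text, 1);
    return $text;
}

my $line = decode_without_flag("GET / HTTP/1.0", 'ascii');
printf "utf8 flag is %s\n", utf8::is_utf8($line) ? 'on' : 'off';
```

      For pure ASCII input the downgrade always succeeds, so the flag ends up off whatever Encode did with it.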

      I also found a module Unicode::Tussle that had some ability to load up all entities, but really it doesn't seem any less kludgey than the ideas you provide above.

      Thanks again!

