
Re: Problems with unicode properties in regular expressions under chroot (install)

by Anonymous Monk
on May 10, 2013 at 06:59 UTC (#1032902)

in reply to Problems with unicode properties in regular expressions under chroot

So my question is, is there a way to preload all unicode properties so that I don't have to worry about this?

Probably, but I wouldn't try to figure out what it is. I would install perl, the modules, and everything else you need under the chroot jail, so it works like a regular perl. See links for chroot setup

Or at least a way to get a clean failure when it can't find one of these properties, instead of mysterious misbehavior?

Well, AFAIK, even the buggy perl-5.10.x ought to give an error when a particular needed unicore file is missing, so you could try upgrading? Or writing a minimal testcase and submitting it using perlbug?
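For the "clean failure" part, one option is to check the properties yourself at startup. This is only a sketch: `check_properties` and the property list are made up for illustration, and I'm assuming that a property whose data can't be found makes the regex die at compile or match time, which the `eval` then catches.

```perl
use strict;
use warnings;

# Try to compile and use \p{NAME} for each given name.
# Returns the names that could not be used.
sub check_properties {
    my @missing;
    # Sample characters from several ranges, so a lazy lookup is actually triggered.
    my @samples = map { chr($_) } (0x41, 0xE9, 0x100, 0x4E2D);
    for my $name (@_) {
        my $ok = eval {
            my $pattern = '\p{' . $name . '}';
            my $re      = qr/$pattern/;
            my $hits    = 0;
            for my $ch (@samples) {
                $hits++ if $ch =~ $re;
            }
            1;
        };
        push @missing, $name unless $ok;
    }
    return @missing;
}

# Hypothetical list: use whatever properties your own patterns need.
my @missing = check_properties(qw(Print Alpha Digit));
die "Unicode properties unavailable: @missing\n" if @missing;
print "all properties available\n";
```

Run it before the chroot and it also works as a crude preload. Run it after the chroot and you find out whether anything is still being fetched from disk.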

Re^2: Problems with unicode properties in regular expressions under chroot (install)
by sgifford (Prior) on May 10, 2013 at 14:38 UTC

    Thanks for your thoughts!

    The problem with having a separate installation of Perl for every chroot program on your system is that maintainability becomes difficult. In particular, instead of relying on your distribution to let you know when there are Perl-related security updates available, now you need a way to track all of those installations for security updates yourself. In my experience the likelihood of getting that wrong outweighs the security advantages of using chroot to begin with.

    At any rate, most chroot programs don't require large installations of software systems and libraries to work. For example, many programs chroot into /var/empty, so they have access to nothing at all. They just make sure to load up everything they need beforehand.

    One of the reasons I like to use Perl is that generally I can follow this strategy: load all the resources up front, chroot into a minimal environment, then be confident that my security risks are minimized. This particular program has run that way for several years without any issues.

    Really, what I would like to do is find a way to load all of that unicode stuff up front, or else disable it for this program.
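Roughly, the ordering described above looks like this. It's a sketch, not the real program: `%RE`, `warm_regexes`, `enter_jail` and the `JAIL_DIR` environment variable are all hypothetical names, and whether one match per character range pulls in all of the property data probably depends on the perl version.

```perl
use strict;
use warnings;

# Hypothetical patterns standing in for whatever the real program matches.
my %RE = (
    printable => qr/\A\p{Print}+\z/,
    word      => qr/\p{Alpha}+/,
);

# Run every pattern against characters from several ranges, so that any
# lazily loaded property data is read while the full filesystem is visible.
sub warm_regexes {
    my @samples = map { chr($_) } (0x41, 0xE9, 0x100, 0x4E2D, 0x1F600);
    my $hits = 0;
    for my $re (values %RE) {
        for my $ch (@samples) {
            $hits++ if $ch =~ $re;
        }
    }
    return $hits;
}

# chroot needs root privileges; die loudly if it does not work.
sub enter_jail {
    my ($dir) = @_;
    chroot($dir) or die "chroot $dir failed: $!\n";
    chdir('/')   or die "chdir / failed: $!\n";
    return;
}

warm_regexes();                                         # 1. load everything up front
enter_jail($ENV{JAIL_DIR}) if defined $ENV{JAIL_DIR};   # 2. then lock the door

print "ok\n" if "caf\x{e9}" =~ $RE{word};               # 3. normal work, inside the jail
```

With `JAIL_DIR` unset the script only warms the patterns and carries on, which makes it easy to try without root.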

      Ok, let's see here :) Looking through my stuff I find "expand unicode property (eg \p{Print}) to regex character class range", so this seems to work

      $ perl -Mutf8 -le " utf8->SWASHNEW(q/Print/) ; print for %INC"

      A cleaner (no warnings) version seems to be

      $ perl -le " qr{\P{Print}}; print for %INC; "

      So you might grab perluniprops and qr// up a storm or File::Find and require up a storm

      Any way you look at it, it's all kludges -- there needs to be an official API for this

      preload_unicore();
      print for keys %INC;

      sub preload_unicore {
          use File::Find::Rule;
          use Config();
          my $privlib = $Config::Config{installprivlib}.'/';
          my @files = File::Find::Rule->file->name(qr/\.pl$/)->in( $privlib.'unicore/' );
          tr{\\}{/} for $privlib, @files;
          s{^\Q$privlib\E}{} for @files;
          eval { require "$_" } for @files;
      }
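File::Find::Rule is not a core module, so here is a sketch of the same kludge using only core File::Find. The name `preload_unicore_core` is made up, and I'm assuming the unicore directory sits under `installprivlib` as in the code above. Which files exist there differs between perl versions, so treat the returned count as informational only.

```perl
use strict;
use warnings;
use Config ();
use File::Find ();

# Require every .pl file under privlib/unicore; returns how many loaded.
sub preload_unicore_core {
    my $privlib = $Config::Config{installprivlib};
    $privlib =~ tr{\\}{/};
    my $dir = "$privlib/unicore";
    return 0 unless -d $dir;

    my @files;
    File::Find::find(
        {
            no_chdir => 1,    # with no_chdir, $_ holds the full path
            wanted   => sub { push @files, $File::Find::name if -f $_ && /\.pl\z/ },
        },
        $dir,
    );

    my $loaded = 0;
    for my $file (@files) {
        $file =~ tr{\\}{/};
        # Make the path relative to privlib, so %INC gets the usual "unicore/..." keys.
        (my $rel = $file) =~ s{^\Q$privlib\E/}{};
        $loaded++ if eval { require $rel; 1 };
    }
    return $loaded;
}

my $count = preload_unicore_core();
print "loaded $count unicore files\n";
```

Files that fail to load are skipped silently, same as in the original.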
        Thanks! Actually, so far my quick hack is holding up pretty well. The program has been running without errors inside its chroot for a few days, so maybe my first take managed to track down all the properties that were needed.

        I thought that maybe the -C0 switch would turn off Unicode altogether, which would be OK for this particular script, but for some reason it doesn't. In the debugger I tracked it down to some data from the network being encoded as ASCII with Encode, which for some reason set the utf8 flag.

        I also found a module Unicode::Tussle that had some ability to load up all entities, but really it doesn't seem any less kludgey than the ideas you provide above.

        Thanks again!

