PerlMonks  

Re: Re: Re: How to identify invalid reg. expr.?

by mephit (Scribe)
on Jun 05, 2002 at 19:55 UTC ( [id://171974]=note )


in reply to Re: Re: How to identify invalid reg. expr.?
in thread How to identify invalid reg. expr.?

Eep! I hadn't even thought of that. I just ran that in the browser, and got my "invalid regex" warning. After examining that bit, it looks like it *should* dump core, but it doesn't. Maybe something in my system (apache configuration, security configuration, quotas, something like that) is preventing the core from being dumped.

I just ran the script through the debugger, and it turns out that $@ contains the following:

    104:        if ($@) {
      DB<2> x $@
    0  '/(?{ dump })/: Eval-group not allowed at runtime, use re \'eval\' at dbsearch.pl line 103.
    '
      DB<3>

I have no idea what this means. Like I said in my earlier post, I don't know the finer points of using eval, or what's causing it to not dump core. Anyway, how can I make this safer? (I plan to post the entire script for a review one of these years, after I tweak one or two more things, and find a place to host the script.)
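
For readers puzzled by the same message: Perl refuses to compile a (?{ ... }) code block that arrives through a variable interpolated at run time unless the surrounding code says use re 'eval'. That is why the dump never ran and the error ended up in $@ instead. Below is a minimal sketch of checking a user-supplied pattern by compiling it inside eval. The helper name compile_user_regex is made up for illustration and is not taken from dbsearch.pl.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): try to compile a
# user-supplied pattern. Returns (compiled_regex, undef) on success or
# (undef, error_text) on failure.
sub compile_user_regex {
    my ($pattern) = @_;

    # qr// compiles the pattern now; a syntax error or a forbidden
    # (?{ ... }) block makes it die, and eval traps that in $@.
    my $re = eval { qr/$pattern/ };

    return ($re, undef) if defined $re;
    return (undef, $@);
}

# Usage: report the problem instead of letting the script die.
my ($re, $err) = compile_user_regex('(?{ dump })');
print defined $re ? "pattern is OK\n" : "invalid regex: $err";
```

This only shows that the pattern compiles. It says nothing about how long a match might take, so it does not protect against slow patterns.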

--

There are 10 kinds of people -- those that understand binary, and those that don't.

Replies are listed 'Best First'.
Re: Re: Re: Re: How to identify invalid reg. expr.?
by samtregar (Abbot) on Jun 05, 2002 at 20:11 UTC
    Interesting. I didn't know you couldn't eval code in a regex at run-time. Well, even without that I could still hog your CPU by passing it a regex with an exponential solving time.

    As far as what you can do - don't accept a regex from an untrusted user. I don't believe there's any way to fully validate the friendliness of a regex. Maybe you could offer your users a set of pre-canned searches: "full-word search", "phrase search", "starts with", "ends with", etc. Then use the input to build the appropriate regex with \Q$term\E to quarantine the input.
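
A rough sketch of that pre-canned idea, assuming the form sends a search mode plus a plain-text term. The mode names and the function build_search_regex are invented here for illustration. \Q...\E escapes every regex metacharacter in the term, so the user only ever supplies literal text.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex for one of a fixed set of search
# modes. The user's term is always wrapped in \Q...\E, so characters
# such as . * ( and ? match themselves only.
sub build_search_regex {
    my ($mode, $term) = @_;

    my %builders = (
        full_word   => sub { qr/\b\Q$term\E\b/ },
        phrase      => sub { qr/\Q$term\E/ },
        starts_with => sub { qr/^\Q$term\E/ },
        ends_with   => sub { qr/\Q$term\E\z/ },
    );

    my $builder = $builders{$mode}
        or die "Unknown search mode: $mode\n";
    return $builder->();
}

# Usage: the dot in the term is literal, so 'abc' does not match.
my $re = build_search_regex('phrase', 'a.c');
print "literal match\n" if 'a.c' =~ $re;
print "no match\n"      if 'abc' !~ $re;
```

Because the pattern's structure is fixed by the program and only the literal term varies, there is no way to pass in nested quantifiers or code blocks.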

    -sam

      I could still hog your CPU by passing it a regex with an exponential solving time.

      Could alarm be used to mitigate against this?

      Smylers

        Hmm, I'd never heard of alarm before now, so I just threw a little something together to test:
        #!/usr/bin/perl -w
        BEGIN {
            # stuff
        }
        alarm 10;
        # the bulk of the program snipped.
        $_ = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
        print "Match\n<p>" if /a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*[b]/;

        That example is given in the camel book as one that won't finish until after the heat death of the universe. Without the alarm, it does indeed hog the CPU, but with the alarm in there, the script just dies after 10 seconds, which should be plenty long enough for any "valid" request to run. That can be changed, of course. I guess I'd also need to put in a signal handler to do any cleanup when the alarm goes off. Hmm, and I suppose I should also try putting the alarm in the "search_by_regex" routine (since that's the only area I'm concerned about), but possibly look for a way to not "die" if the delay is from a slow database or something other than a wonky regex.
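
One possible shape for scoping the alarm to a single routine, following the eval/alarm idiom from the alarm entry in perlfunc. The helper name run_with_timeout is hypothetical, and the code being timed is passed in as a code reference rather than being the real search_by_regex. One caveat to check on your own Perl: since 5.8, signal handlers installed through %SIG are deferred until the current opcode finishes (see "Deferred Signals" in perlipc), so a handler like this may not interrupt a single long-running regex match, whereas the bare alarm 10 with no handler still kills the whole process.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with a time limit.
# Returns (1, $result) on success, (0, 'timeout') if the alarm fired,
# or (0, "error: ...") if $code died for some other reason.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $result;

    my $ok = eval {
        # The trailing "\n" stops Perl appending "at ... line N",
        # which lets us compare the message exactly below.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;    # cancel the pending alarm on success
        1;
    };
    my $err = $@;
    alarm 0;        # also cancel it if $code died early

    return (1, $result) if $ok;
    return (0, 'timeout') if $err eq "timeout\n";
    return (0, "error: $err");
}

# Usage: a quick job finishes normally and the script keeps running.
my ($ok, $value) = run_with_timeout(10, sub { 'abc' =~ /b/ ? 'hit' : 'miss' });
print $ok ? "result: $value\n" : "search failed: $value\n";
```

Keeping the timeout separate from other failures means a slow database can be reported differently from a runaway pattern, and any cleanup can happen in the caller instead of in the signal handler.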

        I also implemented merlyn's eval/regex solution from this node in this thread (or just scroll down, maybe).

        So, is this a viable solution, or are there still problems with it? Thanks.

        --

        There are 10 kinds of people -- those that understand binary, and those that don't.
