http://qs321.pair.com?node_id=798571

Mak3r has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to identify the user agent from the User-Agent string (UAS, taken from an Apache header) using regular expressions kept in a MySQL database; if no regex matches, the request gets a crappy default. I'm seeing a lot of UAs fall through to that crappy default. But when I take the regex from the database and try it against one of the mismatched UAs in a noddy script containing just the regex and the UAS, it matches. The noddy script isn't a whole lot different from the real code, except that it uses a Perl string literal for the UAS. In "real life", with the UAS extracted from the header record, the match (sometimes) fails. Has anyone seen this before? And why does it happen?

Re: Regular Expressions not matching
by ELISHEVA (Prior) on Oct 01, 2009 at 10:34 UTC

    If your noddy (a.k.a. "play") script works and the real script does not, then your noddy script isn't doing the same thing as your real program. In software there is an ocean of difference between "isn't much different" and "isn't different". When it comes to failing to find bugs, most of the reasons are swimming in that ocean.

    The usual strategy for tracking down a bug like this is to start with the real code and strip it down bit by bit until you have nothing but the bare essentials but can reproduce the problem. In the process, you may find the reason for your bug, which is great. If not, you will have a nice focused example that you can post here. Without code that reproduces the bug, the most anyone can give you is general observations about what sometimes trips some people up when they work with regular expressions. That probably won't be very useful to you.
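
    As a rough sketch of where such a reduction might end up, the snippet below runs one pattern against a string typed as a literal and against a value standing in for what the real program extracted, and prints both with invisible characters made visible. Everything in it is hypothetical: the visible() helper, the pattern, and especially the stray carriage return, which is an invented difference for illustration and not a diagnosis of the original problem.

```perl
use strict;
use warnings;

# Hypothetical helper: show anything outside printable ASCII as \x{..}
# so that invisible differences between two strings become obvious.
sub visible {
    my ($text) = @_;
    (my $shown = $text) =~ s/([^\x20-\x7e])/sprintf '\\x{%02x}', ord $1/ge;
    return $shown;
}

# Hypothetical pattern, standing in for one row from the database.
my $pattern = qr{^Opera/[\d.]+\z};

# The string as typed into a test script ...
my $typed = 'Opera/9.80';
# ... and an invented example of what the real program might have read
# (assumption: a trailing carriage return survived the parsing step).
my $parsed = "Opera/9.80\r";

my $typed_ok  = $typed  =~ $pattern ? 1 : 0;
my $parsed_ok = $parsed =~ $pattern ? 1 : 0;

print 'typed:  [', visible($typed),  "] matched=$typed_ok\n";
print 'parsed: [', visible($parsed), "] matched=$parsed_ok\n";
```

    The point is only that the comparison uses the string exactly as the real code sees it, rather than a retyped copy.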

    As you strip down your code, one thing you might want to pay special attention to is failing regexen in previously executed lines. If a regex fails, the variables normally set by a match (such as $1 and $&) retain the values from the last successful match. They won't be set to undef. That sometimes trips people up. From perlre:

    NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.
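
    A minimal sketch of that behaviour, using a made-up pattern and made-up User-Agent strings (none of these come from the original poster's code):

```perl
use strict;
use warnings;

# Hypothetical pattern and User-Agent strings, for illustration only.
my $pattern = qr/(Firefox|Opera)/;
my $first   = 'Mozilla/5.0 (X11; Linux i686) Firefox/3.5.3';
my $second  = 'SomeUnknownBot/1.0';

# The first match succeeds and sets $1.
$first =~ $pattern;
my $first_unchecked = $1;

# The second match fails, but $1 still holds the capture from the
# earlier successful match in the same scope.
$second =~ $pattern;
my $second_unchecked = $1;

# Safer: use $1 only when the match itself returned true.
my $second_checked = $second =~ $pattern ? $1 : 'unknown';

print "unchecked: $first_unchecked, $second_unchecked\n";
print "checked:   $second_checked\n";
```

    Here the unchecked version reports the second string as Firefox even though the pattern never matched it, while the checked version falls back to 'unknown'.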

    GrandFather has some excellent tips for stripping down your code in his post "I know what I mean. Why don't you?".

    Best, beth

      Indeed. However, your reply did jolt me into testing my code with an integration test (passing a UA string and request through a GET) rather than a pure unit test. That would allay my fears. Cheers.
Re: Regular Expressions not matching
by ccn (Vicar) on Oct 01, 2009 at 09:45 UTC