PerlMonks  

Regex Improvement

by Lexicon (Chaplain)
on Jan 19, 2001 at 12:34 UTC ( [id://52953]=perlquestion )

Lexicon has asked for the wisdom of the Perl Monks concerning the following question:

This RegEx is meant to pull the working subdirectory out of a user-passed variable, and then do some stuff with it later on. This code works just fine as it is in my testing, but I'd like to know if the RegEx is reasonably robust or if it's gonna break for something stupid. The code will be run exclusively on Win32, but it will be a Japanese OS. Thanks!

$HomePath = $ENV{"WORKHOME"};
$Model = shift;
$Model =~ m/\Q$HomePath\E\\(.*)\\[.\w]+/i;
$WorkingDirectory = $1;
And for those even newer at this than I:
m/
    \Q          # Match exact string till \E
    $HomePath   # Start with path from Environment variable
    \E          # End Quote
    \\          # Backslash
    (.*)        # Working Directory: remember for later
    \\          # Backslash
    [.\w]+      # A filename (letters and period)
/i              # Case insensitive

-Lexicon

Replies are listed 'Best First'.
Re (tilly) 1: Regex Improvement
by tilly (Archbishop) on Jan 19, 2001 at 17:16 UTC
    Do you trust your user? If not then this article will give you some ideas about what can go wrong.

    Also as I0 alluded to, the period in the character class is a little more...generous...than you think.

    Perl on Win32 will accept "/" as a valid path delimiter.

    You may or may not wish to pass this freedom along to your users.

    Did you want to rule out the possibility that the working directory is the home directory itself? If not, then throw a ? after the first path delimiter.

    Likewise it is generally a bad idea to assume you had a match and then accept $1. Test whether the match succeeded. Else mistakes will turn into the assumption that the home directory is the working directory, or else the output of a previous match will be accepted instead.
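
    The points above can be put together in a minimal sketch. This is not code from the original reply: the helper name extract_working_dir and the sample paths are hypothetical, and the pattern is just one way to accept either delimiter, allow a file that sits directly in the home directory, and refuse to touch $1 unless the match succeeded.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the working subdirectory, '' when the file
# sits directly in the home directory, or undef when the path is not under
# the home directory at all.
sub extract_working_dir {
    my ($home, $model) = @_;

    # [\\\/] accepts either delimiter after the home path. The group around
    # the directory part is optional, so home\file still matches.
    if ($model =~ m/^\Q$home\E[\\\/](?:(.*)[\\\/])?[^\\\/]+\z/is) {
        return defined $1 ? $1 : '';
    }
    return undef;    # no match: never fall through to a stale $1
}

for my $path ('C:\work\proj\sub\model.dat', 'C:\work\model.dat', 'D:\else\model.dat') {
    my $dir = extract_working_dir('C:\work', $path);
    print "$path => ", (defined $dir ? "'$dir'" : 'not under home'), "\n";
}
```

    Returning undef on failure keeps "no match" distinct from "matched, but the file is in the home directory itself", which is the confusion the last paragraph warns about.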

Re: Regex Improvement
by knight (Friar) on Jan 19, 2001 at 18:23 UTC
    Since you used [.\w]+ to try to match the last file name component, the match will fail if the user supplied a string ending with the directory delimiter (\home\me\foo\bar\).

    This takes you beyond just fixing your regex, but if you use the File::Spec module, you can accommodate end-cases like this and get portability to boot. For your example, this might look like:
    use File::Spec;

    $pat = File::Spec->catfile($HomePath, '');
    if ($Model =~ m/^\Q$pat\E/i) {
        (undef,$WorkingDirectory,undef) = File::Spec->splitpath($');
    } else {
        # whatever to do if the supplied variable isn't under $HomePath
    }
    The initial File::Spec->catfile call appends a trailing '\' so that it doesn't show up as an initial '\' in the resulting $' that you pass to File::Spec->splitpath.
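
    A self-contained variation on the same idea, as a sketch only: it calls File::Spec::Win32 explicitly so it behaves the same on any OS, and it strips the home prefix and one delimiter in the regex itself instead of relying on catfile to supply the trailing '\'. The helper name working_directory and the sample paths are hypothetical.

```perl
use strict;
use warnings;
use File::Spec::Win32;    # explicit Win32 rules, independent of the host OS

# Hypothetical helper: returns the directory part of $model relative to
# $home (with its trailing delimiter, as splitpath returns it), or undef
# when $model is not under $home.
sub working_directory {
    my ($home, $model) = @_;

    # Remove the home path and one delimiter, keep the remainder in $1.
    return undef unless $model =~ m/^\Q$home\E[\\\/](.*)\z/is;

    my (undef, $dir, undef) = File::Spec::Win32->splitpath($1);
    return $dir;
}

# Note the doubled backslash before the closing quote in the second path:
# in a single-quoted string a lone \' would escape the quote.
for my $path ('C:\work\proj\sub\model.dat', 'C:\work\foo\bar\\') {
    my $dir = working_directory('C:\work', $path);
    print "$path => ", (defined $dir ? $dir : 'not under home'), "\n";
}
```

    Because splitpath treats everything up to the last delimiter as the directory, a path that ends in a delimiter is handled without a special case.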
Re: Regex Improvement
by dws (Chancellor) on Jan 19, 2001 at 12:41 UTC
    Nice of you to provide a commented version of the regexp. Readers new to such a format should note that it requires a /x modifier.
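
    For illustration, here is one way the commented pattern might look with /x added. The sample values are hypothetical stand-ins for the environment variable and the user's argument. One assumption worth flagging: \Q$HomePath\E is kept together on a single line here, because \Q quotes everything up to \E, including any whitespace and # characters that fall between them.

```perl
use strict;
use warnings;

my $HomePath = 'C:\work';                      # stand-in for $ENV{"WORKHOME"}
my $Model    = 'C:\work\proj\sub\model.dat';   # stand-in for the user's argument
my $WorkingDirectory;

if ($Model =~ m/
        \Q$HomePath\E   # home path, matched literally
        \\              # backslash
        (.*)            # working directory: remember for later
        \\              # backslash
        [.\w]+          # a filename (letters, digits, underscore, period)
    /ix) {
    $WorkingDirectory = $1;
    print "$WorkingDirectory\n";
}
```

    Comments inside an /x pattern should avoid the pattern delimiter and anything that looks like a variable, since both are still seen by the parser.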
Re: Regex Improvement
by I0 (Priest) on Jan 19, 2001 at 15:36 UTC
    Is the [.\w]+ necessary?
    Can your filename ever contain any other characters?
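
    A small sketch of why the question matters, using a hypothetical file name that starts with a hyphen. With the original class the match still succeeds, but it settles on an earlier backslash, so the captured directory is silently too short. The alternative shown (anything but a backslash, anchored at the end of the string) is only one possible replacement, not something proposed in the thread.

```perl
use strict;
use warnings;

my $home  = 'C:\work';
my $model = 'C:\work\a\b\-draft.txt';   # hypothetical file name starting with a hyphen

# Original class: the match succeeds but stops at an earlier backslash.
my ($with_class) = $model =~ m/\Q$home\E\\(.*)\\[.\w]+/i;

# Alternative: anything except a backslash, anchored at the end of the string.
my ($anchored) = $model =~ m/\Q$home\E\\(.*)\\[^\\]+\z/i;

print "class:    $with_class\n";   # prints "class:    a"
print "anchored: $anchored\n";     # prints "anchored: a\b"
```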
