PerlMonks  

Reg Ex : an odd error...

by Foggy Bottoms (Monk)
on Aug 19, 2003 at 14:46 UTC

Foggy Bottoms has asked for the wisdom of the Perl Monks concerning the following question:

I'm running the following code and keep getting an annoying error; I have no idea what's causing it. Has anyone seen anything similar before? Regexes aren't my cup of tea and I reckon I miswrote the one I'm using, but as you'll notice in the example below, it works fine for the first few strings before failing...

Here's the code :
for (0..$#bannedF)
{
    return BANNED if ($folder eq $bannedF[$_]); # if folder is directly equal to a banned folder
    if ($mode == INC_SUBDIRS) # if a parent folder is banned then so are the subfolders unless otherwise stated
    {
        $folder =~ /^($bannedF[$_]).*/;
        print "$_ out of $#bannedF - $bannedF[$_] - regexp value : $1\n";
        return BANNED if ($1); # example : $folder = c:\temp\one; $bannedF[$_] = c:\temp
    }
}

And here's the value of the variables...
@bannedF = ("c:\\win32app\\toolkit", "c:\\winnt", "C:\\WINNT\\system32", "d:\\perl", "d:\\brossad");
my $folder = "c:\TEMP\sdfs";

Here's the output...
0 out of 4 - c:\win32app\toolkit - regexp value :
1 out of 4 - c:\winnt - regexp value :
2 out of 4 - C:\WINNT\system32 - regexp value :
Can't find unicode character property definition via main->e or e.pl at unicode/Is/e.pl line 0
Press any key to continue . . .

Thanks everyone for your help...

Happy the man who, like Ulysses, has made a fine voyage,
Or like the one who won the Fleece,
And then came home, full of experience and wisdom,
To live among his kin for the rest of his days!

J. du Bellay, Angevin poet

Edit by tye, change PRE to CODE around long lines

Replies are listed 'Best First'.
Re: Reg Ex : an odd error...
by Abigail-II (Bishop) on Aug 19, 2003 at 15:03 UTC
    @bannedF contains "d:\\perl". It's double-quoted, so you end up with d:\perl, which you pass to the regexp machine. The regexp machine notes the \p, which is special to it: it introduces a Unicode property, and the character after it (here e) is taken as the property name.

    If all you are looking for is an exact match (which I think you are), you're better off using index.
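
A minimal sketch of the index approach, assuming all that is needed is "does $folder begin with this banned path". The helper name starts_with and the sample paths are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $folder begins with the literal string $banned.
# index() returns 0 when $banned is found at the very start of $folder,
# and no regex is involved, so backslashes need no special treatment.
sub starts_with {
    my ($folder, $banned) = @_;
    return index($folder, $banned) == 0;
}

# Single-quoted, so '\\' is one backslash in the resulting string.
my @bannedF = ('c:\\winnt', 'd:\\perl');
my $folder  = 'd:\\perl\\lib';

for my $banned (@bannedF) {
    print "$folder is under $banned\n" if starts_with($folder, $banned);
}
```

Note that this comparison is case-sensitive and purely textual; it does nothing about case differences or trailing slashes.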

    Abigail

Re: Reg Ex : an odd error...
by bear0053 (Hermit) on Aug 19, 2003 at 15:00 UTC
    Change your code to the version below and the output becomes:
    0 out of 4 - c:/win32app/toolkit - regexp value :
    1 out of 4 - c:/winnt - regexp value :
    2 out of 4 - C:/WINNT/system32 - regexp value :
    3 out of 4 - d:/perl - regexp value :
    4 out of 4 - d:/brossad - regexp value :
    I single-quoted your directories and replaced your '\\' with / to avoid the Unicode error.
    my @bannedF = ('c:/win32app/toolkit', 'c:/winnt', 'C:/WINNT/system32', 'd:/perl', 'd:/brossad');
    my $folder = 'c:/TEMP/sdfs';
    for (0..$#bannedF)
    {
        return BANNED if ($folder eq $bannedF[$_]); # if folder is directly equal to a banned folder
        if ($mode == INC_SUBDIRS) # if a parent folder is banned then so are the subfolders unless otherwise stated
        {
            $folder =~ /^($bannedF[$_]).*/;
            print "$_ out of $#bannedF - $bannedF[$_] - regexp value : $1\n";
            return BANNED if ($1); # example : $folder = c:\temp\one; $bannedF[$_] = c:\temp
        }
    }

Re: Reg Ex : an odd error...
by ides (Deacon) on Aug 19, 2003 at 15:05 UTC

    I would suggest also cleaning up your code as follows:

    foreach my $banned ( @bannedF ) {
        return BANNED if ( $folder eq $banned );
        if( $mode == INC_SUBDIRS ) {
            $folder =~ /^($banned).*/;
            print "$_ out of $#bannedF - $banned - regex value: $1\n";
            return BANNED if ($1);
        }
    }
    I think it's much easier to read that way.

    -----------------------------------
    Frank Wiles <frank@wiles.org>
    http://frank.wiles.org

Re: Reg Ex : an odd error...
by Not_a_Number (Prior) on Aug 19, 2003 at 20:25 UTC

    If you've got it to work by now, may I point out a couple of possible problems (if you deliberately glossed over these issues in the interests of simplicity, I apologise in advance :).

    (NB: I assume, for the sake of argument, that you are prompting a user to enter (a full filepath to) a directory, and then letting them do something with this directory unless it is on the 'banned' list.)

    1. Windoze filenames are case-INsensitive. So as well as banning 'd:/perl' you need to ban 'D:/Perl', etc. This is of course easy to solve.

    2. You generally need to check for the user entering 'd:/perl/' and 'd:/perl\' (or 'd:perl\\' etc) as well as 'd:/perl'. This is easy to solve too.

    3. If there is any chance of you needing to ban the current directory (where the code is being executed) and/or its parent, you should check for the user entering '.' or '..' (best, probably, to put them in your 'banned' list).

    4. Concerning the following lines of your code:

    $folder =~ /^($bannedF[$_]).*/;
    return BANNED if ($1);

    I'm not sure what it does. If it is simplified to:

    return BANNED if $folder =~ /^($bannedF[$_]).*/;

    the problem becomes more apparent, namely that if you ban for example 'c:/win', you are also banning folders such as 'c:/wings', 'c:/winter_weather' etc. One way to solve this would be to ban anything that starts with 'c:/win/' (having stripped out spurious trailing slashes as under point 2 above).
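
Points 1, 2 and 4 can be combined into one rough sketch: normalise both paths (case, slash direction, trailing slashes) and then require a "/" right after the banned prefix. The helper names normalize_path and is_under are invented for this illustration, and the sketch makes no attempt to handle '.' or '..'.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase, use forward slashes, drop trailing slashes.
sub normalize_path {
    my ($path) = @_;
    $path = lc $path;
    $path =~ tr{\\}{/};    # backslashes become forward slashes
    $path =~ s{/+\z}{};    # strip any trailing slashes
    return $path;
}

# Hypothetical helper: true if $folder is $banned itself or lies below it.
sub is_under {
    my ($folder, $banned) = @_;
    ($folder, $banned) = map { normalize_path($_) } $folder, $banned;
    # Appending "/" to the banned path stops 'c:/win' from matching 'c:/wings'.
    return $folder eq $banned
        || index($folder, "$banned/") == 0;
}

print is_under('C:\\Win\\System', 'c:/win')  ? "banned\n" : "ok\n";
print is_under('c:/wings',        'c:/win/') ? "banned\n" : "ok\n";
```

With these two sample calls the first path counts as banned and the second does not, because 'c:/wings' is neither equal to 'c:/win' nor does it start with 'c:/win/'.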

    There are no doubt other 'security' problems than those listed above, and more checking of user input is certainly required. However, this is how I'd do it if security weren't a vital issue:

    hth

    dave

      Hi N_A_N,

      Thanks for all your ideas ! They've helped me quite a bit and I'm still working on the script...

      Here are a couple answers to your remarks :
      • First of all, I'm not prompting the user for a path: when a change occurs on the drive, a function returns the path where that change occurred...
      • Yes, I do need to check for capital letters. Actually, I noticed this when comparing 2 paths that were the same except for the drive letter: in one case I had c:\, in the other C:\. So I rewrote my if statement to if (uc($folder) eq uc($wantedF[$_])).
      • Furthermore, as for the "c:/temp" and the "c:\temp" paths, I also rewrote a bit of code that searches for \\ and swaps it for \/ :
        while ($folder =~ /\\/) { $folder =~ s/\\/\//; }
      • You ask whether the current folder can be banned : yes it can but it's irrelevant. Banning in this case means not considering any changes happening in that folder, hence ignoring it. This is not a security issue...
      • As for the regex, yeah I must admit it, I totally missed it - don't know what I was thinking... Your version is the one intended. Thanks !
      Thanks so much for your help and time - it's been quite useful to me...
      Here's my latest code, up to date, cleaned up and functional...

      sub compareFolder
      {
          my $folder = shift;
          $folder = uc ($folder);
          while ($folder =~ /\\/)
          {
              $folder =~ s/\\/\//; # transform regular c:\sample\path into c:/sample/path for later comparison
          }
          my $mode = shift;
          my @wantedF = @{+shift};
          my @bannedF = @{+shift};
          for (0..$#wantedF)
          {
              return ACCEPTED if ($folder eq uc($wantedF[$_])); # if folder is directly equal to a desired(forced) folder
          }
          for (0..$#bannedF)
          {
              return BANNED if ($folder eq uc($bannedF[$_])); # if folder is directly equal to a banned folder
              if ($mode == INC_SUBDIRS) # if a parent folder is banned then so are the subfolders unless otherwise stated
              {
                  my $bFolder = uc($bannedF[$_]);
                  return BANNED if ($folder =~ /(^$bFolder\/).*/); # example : $folder = c:\temp\one; $bannedF[$_] = c:\temp
              }
          }
          return ACCEPTED; # if execution reaches this point, then folder isn't affected by the forced/banned folder list
                           # hence it can be scanned.
      }

      Thanks for your help and insight...
Re: Reg Ex : an odd error...
by Jasper (Chaplain) on Aug 19, 2003 at 15:07 UTC
    You could look into the use of \Q and \E to avoid interpretation of the special characters in the regexp. That way there'd also be no need to have the double backslashes in the directories.
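
A small sketch of what that could look like, assuming the banned paths are held in single-quoted strings. The helper name has_prefix and the sample paths are hypothetical, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $folder starts with $banned taken literally, else 0.
sub has_prefix {
    my ($folder, $banned) = @_;
    # \Q...\E quotes every regex metacharacter in $banned,
    # so \p is no longer read as the start of a Unicode property.
    return $folder =~ /^\Q$banned\E/ ? 1 : 0;
}

# Single quotes keep each backslash exactly as typed.
my @bannedF = ('c:\winnt', 'd:\perl');
my $folder  = 'd:\perl\lib';

for my $banned (@bannedF) {
    print "$folder starts with $banned\n" if has_prefix($folder, $banned);
}
```

The built-in quotemeta function does the same escaping as \Q...\E when the pattern string is prepared ahead of time.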

    Jasper
Re: Reg Ex : an odd error...
by tcf22 (Priest) on Aug 19, 2003 at 15:06 UTC
    Not really sure why this error came up. But if I change
    @bannedF = ("c:\\win32app\\toolkit", "c:\\winnt", "C:\\WINNT\\system32", "d:\\perl", "d:\\brossad");
    to
    @bannedF = qw( c:/win32app/toolkit c:/winnt C:/WINNT/system32 d:/perl d:/brossad );
    it works.

    Also my $folder = "c:\TEMP\sdfs"; should be my $folder = "c:\\TEMP\\sdfs"; or my $folder = 'c:\TEMP\sdfs';

    Update: Looking at my Perl reference it looks like it is interpreting \p as a property.
Re: Reg Ex : an odd error...
by banduwgs (Beadle) on Aug 19, 2003 at 15:04 UTC
    I don't see any problem in the given piece of code. This is what I got by running it.
    0 out of 4 - c:\win32app\toolkit - regexp value : 
    1 out of 4 - c:\win32 - regexp value : 
    2 out of 4 - C:\WINNT\system32 - regexp value : 
    3 out of 4 - d:\perl - regexp value : 
    4 out of 4 - d:\brossad - regexp value : 
    
    Tool completed successfully
    
    The error might be due to some other code. - SB
