PerlMonks  

Regular Expression Assistance

by Anonymous Monk
on Jun 10, 2002 at 12:23 UTC ( [id://173090]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

G'day Monks,

I need some help with a regular expression. I need to parse through a scalar value and extract from it the following:
/some/path/to/a/file[some numbers here...don't know how many, but probably at least ten].ext

An example of what I need to extract would be: /home/monks/thanksforhelping1234567890.ext

So, basically, I need to extract the path of a file from this scalar, so I can eventually open a filehandle. I'm pretty new to pattern matching, so I'm having lots of trouble. Can anyone help, please?

Replies are listed 'Best First'.
Re: Regular Expression Assistance
by dda (Friar) on Jun 10, 2002 at 12:41 UTC
    Simple way of extracting directory name:
    use File::Basename; $s = '/home/monks/thanksforhelping1234567890.ext'; print dirname $s;
Re: Regular Expression Assistance
by Bilbo (Pilgrim) on Jun 10, 2002 at 12:42 UTC
    How about this:
    my $root = "/home/monks/thanksforhelping";
    my $ext = ".ext";
    my ($filename) = $string =~ /(\Q$root\E \d+ \Q$ext\E)/x;
    Where $string is the string containing the filename to be extracted, $root is the known bit of the filename and $ext is the file extension. The regex matches $root followed by one or more digits followed by $ext. \Q...\E makes the interpolated text match literally, so the dot in $ext is not treated as a wildcard. The x on the end of the regex means that spaces in the pattern don't do anything (except make it more readable). $filename is left undefined if nothing matches.
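One thing worth checking when variables are interpolated into a pattern is how regex metacharacters inside them behave. Here is a small sketch comparing an unquoted and a \Q...\E-quoted pattern; the lookalike string is made up for illustration and is not from the original question.

```perl
use strict;
use warnings;

my $root = "/home/monks/thanksforhelping";
my $ext  = ".ext";

# Made-up test string: "Xext" is not the real extension.
my $lookalike = "/home/monks/thanksforhelping123Xext";

# Unquoted: the "." in $ext is a wildcard, so the lookalike matches.
my $loose = $lookalike =~ /$root \d+ $ext/x ? 1 : 0;

# Quoted with \Q...\E: the "." must be a literal dot, so it does not match.
my $strict = $lookalike =~ /\Q$root\E \d+ \Q$ext\E/x ? 1 : 0;

print "loose=$loose strict=$strict\n";
```

With real filenames the difference rarely bites, but quoting costs nothing and avoids surprises if $root ever contains characters like "+" or ".".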
Re: Regular Expression Assistance
by moxliukas (Curate) on Jun 10, 2002 at 12:47 UTC

    Well, a quick solution to this could be

    $_ =~ /(\/home\/monks\/thanksforhelping\d+\.ext)/

    However there are some points that I would like to make:

    • This will work only if the scalar contains the path you are searching for
    • If there are two or more paths like this in a scalar, this will match only the first one
    • I am not an expert in regexp either, so always take my advice with a grain of salt ;)
      This is what I thought it should look like. Unfortunately, when I tested it, I also got stuff that was on the same line as the path I wanted, but was not part of the path itself. For instance, I got:

      but you might try looking at /home/monks/thanksforhelping1234567890.ext for help

      Any suggestions on how to get rid of the excess stuff?

        I don't really know what is happening at your end, but this seems

        $_ = "but you might try looking at /home/monks/thanksforhelping1234567890.ext for help";
        $_ =~ /(\/home\/monks\/thanksforhelping\d+\.ext)/;
        print $1;

        to output:

        /home/monks/thanksforhelping1234567890.ext

        So check for typos... that could be your problem
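One guess at the cause (only a guess, since the failing code was not posted): the whole line was printed instead of the capture, or $1 was used after a match that failed. A minimal sketch that keeps only the captured path, with made-up sample lines standing in for the real input:

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the real input.
my @lines = (
    "nothing interesting here",
    "but you might try looking at /home/monks/thanksforhelping1234567890.ext for help",
);

my @found;
for my $line (@lines) {
    # Only use $1 when the match succeeded; otherwise it may hold a stale value.
    if ($line =~ m{(/home/monks/thanksforhelping\d+\.ext)}) {
        push @found, $1;    # the capture, not the whole $line
    }
}

print "$_\n" for @found;
```

Using m{...} instead of /.../ also means the slashes in the path no longer need backslashes.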

Re: Regular Expression Assistance
by insensate (Hermit) on Jun 10, 2002 at 12:54 UTC
    Most of the above will only help if the path stays the same throughout the scalar value...this will grab multiple paths...you can then push $pathtofile onto an array etc...
    while ($scalar =~ /((?:\/\w+)+\/file\d+\.ext)/g) { $pathtofile = $1; }

    -Jason
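As a sketch of the "push onto an array" idea: in list context a match with the /g modifier returns every capture at once, so the loop can be skipped. The input string and the @paths name are made up for illustration.

```perl
use strict;
use warnings;

# Made-up input containing two paths.
my $scalar = "see /home/monks/file123.ext and /tmp/other/file4567890.ext too";

# In list context, a /g match returns every captured substring.
my @paths = $scalar =~ m{((?:/\w+)+/file\d+\.ext)}g;

print "$_\n" for @paths;
```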
Re: Regular Expression Assistance
by Anonymous Monk on Jun 10, 2002 at 12:27 UTC
    One more thing I should specify: I know what the directory should be and I know the alphabetic part of the filename...the only part I don't know is the number part. So, in the example I gave, I know that the pattern I want to find has /home/monks/thanksforhelping and I know that it ends in .ext, but I don't know that it contains 1234567890...I only know that it contains some random string of numbers.
      I would do it using File::Basename to separate the path from the filename. This simplifies the task a little: you just have to extract the number from the filename:
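      #!/usr/bin/perl -w
      use strict;
      use File::Basename;

      while (<>) {
          chomp;
          # Suffix patterns are regexes, so the dot is escaped;
          # .ext is the extension used in the question.
          my ($name, $path, $suffix) = fileparse($_, qr/\.ext/);
          print "number: $1\n" if $name =~ /\D+(\d+)/;
      }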

      I hope this helps.

      marcos
      If you know the absolute filename and want to open that file, what prevents you from doing just that?

      _Ass_uming safe input data here:

      my $filename = $ARGV[0] or die "No filename given\n";
      open INPUT, "<", $filename or die "could not open $filename: $!\n";
      while (<INPUT>) { print $_ };
      close INPUT;
      Anyway:
      If the complete filename is in $filename you can do:
      my ($file_without_path) = $filename =~ /^.*\/(\w+\.ext)$/;

      and end up with something like: "file23432545335.ext"

      Still, I don't see the point if you trust the filename as being safe, and just want to open it. (Since you will need the path anyway).

      janx
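Putting the two steps from this thread together (pull the path out of the text, then open it) might look roughly like the sketch below. The helper name extract_path is hypothetical, and the pattern assumes the directory and filename prefix given in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first matching path, or undef.
sub extract_path {
    my ($text) = @_;
    my ($path) = $text =~ m{(/home/monks/thanksforhelping\d+\.ext)};
    return $path;
}

my $text = "but you might try looking at /home/monks/thanksforhelping1234567890.ext for help";
my $path = extract_path($text);

if (defined $path) {
    # Three-argument open with a lexical filehandle; $! says why it failed.
    if (open my $fh, '<', $path) {
        print while <$fh>;
        close $fh;
    }
    else {
        warn "could not open $path: $!\n";
    }
}
```

The open is wrapped in an if rather than "or die" only so the sketch keeps running when the example file does not exist on your machine.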

Re: Regular Expression Assistance
by Sifmole (Chaplain) on Jun 10, 2002 at 12:42 UTC
    $path =~ m/thanksforhelping([^.]+)\.ext/; $whatyouwant = $1;

Node Type: perlquestion [id://173090]
Approved by Albannach