PerlMonks  

regex a url

by andrew (Acolyte)
on Oct 06, 2002 at 02:53 UTC ( [id://203123]=perlquestion )

andrew has asked for the wisdom of the Perl Monks concerning the following question:

All I want from this URL
http://www.merchandisemarket.net/shop/s_images/aaa_small.gif

is the filename. Can someone turn this into a regex for me? Thanks!

-Andrew

Replies are listed 'Best First'.
Re: regex a url
by count0 (Friar) on Oct 06, 2002 at 03:39 UTC
    An alternative to directly using a regexp is to use the URI::URL (or even just the URI) module. This module is especially handy when dealing with and manipulating URLs.
    use URI::URL;
    my $url = URI::URL->new('http://www.merchandisemarket.net/shop/s_images/aaa_small.gif');
    my ($filename) = ($url->path_segments)[-1];
Re: regex a url
by joe++ (Friar) on Oct 06, 2002 at 08:02 UTC
    Shamelessly stolen from perldoc URI:

    PARSING URIs WITH REGEXP

    As an alternative to this module, the following (official) regular expression can be used to decode a URI:

    my($scheme, $authority, $path, $query, $fragment) =
        $uri =~ m|^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|;

    --
    Cheers, Joe
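To get from that regex to what the original poster asked for, one more step is needed: take the last segment of `$path`. Here is a minimal sketch using only core Perl; `last_segment` is a hypothetical helper name, not something from the URI distribution.

```perl
use strict;
use warnings;

# Sketch: split a URI with the regex quoted from perldoc URI, then keep
# whatever follows the last slash of the path. last_segment() is a
# made-up name for illustration.
sub last_segment {
    my ($uri) = @_;
    my ($scheme, $authority, $path, $query, $fragment) =
        $uri =~ m|^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|;
    return undef unless defined $path;

    # Everything after the final slash; empty if the path ends in a slash.
    my ($segment) = $path =~ m{([^/]*)$};
    return $segment;
}

print last_segment('http://www.merchandisemarket.net/shop/s_images/aaa_small.gif'), "\n";
```

Because the query string and fragment are captured separately, a URL such as `http://example.com/a/b.gif?x=1#top` (an invented example) should still yield `b.gif`.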

Re: regex a url
by Abigail-II (Bishop) on Oct 07, 2002 at 11:39 UTC
    You don't understand the concept of URLs. URLs in general DO NOT map to filenames. If there's a relation between a particular URL and a file somewhere on disk, then that's purely a site policy. Without knowing the site policy of www.merchandisemarket.net, you cannot know whether that URL is mapped to a file, and if so, which file.

    Abigail
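To make that point concrete, here is a small sketch that pulls the last path segment out of a few URLs. The `example.com` URLs and the helper name `last_path_segment` are invented for illustration; only the first URL comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: the text after the last slash, ignoring any
# query string or fragment.
sub last_path_segment {
    my ($url) = @_;
    (my $no_query = $url) =~ s/[?#].*//s;     # drop ?query and #fragment
    my ($segment) = $no_query =~ m{([^/]*)$}; # text after the final slash
    return $segment;
}

for my $url (
    'http://www.merchandisemarket.net/shop/s_images/aaa_small.gif',
    'http://example.com/shop/',         # directory-style URL: empty segment
    'http://example.com/image?id=42',   # content selected by the query string
) {
    printf "%-62s => '%s'\n", $url, last_path_segment($url);
}
```

Only the first URL yields something that looks like a filename, and even there the server is free to generate the response from anything it likes.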

Re: regex a url
by kelan (Deacon) on Oct 06, 2002 at 15:08 UTC
    You can do this without a regex quite easily. How about this:
    $uri = 'http://www.merchandisemarket.net/shop/s_images/aaa_small.gif';
    $i = rindex($uri, '/') + 1;                     # Starting at the character one after the last slash,
    $filename = substr($uri, $i, length($uri)-$i);  # get a substring with the rest of the string

    kelan
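A slightly shorter variant of the same idea, as a sketch: `substr` with two arguments already runs to the end of the string, so the length arithmetic can be dropped. `after_last_slash` is a hypothetical name.

```perl
use strict;
use warnings;

# Sketch of the rindex/substr approach wrapped in a sub.
sub after_last_slash {
    my ($uri) = @_;
    my $i = rindex($uri, '/') + 1;   # rindex returns -1 when there is no slash, so $i becomes 0
    return substr($uri, $i);         # two-argument substr returns the rest of the string
}

print after_last_slash('http://www.merchandisemarket.net/shop/s_images/aaa_small.gif'), "\n";
```

Note that this looks at the whole string, so a query string containing a slash would throw it off; it assumes the URL has no query part.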



Re: regex a url
by Aristotle (Chancellor) on Oct 06, 2002 at 12:16 UTC
    print $1 if m{/([^/?]+)(?:\?.*)?$};
    But it is better to use joe++'s code and then
    print $1 if $path =~ m{/([^/]+)$};

    Makeshifts last the longest.

Re: regex a url
by Kage (Scribe) on Oct 06, 2002 at 02:57 UTC
    Well, one way, a this-is-what-I-would-do-because-I-am-lazy way, is
    $bar = "http://www.merchandisemarket.net/shop/s_images/aaa_small.gif";
    @foo = split(/\//, $bar);
    $foocount = @foo;
    $filename = $foo[$foocount];

    Who's a jigga what?
      No! It isn't.
      my @foo = qw[ a b c d e f ];
      my $foo_count = @foo;              # $foo_count is 6
      my $foo_last  = $foo[$foo_count];  # There is no $foo[6] so undef
      my $real_last = $foo[-1];          # Always works
      A lazier way is:
      (split(/\//, $bar))[-1]

      I like that, but I like this even more. There's general uncertainty about what constitutes the filename in a URL, so there isn't any 100% reliable way to write this. This is a case where you just have to know your data. Anyhow, to borrow from Kage, I'm just grabbing those elements that look filename-ish.

      $bar = "http://www.merchandisemarket.net/shop/s_images/aaa_small.gif";
      @foo = grep /\w\.\w{3,4}$/, split m|/|, $bar;

      __SIG__
      printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE
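One thing worth checking with the grep approach is what "filename-ish" actually matches. A quick sketch with the URL from the question, run under strict:

```perl
use strict;
use warnings;

my $bar = "http://www.merchandisemarket.net/shop/s_images/aaa_small.gif";

# Keep every slash-separated piece that ends in a dot plus 3 or 4 word characters.
my @foo = grep { /\w\.\w{3,4}$/ } split m|/|, $bar;

print "$_\n" for @foo;
# prints:
#   www.merchandisemarket.net
#   aaa_small.gif

# The hostname ends in ".net", so it matches too; take the last match.
my $filename = $foo[-1];
print "filename: $filename\n";
```

So `@foo` ends up with two elements here, and taking `$foo[-1]` (or skipping the first three pieces of the split) is still needed to get just the filename. As the post says, you have to know your data.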

Re: regex a url
by jordanh (Chaplain) on Oct 06, 2002 at 16:39 UTC

    Sure! Here it is in one line:

    my ($filename) = "http://www.merchandisemarket.net/shop/s_images/aaa_small.gif" =~ m{http://www.merchandisemarket.net/shop/s_images/(aaa_small.gif)};

    Hey! It does do what he was asking for!

    Hope this helps!

    :-)
