A better way? Extracting filename from url

Gerard has asked for the wisdom of the Perl Monks concerning the following question:

Re: A better way? Extracting filename from url by gav^ (Curate) on Jan 14, 2002 at 19:38 UTC
What's wrong with using the nice URI module? `use URI; while (<DATA>) { my $file = (URI->new($_)->path_segments)[-1]; print $file, "\n"; } __DATA__ http://www.spied.co.nz/thinmailer.cgi http://www.site.com/dir/file.html?param1=val1&param2=val2` gav^
Re: Re: A better way? Extracting filename from url by theorbtwo (Prior) on Jan 14, 2002 at 21:01 UTC
This sounds like the best way to me. Don't forget to uri_unescape() (or whatever it's called; I don't quite recall) the filename part to deal with stuff like "file%20name.htm". Thanks, James Mastros, Just Another Perl Scribe
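To illustrate the unescaping step mentioned above: URI::Escape does export a function named uri_unescape(). The sketch below avoids non-core modules and uses the common substitution idiom instead, so `decode_percent` is a hypothetical helper of mine, not something from the thread. If I read the URI docs correctly, path_segments in list context already returns unescaped segments, so the extra step matters mainly when the name is cut out of the raw string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): decode %XX escapes in a string.
# It stands in for URI::Escape's uri_unescape() so the sketch stays core-only.
sub decode_percent {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $url = 'http://www.site.com/dir/file%20name.htm';

# Take everything after the last slash; it is still percent-encoded here.
my ($raw) = $url =~ m{([^/]*)$};

my $file = decode_percent($raw);
print "$file\n";
```

Decoding after the split, rather than before, keeps an encoded slash (%2F) inside a name from being mistaken for a path separator.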
Re: A better way? Extracting filename from url by Parham (Friar) on Jan 14, 2002 at 16:28 UTC
I use these with local hard drive filenames, but they should work pretty much the same for URLs. Method 1: `$file = substr $url, rindex($url, '/') + 1; # $url contains the full URL` Method 2: `$url =~ s/^.*\///; # strips everything up to the last slash, leaving the filename in $url` Method 3: `($file) = $url =~ m!([^/]+)$!; # another regex way`
Re: Re: A better way? Extracting filename from url by moodster (Hermit) on Jan 14, 2002 at 17:01 UTC
It works, but I guess you've also got to consider the possibility that the URL will contain query parameters, like this: `http://www.site.com/dir/file.html?param1=val1&param2=val2` So a second substitution (I'm sure you can do it with a lookahead assertion or similar, but that seems a bit of overkill) is probably in order: `$file =~ s/\?.*//;` Cheers, -- moodster
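A minimal sketch of the two steps together, assuming it is acceptable to strip the query before looking for the last slash (so a slash inside the query cannot confuse rindex). The helper name `filename_from_url` is hypothetical, and dropping the `#fragment` as well is my own addition, not something the posts above asked for.

```perl
use strict;
use warnings;

# Hypothetical helper: rindex/substr extraction plus query stripping.
sub filename_from_url {
    my ($url) = @_;
    $url =~ s/[?#].*//s;                        # drop query string and fragment first
    return substr $url, rindex($url, '/') + 1;  # text after the last slash
}

for my $url (
    'http://www.spied.co.nz/thinmailer.cgi',
    'http://www.site.com/dir/file.html?param1=val1&param2=val2',
    'http://www.site.com/dir/file.html?next=/other/page',
) {
    print filename_from_url($url), "\n";
}
```

Note that a URL with no path at all, such as `http://www.site.com`, would come back as the host name, so a real script probably wants to check for that case.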
Re: Re: Re: A better way? Extracting filename from url by Gerard (Pilgrim) on Jan 15, 2002 at 02:23 UTC
Moodster has a good point. Thanks all for your comments. To be honest, I had not even considered the possibility of query parameters, which is very silly of me. But hey, it was late. Anyway, I can now look at this again a bit later on. I am constantly amazed and impressed with the good nature and high value of comments that come out of this site. Regards, Gerard The caffeine addict (now sufficiently supplied).
Re: A better way? Extracting filename from url by arhuman (Vicar) on Jan 14, 2002 at 16:25 UTC
Did you try basename (from File::Basename)? It's not the usual way (regexes or URI::... modules), but it seems to work. (It needs some testing, though...) "Only Bad Coders Code Badly In Perl" (OBC2BIP)
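For what it's worth, here is a small sketch of the basename idea using the core File::Basename module. Stripping the query and fragment before the call is an assumption of mine; basename knows nothing about URLs and would otherwise return the query string as part of the name.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $url = 'http://www.site.com/dir/file.html?param1=val1&param2=val2';

# Assumption: remove the query string and fragment first, since basename
# treats its argument as a plain file path.
(my $path = $url) =~ s/[?#].*//s;

my $file = basename($path);
print "$file\n";
```

One thing to test for: basename follows the shell utility and ignores a trailing slash, so `http://www.site.com/dir/` gives `dir` rather than an empty name.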
Re: A better way? Extracting filename from url by flocto (Pilgrim) on Jan 14, 2002 at 18:28 UTC
An easy way to do this is with a simple regexp: `my ($file) = $url =~ /\/([^\/]*?)(?:\?|$)/;` This will work with host:port, directories and parameters given in the url. Hope this works for you.. -octo- -- GED/CC d-- s:- a--- C++(+++) UL+++ P++++$ L++>++++ E--- W+++@ N o? K? w-- O- M-(+) V? !PS !PE !Y PGP+(++) t-- 5 X+ R+(+++) tv+(++) b++@ DI+() D+ G++ e->+++ h!++ r+(++) y+
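A quick way to try this kind of pattern against a couple of URLs, written with `m{}` delimiters so the slashes need no escaping. The test URLs (including the port number) are made up for illustration.

```perl
use strict;
use warnings;

# Pattern idea: a slash, then the shortest run of non-slash characters
# that is followed by '?' or by the end of the string.
my @files;
for my $url (
    'http://www.spied.co.nz:8080/thinmailer.cgi',
    'http://www.site.com/dir/file.html?param1=val1&param2=val2',
) {
    my ($file) = $url =~ m{/([^/]*?)(?:\?|$)};
    push @files, $file;
    print "$file\n";
}
```

A URL with no path, such as `http://www.site.com`, matches on the host name, so that case may need a separate check.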
Re: A better way? Extracting filename from url by Caillte (Friar) on Jan 14, 2002 at 18:03 UTC
One way would be: `$subject = "http://www.spied.co.nz/thinmailer.cgi"; $subject =~ s/.*\/([^\/]*)$/$1/; print $subject;` What this does is keep the last block of text that does not contain a forward slash, from the last slash to the end of the line. This page is intentionally left justified.
Re: A better way? Extracting filename from url by ropey (Hermit) on Jan 14, 2002 at 19:10 UTC
`my $url = 'http://www.test.com/test/fragr/asas.htm'; $url =~ m/.*\/(\D*?\.\D*?)$/;` should do it (the filename ends up in `$1`).
Re: A better way? Extracting filename from url by archen (Pilgrim) on Jan 14, 2002 at 19:38 UTC
I've made a few programs which deal with filenames of URLs and I take the same approach as you, although I use a slightly different method. To me it's more important that you use what you're comfortable with, since you're the one who will probably be reading the code later. `$subject = "http://www.spied.co.nz/thinmailer.cgi"; $filename = (split(/\//, $subject))[-1]; print $filename;`

