
Re: Extracting Page Name

by ww (Archbishop)
on Apr 26, 2012 at 23:16 UTC (#967486=note)

in reply to Extracting Page Name

The "something simple" may be that the page name follows the last slash.

The catch: it may be followed by any of several things -- a colon (Note_1) if a port number follows; a question mark, which has several possible uses; and perhaps others that I'm blanking on just now. But regardless, the entity from the last slash, through a period, to the next punctuation should be what you're looking for.

And to broaden the hint a bit further, the regex documentation and tutorials here will show you precisely the way to obtain what you're looking for.
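
A minimal sketch of that rule, written in Python for illustration (the same idea should carry over to a Perl regex). The function name `page_name` and the sample URLs are hypothetical, and the sketch assumes the page name is the last path segment that contains a period, ending at a question mark, a hash mark, or the end of the string:

```python
import re

def page_name(url):
    """Return the last path segment containing a period, or None."""
    # Assumption: anything from the first '?' or '#' onward is not part of the
    # page name, so cut it off before looking for the last slash.
    path = re.split(r"[?#]", url, maxsplit=1)[0]
    # Drop a leading "scheme://host:port" so the hostname (which also contains
    # periods) is never mistaken for a page name.
    path = re.sub(r"^[A-Za-z][A-Za-z0-9+.-]*://[^/]*", "", path)
    # Last slash, then name, period, extension, then end of the path.
    match = re.search(r"/([^/]+\.[^/]+)$", path)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(page_name("http://www.example.com/docs/index.html?lang=en"))
    print(page_name("http://www.example.com/docs/"))
```

Stripping the parameters before searching for the last slash matters because a parameter value may itself contain slashes.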

Update: Note_1: See the correction (++) by quester immediately below. Aargh.

Replies are listed 'Best First'.
Re^2: Extracting Page Name
by quester (Vicar) on Apr 27, 2012 at 06:34 UTC

    ... a colon, if it's followed by a port number...

    Minor nit: The colon and port number come just after the hostname in a URL, not after the page name. For example, consider the port 8080 in

    The question mark following the page name in a URL starts a list of parameters being passed from the browser to the script running in the server. The parameter values can be more or less anything; by convention spaces will have been replaced by plus signs, but otherwise almost anything goes, including colons. For example,
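
To illustrate both points with a made-up URL (the host, path, and parameters below are hypothetical, not taken from this thread), here is a small Python sketch using the standard `urllib.parse` module. It shows the port attached to the hostname, the page name as the last path segment, and a parameter value that contains a colon:

```python
from urllib.parse import urlsplit, parse_qs

def describe(url):
    """Split a URL into the pieces discussed above."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,                # hostname without the port
        "port": parts.port,                    # integer, or None if absent
        "page": parts.path.rsplit("/", 1)[-1], # text after the last slash of the path
        "params": parse_qs(parts.query),       # '+' is decoded to a space
    }

if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    info = describe("http://www.example.com:8080/cgi-bin/search.pl?q=perl+regex&time=12:30")
    for key, value in info.items():
        print(key, "=", value)
```

Letting a URL parser separate the query string first avoids having to guess which punctuation can legally appear in parameter values.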
