http://qs321.pair.com?node_id=516845

weilies has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a situation here. Is it possible to get the final URL of a redirect chain with a Perl command?
E.g.
Pg A (current) -> Pg B -> Pg C -> ... -> Pg D, where '->' means 'redirect'. Is there a Perl function that can retrieve the URL of Pg D from Pg A?
For example, I am currently on Pg A, and inside a.pl I write: print a_perl_func(Pg_B);
Browser Output : http://www.domain.com/d.pl
Meanwhile I am still in a.pl.
Can Perl do that? What I hope to get is the URL of the final landing page, no matter how many pages were redirected through. It sounds crazy to me, though. :~o
Thanks

Replies are listed 'Best First'.
Re: Is it possible to get the redirected URL?
by Tanktalus (Canon) on Dec 15, 2005 at 05:06 UTC

    It's a bit unclear to me what exactly you're trying to do. I'm going to hazard a guess, including some hand-waving, and you'll have to tell me if I'm on the right track.

    You have a CGI script running on a machine named "www.domain.com". The CGI script is "a.pl". Currently, "a.pl" returns a redirect to Pg B. After that, it doesn't currently know anything.

    Meanwhile, Pg B is really "b.pl", which issues a redirect to c.pl, which issues a redirect to d.pl.

    What you want to do is figure out where b.pl is really going to go, and just issue the redirect directly to the last place.

    If this is a correct guess on what you're looking for, my humble suggestion is "don't." As in, don't short-circuit this. It's incredibly error-prone in the general case, and mildly annoying in specific cases where there is a general-purpose tool available to do it for you (i.e., the browser).

    There are a number of ways to be redirected. The most obvious is a "page moved" response (an HTTP 3xx status with a Location header). There are also meta refresh tags, or even JavaScript. Do you want/need to handle all of these?

    Further, Pg B (or Pg C) may or may not be a script (you can't always tell: just because it ends with ".html" doesn't mean it's not dynamic). And they may do different redirections based on cookies the user has ("not authorised - go away"). Or they may do other processing ("User clicked on 'pg b' in a form that needs some other processing - send them to pg c so that it can be processed, then pg c figures out that there's stuff missing, so sends the user to an error page, page d"). There is a lot of context that you need to be aware of. Allowing the browser to do it means that you can really simplify the code and more easily ensure that everything is correct.

    That said, if all of this really lived inside the same set of Perl modules, you could simply call the other pages directly and let them figure out what to do. This would be relatively simple in CGI::Application, as it would, I'm sure, under a number of other systems. Just call the next function directly: you save a lot of processing (on both the client and the server) and the associated network overhead this way. But the key here is to call worker functions that are identical whether you're redirecting or just calling the next function in the series (except for a decision point on whether to redirect or call the next function). Again, this can simplify your workload as the developer while still saving that extra overhead for the server and the user.
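
    As a rough illustration of the "call the worker functions directly" idea, here is a minimal sketch. Everything in it is hypothetical: the page names, the %worker table, and final_page() are made up for the example and are not part of CGI::Application or any other framework. Each worker returns the name of the page it would have redirected to, or nothing if it would render itself.

```perl
use strict;
use warnings;

# Hypothetical workers, one per page. Each returns the name of the next
# page it would redirect to, or nothing when it is the final page.
my %worker = (
    b     => sub { return 'c' },
    c     => sub { my ($params) = @_; return $params->{id} ? 'd' : 'error' },
    d     => sub { return },
    error => sub { return },
);

# Walk the chain in-process instead of bouncing through the browser.
sub final_page {
    my ($start, $params) = @_;
    my ($page, %seen) = ($start);
    while (1) {
        die "redirect loop at $page\n" if $seen{$page}++;
        my $next = $worker{$page}->($params);
        return $page unless defined $next;   # nobody redirects further
        $page = $next;
    }
}

print final_page('b', { id => 42 }), "\n";
```

    The decision point then becomes a single place: either issue one redirect to the page final_page() names, or call that page's rendering code directly.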

      First of all, I would like to thank you guys for giving me advice. Sorry if I didn't make my problem clear. :)
      Actually, apart from Pg A, none of the pages are under my control; their code resides on another domain. So what I need to do is write a bot that sends out a request (a URL), like
      get_final_landing_pg(a_url)
      Their system will then do some processing based on my 'a_url', and it may redirect to different pages depending on the validity of my 'a_url'.
      To explain a_url: it is a URL that passes an id to their system. Their system decides what to do based on that id: it redirects to 'b_url' if my id is fine, and otherwise to some 'not_b_url' page that shows 'Expired'.
      Then I can do a comparison, because my system already knows the valid URL 'b_url' (it is predefined in my system), and I am still in control in my a.pl.
      Eg.
      if (my_predefined_url eq get_final_landing_pg(a_url)) { return "not expired" } else { return "expired" }
      There won't be any user interaction during the whole process, like "click A" to redirect to pg_A or
      "click B" to redirect to pg_B.

      Thanks for spending time reading my question. My broken English really causes you guys a lot of trouble
      :p
Re: Is it possible to get the redirected URL?
by Kanji (Parson) on Dec 15, 2005 at 05:21 UTC

    LWP (or WWW::Mechanize) would let you load Page B and automatically follow the HTTP redirects, with the final URL being available to you via the uri method ($mech->uri in WWW::Mechanize, or $response->request->uri with plain LWP).
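
    A minimal sketch of that approach with plain LWP::UserAgent follows. The helper name get_final_landing_pg is borrowed from the pseudocode earlier in the thread, final_url is a made-up helper, and the URLs in the usage comment are placeholders. It assumes every hop is an ordinary HTTP 3xx redirect; meta refresh and JavaScript redirects are not followed.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Return the URL of the last request in the chain that produced $response.
# After LWP has followed redirects, this is the final landing URL.
sub final_url {
    my ($response) = @_;
    return $response->request->uri->as_string;
}

# Hypothetical helper named after the pseudocode in the question.
sub get_final_landing_pg {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(max_redirect => 7, timeout => 10);
    my $response = $ua->get($url);      # follows HTTP redirects automatically
    return unless $response->is_success;
    return final_url($response);
}

# Usage (placeholder URLs, not real endpoints):
# my $landing = get_final_landing_pg('http://www.example.com/a_url?id=123');
# if (defined($landing) && $landing eq 'http://www.example.com/b_url') {
#     print "not expired\n";
# } else {
#     print "expired\n";
# }
```

    If the intermediate hops matter too, $response->redirects returns the list of redirect responses that led to the final one.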

    Where things get tricky is if any of the pages handle redirects via meta refresh or JavaScript.

    The former is pretty easy to code around, but the latter usually requires prior knowledge of what the JavaScript is/does so that you can explicitly tailor your code to handle it (and then hope no one changes it).
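
    One way the meta refresh case could be coded around is sketched below. The function name meta_refresh_target is made up for this example, and the regex-based matching is a deliberate simplification: it assumes a quoted content attribute and will miss unusual markup, so a real HTML parser such as HTML::TokeParser would be more robust.

```perl
use strict;
use warnings;

# Return the target URL of the first <meta http-equiv="refresh"> tag in
# $html, or nothing if there is none. Crude regex matching, sketch only.
sub meta_refresh_target {
    my ($html) = @_;
    while ($html =~ /<meta\b([^>]*)>/gi) {
        my $attrs = $1;
        next unless $attrs =~ /http-equiv\s*=\s*["']?refresh["']?/i;
        next unless $attrs =~ /content\s*=\s*(["'])(.*?)\1/is;
        my $content = $2;
        # content looks like "5; url=http://..."
        if ($content =~ /^\s*\d+\s*;\s*url\s*=\s*(.+?)\s*$/is) {
            my $target = $1;
            $target =~ s/^["']|["']$//g;    # strip stray inner quotes
            return $target;
        }
    }
    return;
}

my $page = '<html><head><meta http-equiv="refresh" '
         . 'content="0; url=http://www.domain.com/d.pl"></head></html>';
print meta_refresh_target($page), "\n";
```

    When the fetched page yields a target, the bot can request that URL in turn and repeat until no further refresh tag is found, ideally with a cap on the number of hops. A relative target would first need to be resolved against the page's own URL.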

    If you do expect a lot of JavaScript redirects, a better alternative might be to use either Win32::IE::Mechanize or Mozilla::Mechanize, as both browsers have JavaScript engines and should be able to follow those redirects automatically (but I've used neither module, so can't confirm this).

        --k.


Re: Is it possible to get the redirected URL?
by kulls (Hermit) on Dec 15, 2005 at 04:27 UTC
    I'm not sure,
    but you can keep track of all the visited pages via HTTP_REFERER and store them in the session. Then you can retrieve them whenever you want.
    -kulls