Re: Get non transformed XML

by erroneousBollock (Curate)
on Nov 22, 2007 at 08:49 UTC

in reply to Get non transformed XML

Is there a way to use LWP::Simple to get the source of an XML document without the XSL transformation?
I doubt LWP::Simple has anything to do with the XSL transformation of an XML document served by a webserver.

If I go to the site and hit view source, I can see the XML with no problem.
My intuition is that the webserver is inspecting the client's User-Agent string, has determined that your client (LWP::Simple) can't apply the stylesheet itself, and so is doing the transformation server-side for you.

Try using LWP::UserAgent and:


$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/2006120418 Firefox/');
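For context, here is a minimal sketch of how that call might fit into a complete script. The helper names build_ua and fetch_raw, and the agent string 'Mozilla/5.0 (compatible; ExampleBrowser/1.0)', are placeholders invented for illustration; substitute whatever your own browser sends. Whether the server then returns the untransformed XML depends entirely on how that server chooses its response.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that announces itself with the given User-Agent string.
# (build_ua is a hypothetical helper name, not part of LWP.)
sub build_ua {
    my ($agent_string) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->agent($agent_string);
    return $ua;
}

# Fetch $url and return the response body as received;
# die with the HTTP status line on failure.
sub fetch_raw {
    my ($ua, $url) = @_;
    my $response = $ua->get($url);
    die $response->status_line unless $response->is_success;
    return $response->content;
}

# Only hit the network when a URL is given on the command line.
if (@ARGV) {
    # Placeholder agent string (an assumption); paste your browser's own.
    my $ua = build_ua('Mozilla/5.0 (compatible; ExampleBrowser/1.0)');
    print fetch_raw($ua, $ARGV[0]);
}
```

Saved under a name of your choosing, this would be run with the document's URL as its only argument.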

Update: fixed agent string, thanks Gangabass.


Re^2: Get non transformed XML
by Danikar (Novice) on Nov 22, 2007 at 08:57 UTC
    I just tried the code below and received the same thing =(
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    my $response = $ua->get('');
    if ($response->is_success) {
        print $response->content;  # or whatever
    }
    else {
        die $response->status_line;
    }

      I think this is not enough.

      Try this UserAgent:

      $ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: +) Gecko/2006120418 Firefox/');

      If this does not help, then try the User-Agent string that your browser sends to the target site (you can see it with HTTP::Proxy).

        That worked!

        Thanks a lot.

      Firefox DownThemAll addon retrieves 183 bytes.
      wget retrieves 23k.

      I think it's safe to say it's some sort of header :-)

      Update: fixed in first reply.


