http://qs321.pair.com?node_id=652327


in reply to Get non transformed XML

Is there a way to use LWP::Simple to get the source of an XML document without the XSL transformation?
I doubt LWP::Simple has anything to do with the XSL transformation of an XML document served by a webserver.

If I go to the site and hit view source, I can see the XML with no problem.
My intuition is that the webserver is checking the browser "agent" string and has decided that your client (LWP::Simple) can't apply the stylesheet itself, so the webserver is doing the transformation server-side for you.

Try using LWP::UserAgent with a full browser agent string (a bare 'Mozilla/5.0' is not enough):

$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

Update: fixed agent string, thanks Gangabass.

-David

Re^2: Get non transformed XML
by Danikar (Novice) on Nov 22, 2007 at 08:57 UTC
    I just tried the code below and received the same thing =(
    require LWP::UserAgent;
    
    my $ua = LWP::UserAgent->new;
    $ua->timeout(10);
    $ua->env_proxy;
    $ua->agent('Mozilla/5.0');
    
    my $response = $ua->get('http://www.wowarmory.com/');
    
    if ($response->is_success) 
    {
    	print $response->content;  # or whatever
    }
    else 
    {
    	die $response->status_line;
    }

      I think this is not enough.

      Try this User-Agent string:

      $ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

      If this does not help, then try the User-Agent string that your browser sends to the target site (you can see it with HTTP::Proxy).
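
It can also help to check what LWP itself is about to send. The following is only a rough sketch: it assumes the `prepare_request` method of LWP::UserAgent, which fills in the agent's default headers without making a network request. The URL and agent string are the ones used in this thread; the variable names are illustrative.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

# Build the request by hand so it can be inspected before anything is sent.
my $request = HTTP::Request->new(GET => 'http://www.wowarmory.com/');

# Copy the agent's default headers (including User-Agent) into the request.
# Nothing goes over the network at this point.
$ua->prepare_request($request);

# Show the request line and headers as LWP would send them.
print $request->as_string;
```

If the User-Agent line printed here differs from the one your browser sends, that difference is the first thing to rule out.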

        That worked!

        Thanks a lot.

      Firefox DownThemAll addon retrieves 183 bytes.
      wget retrieves 23k.

      I think it's safe to say it's some sort of header :-)

      Update: fixed in first reply.

      -David