Okay, maybe not really a Perl question per se, but here goes:

I have a small program (gethttp) that makes simple HTTP requests and prints the response to stdout. Unfortunately, LWP is not available, so it had to be done manually using IO::Socket. I copied the program from one of the Perl man pages and modified it only slightly.

For the most part, this program works just great. Every once in a while, however, I find a page (usually a CGI) that doesn't seem to work. What I get back is a 404 error, though I know the page is there because I can access it using my web browser.

I figure there must be something going on that I'm just not getting. I can't find any documents anywhere explaining a different syntax for the HTTP GET, and I can't see anything wrong with my Perl code. I really just want to understand why it's not working and what's going on, though it might have a practical application in a project I'm working on if I can get it to work.

I'm including the code from my program below, as well as a URL that it doesn't work on and the response I get back. (I don't know if everyone can get to the URL, since it might be set up as private to UF. Let me know if you have problems.)

I would appreciate any help very much!

The Program...

#!/usr/bin/perl -w
use IO::Socket;

unless (@ARGV) { die "usage: $0 URL\n" }

$EOL   = "\015\012";
$BLANK = $EOL x 2;
$sep   = (@ARGV > 1) ? "-------------------\n" : "";

foreach $url ( @ARGV ) {
    unless ($url =~ m{^http://(.*?)/}) {
        print "$0: invalid url: $url\n";
        next
    }
    $host = $1;
    $remote = IO::Socket::INET->new(
        Proto    => "tcp",
        PeerAddr => $host,
        PeerPort => "http(80)",
    );
    unless ($remote) { die "Cannot connect to http daemon on $host\n" }
    $remote->autoflush(1);
    print $remote "GET $url HTTP/1.0" . $BLANK;
    while ( <$remote> ) { print }
    print "\n$sep";
    close $remote;
}

The Response

$ ./gethttp 'http://login.gatorlink.ufl.edu/authenticate.cgi'
HTTP/1.0 404 Not Found
Date: Fri, 12 Jan 2001 22:57:21 GMT
Server: Apache/1.3.6 (Unix) mod_perl/1.19 mod_ssl/2.2.8 OpenSSL/0.9.2b
Connection: close
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD><BODY>
<H1>Not Found</H1>
The requested URL http://login.gatorlink.ufl.edu/authenticate.cgi was not found on this server.<P>
</BODY></HTML>
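
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the program puts the whole URL on the request line and sends no Host header. The full-URL form is what a client normally sends to a proxy; many origin servers expect only the path there, and a server hosting several sites under one address typically needs a Host header to pick the right one. The 404 page echoing the full URL back would be consistent with that reading. Below is a minimal sketch of building such a request. The helper names split_url and build_request are hypothetical and not part of the original program, and nothing here has been tested against the UF server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $EOL = "\015\012";

# Hypothetical helper: split an http:// URL into host, port, and path.
# Returns an empty list if the URL is not of the form
# http://host[:port][/path].
sub split_url {
    my ($url) = @_;
    return unless $url =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?\z};
    my ($host, $port, $path) = ($1, $2, $3);
    $port = 80  unless defined $port;   # default HTTP port
    $path = '/' unless defined $path;   # bare host means the root path
    return ($host, $port, $path);
}

# Hypothetical helper: build a request whose request line names only
# the path, with the host name carried in a separate Host header.
sub build_request {
    my ($host, $port, $path) = @_;
    my $host_header = $port == 80 ? $host : "$host:$port";
    return "GET $path HTTP/1.0$EOL"
         . "Host: $host_header$EOL"
         . $EOL;                        # blank line ends the headers
}

# Show the request that would be sent for the URL from the post.
my ($host, $port, $path) =
    split_url('http://login.gatorlink.ufl.edu/authenticate.cgi');
print build_request($host, $port, $path);
```

In gethttp, the returned string would be printed to $remote in place of "GET $url HTTP/1.0" . $BLANK, with the socket opened on the host and port that split_url returns. If the 404 persists with this request, the cause lies elsewhere.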

In reply to HTTP GET without LWP by bbfu
