Re^3: splitting a string that appears inconsistently in structure

by fullermd (Priest)
on Jan 02, 2009 at 12:12 UTC ( [id://733744]=note )


in reply to Re^2: splitting a string that appears inconsistently in structure
in thread splitting a string that appears inconsistently in structure

10.16.0.2 - - [19/Jan/2008:03:45:06 -0800] "GGG99994" 200 752 "-" "-" "10.16.0.2"
(here we have no method, no discernible request, and no version)

Not exactly true. You have a method. It's just a really weird (and probably invalid) one. I'm not sure why your server would 200 it; I can only presume some slightly odd config.
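For context, here is a rough sketch of how $request might be pulled out of a log line like the one quoted. The field layout (host, ident, user, timestamp, quoted request, status, bytes) is my assumption based on that one sample line, and the regex does not handle escaped quotes inside the request, so treat it as illustrative only.

```perl
use strict;
use warnings;

# Sample line from the post, trimmed after the byte count for brevity.
my $line = '10.16.0.2 - - [19/Jan/2008:03:45:06 -0800] "GGG99994" 200 752';

# Assumed layout: host, ident, user, [timestamp], "request", status, bytes.
# A request containing an escaped double quote would need a smarter pattern.
my ($host, $when, $request, $status, $bytes) =
    $line =~ /^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)/
    or die "Unparseable log line: $line\n";

print "request=[$request] status=$status\n";
```

Everything between the double quotes ends up in $request, however odd it looks, which is why "GGG99994" comes through as the whole request.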

Take it in individual steps. First try splitting out into the 3 main pieces:

    my ($method, $uri, $proto, @extra) = split /\s+/, $request;
    die "Unexpected extra bits in request: @extra" if @extra > 0;
    die "No method" unless defined $method;
    # Or whatever other error-handling mechanism you want

You shouldn't have any extra bits, because if you do, your $method, $uri, and $proto may not hold what you expect them to, so that case needs error-checking.

As well, you should have a method. The minimal possible HTTP request AFAIK would be a method of " ", with nothing else. That would leave all the vars undefined, and probably isn't something you care about anyway, so another error there.

The protocol may not be there. But expect that in higher level code, or defined-or it to an empty string here if you prefer.
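Pulling those checks together, a minimal sketch might look like the following. The parse_request name is made up for illustration, and the defined-or operator (//=) needs Perl 5.10 or later; on older perls, use "$proto = '' unless defined $proto;" instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split the request
# field into method, URI, and protocol, dying on anything unexpected.
sub parse_request {
    my ($request) = @_;

    my ($method, $uri, $proto, @extra) = split /\s+/, $request;
    die "Unexpected extra bits in request: @extra\n" if @extra > 0;
    die "No method\n" unless defined $method;

    # The protocol may legitimately be absent; defined-or it to ''.
    $proto //= '';

    # $uri is deliberately left alone here and may still be undef.
    return ($method, $uri, $proto);
}

for my $request ('GET /index.html HTTP/1.0', 'GET /index.html', 'GGG99994') {
    my ($method, $uri, $proto) = parse_request($request);
    printf "method=[%s] uri=[%s] proto=[%s]\n",
        $method, (defined $uri ? $uri : '(undef)'), $proto;
}
```

Dying is just one option; returning an empty list or logging and skipping the line would work equally well, depending on how the calling code wants to treat bad lines.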

That leaves the URI. Using URI::Split as suggested above in Re: splitting a string that appears inconsistently in structure would be better than trying to split it up manually. Imagine, for instance, the case of having a '?' in the password; a simple regexp would give you a wrong answer then.

Note that the $uri can be undefined. A request of just "GET " is interpreted as "GET /" (similarly with POST), and would leave $uri undefined after that split. You probably want to make sure it's defined (as an empty string in this case) before you pass it to uri_split(). The URI::Split docs say:

The $path part is always present (but can be the empty string) and is thus never returned as "undef".

So take care not to blow up if it's empty.
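As a rough sketch of that last step, assuming the URI distribution from CPAN is installed (it provides URI::Split): the split_target name is hypothetical, and mapping an empty path to "/" is just one way to follow the "GET " means "GET /" reading above.

```perl
use strict;
use warnings;
use URI::Split qw(uri_split);

# Hypothetical helper: break the request target into path and query.
sub split_target {
    my ($uri) = @_;

    # A bare "GET " leaves $uri undefined after the split; pass '' instead.
    $uri = '' unless defined $uri;

    # uri_split() returns (scheme, authority, path, query, fragment).
    # For a bare path/query string, scheme and authority come back undef.
    my ($scheme, $auth, $path, $query, $frag) = uri_split($uri);

    # $path is never undef, but it can be empty; treat that as "/".
    $path = '/' if $path eq '';

    return ($path, $query);
}

my ($path, $query) = split_target('/cgi-bin/search?q=perl');
print "path=[$path] query=[$query]\n";
```

Note that $query is undef when the target has no "?" at all, so check it with defined before using it.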

Replies are listed 'Best First'.
Re^4: splitting a string that appears inconsistently in structure
by fullermd (Priest) on Jan 02, 2009 at 12:35 UTC
    That leaves the URI.

    For the sake of precision, by the by, it's not really a URI we've got here, it's just the path/query bit of it. But uri_split() does the right thing.
