PerlMonks
LWP Form scraping
by rinceWind (Monsignor) on Jan 14, 2003 at 10:54 UTC [id://226795]
I have recently been working on an LWP exercise. Rather than trivial link spidering, it involved filling in forms and submitting them with the POST method. It struck me that boiling down a page's form data is something that has probably been done many times over, but a search didn't turn up anything obvious. I feel a CPAN module coming on (unless it's already been done and I've missed it), but I'm stuck on which namespace to use: HTML::Formdata, HTTP::Formdata or LWP::HTMLForm. Thoughts, please.
The code I have written so far includes a sub, formdata, which takes the HTML page and a form name as parameters. The form name is optional; if none is specified, the routine picks up the first form on the page.
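For readers who want something concrete, here is a minimal sketch of a sub with that interface. It is not the code described above. It assumes the HTML::Form module from libwww-perl is installed and leans on it for the parsing; the base URI is a placeholder, since HTML::Form wants one to resolve the form's action.

```perl
use strict;
use warnings;
use HTML::Form;    # assumption: available from libwww-perl / CPAN

# Sketch only: return the fields of one form as a flat list of
# key/value pairs. $form_name is optional; without it the first
# form on the page is used.
sub formdata {
    my ($html, $form_name) = @_;

    # Placeholder base URI (an assumption for this sketch).
    my @forms = HTML::Form->parse($html, 'http://localhost/');

    my $form;
    if (defined $form_name) {
        ($form) = grep {
            defined $_->attr('name') && $_->attr('name') eq $form_name
        } @forms;
    }
    else {
        $form = $forms[0];
    }

    return unless $form;    # no matching form found
    return $form->form;     # list of key/value pairs
}
```

Called as formdata($html, 'search'), this picks the form whose name attribute is "search"; called as formdata($html), it falls back to the first form.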
Has anything like this been done before? Please let me know if I am duplicating effort here.
The sub returns a list of key/value pairs. Thinking about it, I realised that if the calling code turns this list into a hash, any duplicate keys are lost. At this point, the light of recognition came on in my mind: this is a very familiar concept, that of a CGI object. I could make formdata return a CGI object, or something inheriting from CGI, giving access to all the input fields via $form->param.
Besides handling forms submitted via a normal POST with encoding type application/x-www-form-urlencoded, I would also like the code to handle file uploads and encoding type multipart/form-data.
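To make the duplicate-key concern concrete, here is a small core-Perl illustration. The field names and values are invented for the example. Assigning the pair list straight to a hash keeps only the last value for each key, whereas collecting the values into array references keeps them all, which is roughly what a param-style interface offers.

```perl
use strict;
use warnings;

# Hypothetical pair list, as formdata might return for a form that
# has two fields named "colour".
my @pairs = (colour => 'red', colour => 'blue', size => 'L');

# Naive conversion: the second "colour" overwrites the first.
my %flat = @pairs;

# Keeping every value: one array reference per key.
my %multi;
for (my $i = 0; $i < @pairs; $i += 2) {
    push @{ $multi{ $pairs[$i] } }, $pairs[$i + 1];
}

print "flat:  colour = $flat{colour}\n";
print "multi: colour = @{ $multi{colour} }\n";
```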