in reply to Re^3: getting LWP and HTML::TokeParser to run
in thread getting started with LWP and HTML::TokeParser
Hello Marto, hello Marshall,
Many thanks for the hints. I am going to run some tests with Mechanize! I will use Mechanize instead of LWP!
By the way, I can read on CPAN:
Features include:
* All HTTP methods
* High-level hyperlink and HTML form support, without having to parse HTML yourself
* SSL support
* Automatic cookies
* Custom HTTP headers
Mech supports performing a sequence of page fetches including following links and submitting forms. Each fetched page is parsed and its links and forms are extracted. A link or a form can be selected, form fields can be filled and the next page can be fetched. Mech also stores a history of the URLs you've visited, which can be queried and revisited. (end of citation)
Well, does this mean that I do not have to parse the result page with HTML::TokeParser? In other words, the feature list says: "High-level hyperlink and HTML form support, without having to parse HTML yourself". Unbelievable! I can hardly believe it! Does this mean that I do not have to parse the fetched HTML pages?
Can I get the data set of each of the 5000 pages with Mechanize?
Well, I have to run some tests! And perhaps someone can set me straight here!
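A minimal sketch of such a test, assuming a hypothetical start URL, form field name, and link pattern (none of these are from the real site): Mech does extract links and forms for you, but pulling specific data fields out of a fetched page still needs a parser such as HTML::TokeParser.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical start URL -- replace with the real site's address.
my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->get('http://example.gov/search');

# Fill in and submit the search form without parsing any HTML by hand.
# 'query' is an assumed field name.
$mech->submit_form(
    form_number => 1,
    fields      => { query => 'some term' },
);

# Follow every result link and fetch each page's content.
# The url_regex is an assumption about how result links look.
for my $link ( $mech->find_all_links( url_regex => qr/result/ ) ) {
    $mech->get( $link->url_abs );
    my $html = $mech->content;   # raw HTML of the fetched page
    # ... extract the data set here (e.g. with HTML::TokeParser) ...
    $mech->back;                 # return to the result list
}
```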
BTW: you are right, Marshall: this is a "very huge government website that performs very well." I do not think I will run into any trouble.
After the first trials I will come back and report all my findings.
Until soon!
Perlbeginner1
Replies are listed 'Best First'.
Re^5: getting LWP and HTML::TokeParser to run
by marto (Cardinal) on Oct 10, 2010 at 18:16 UTC
Re^5: getting LWP and HTML::TokeParser to run
by Marshall (Canon) on Oct 10, 2010 at 18:51 UTC
by Perlbeginner1 (Scribe) on Oct 10, 2010 at 19:33 UTC
In Section: Seekers of Perl Wisdom