
I would go with marto's advice about WWW::Mechanize. I haven't used it yet, but I hear that it is great. I suspect that you will find it easier to use than any advice I could give about decoding the raw HTML to get the next pages to "click" on. You are getting about 5K pages from a huge government website that performs very well. I wouldn't worry too much about fancy error recovery with retries unless you are going to run this program often.

Update:
You can of course parse the HTML content of the search results with regex, but this is a mess...

my (@hrefs) = $mech->content =~ m|COMPLETEHREF=http://www.kultus-bw.de/did_abfrage/detail.php\?id=\d+|g;

print "$_\n" foreach @hrefs;    #there are 5081 of these

#these COMPLETEHREF's can be appended to a main url like this:

my $example_url = 'http://www.kultusportal-bw.de/servlet/PB/menu/1188427/index.html?COMPLETEHREF=http://www.kultus-bw.de/did_abfrage/detail.php?id=04146900';
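Putting the two snippets together, here is a minimal sketch of turning the matched COMPLETEHREFs into full detail-page URLs. It runs without a network connection: $content is a made-up stand-in for $mech->content, and the second id (04146901) is invented for illustration. Only the base URL and the first id come from the example above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $mech->content; the real page has 5081 of these links.
my $content = <<'HTML';
<a href="index.html?COMPLETEHREF=http://www.kultus-bw.de/did_abfrage/detail.php?id=04146900">School A</a>
<a href="index.html?COMPLETEHREF=http://www.kultus-bw.de/did_abfrage/detail.php?id=04146901">School B</a>
HTML

# Base URL taken from the example URL above.
my $base = 'http://www.kultusportal-bw.de/servlet/PB/menu/1188427/index.html';

# Same pattern as above, with the literal dots escaped.
my @hrefs = $content =~ m{COMPLETEHREF=http://www\.kultus-bw\.de/did_abfrage/detail\.php\?id=\d+}g;

# Each match already starts with "COMPLETEHREF=", so it only needs the "?" separator.
my @detail_urls = map { "$base?$_" } @hrefs;

print "$_\n" for @detail_urls;
```

With the real page you would assign $mech->content to $content instead of the heredoc, and then fetch each entry of @detail_urls in turn.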

Then things get hairy, and you will want to whip out some of that HTML parser voodoo to parse the resulting table. Also, the character encodings aren't consistent: for example, the page has ä, but not ü, which is coded as ü
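One way to cope with mixed encodings, assuming the inconsistency is a mix of raw characters and named HTML entities (I have not verified that against the live page), is to normalize everything to plain characters before parsing the table. The lookup table and the sample string below are made up for illustration and cover only the German umlauts and sharp s. For the general case, the CPAN module HTML::Entities provides decode_entities.

```perl
use strict;
use warnings;
use utf8;                                  # this source file contains literal umlauts
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical lookup table: only the German umlauts and sharp s.
my %entity = (
    auml  => 'ä', ouml => 'ö', uuml => 'ü',
    Auml  => 'Ä', Ouml => 'Ö', Uuml => 'Ü',
    szlig => 'ß',
);

# Replace the named entities we know; leave anything else untouched.
sub normalize_umlauts {
    my ($text) = @_;
    $text =~ s/&(\w+);/exists $entity{$1} ? $entity{$1} : "&$1;"/ge;
    return $text;
}

# Hypothetical sample mixing entities with a raw character.
my $mixed = 'B&auml;ckerstra&szlig;e in Tübingen';
print normalize_umlauts($mixed), "\n";
```

Unknown entities such as &amp; pass through unchanged, so this can run before the table parsing without disturbing the rest of the markup.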

In reply to Re^3: getting LWP and HTML::TokeParser to run by Marshall
in thread getting started with LWP and HTML::TokeParser by Perlbeginner1
