When you say "incremental updates", does each refresh contain all of the preceding information?
If so, you probably only need the final page, which, from your description, should be easy to detect from the presence of the summary information.
Presumably the intermediate pages displayed in the browser are fetched as the result of a meta refresh tag or a JavaScript refresh every few minutes? When automating this, you wouldn't need those auto-refreshes, as you are only going to discard them, but it may be necessary to fetch them anyway: the server may cancel the processing if it doesn't see a refresh request at regular intervals.
Depending upon the complexity of the page and the refresh mechanism used, you might get away with using LWP::Simple to fetch the URL repeatedly (at appropriately timed intervals), scanning the content returned and discarding it until it contains the summary information.
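A minimal sketch of that polling approach. The URL, the refresh interval, and the "Summary" marker are all placeholders here; you'd substitute whatever text reliably identifies the final page on your site:

```perl
use strict;
use warnings;
use LWP::Simple qw( get );

# Placeholder URL and marker: substitute the real page's address and
# whatever text reliably distinguishes the final summary page.
my $url    = 'http://example.com/report';
my $marker = qr/Summary/;

# Returns true once the fetched content contains the summary marker.
sub has_summary {
    my ($content) = @_;
    return defined $content && $content =~ $marker;
}

# Poll at roughly the interval the browser would refresh, discarding
# intermediate pages until the summary appears. Commented out so the
# sketch stays self-contained; uncomment against a real server.
# my $content;
# do {
#     sleep 120;                # match the page's refresh interval
#     $content = get($url);     # LWP::Simple::get returns undef on failure
# } until has_summary($content);
```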
In more complex cases, you may need to scan the content returned by the first submit and extract the refresh URL from embedded JavaScript. It may even be necessary to re-scan each partial page returned to extract a different URL each time.
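Extracting the refresh target usually comes down to a couple of regexes. Both patterns below are illustrative only and would need adjusting to the actual markup of the pages involved:

```perl
use strict;
use warnings;

# Pull the next URL out of either a meta refresh tag or a simple
# JavaScript location assignment. Returns undef if neither is found.
sub extract_refresh_url {
    my ($html) = @_;

    # <meta http-equiv="refresh" content="120; url=/status?id=42">
    if ($html =~ /http-equiv\s*=\s*["']refresh["'][^>]*?
                  url\s*=\s*["']?([^"'>\s]+)/xi) {
        return $1;
    }

    # window.location = "/status?id=42";  (or location.href = ...)
    if ($html =~ /(?:window\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']/i) {
        return $1;
    }

    return undef;
}
```

In the looping case, you'd call this on each partial page and feed the result back into the next fetch.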
It might be easier to use WWW::Mechanize, though I'm not sure whether it copes with embedded JavaScript refreshes.
Providing a complete code example is pretty much impossible without seeing the pages involved. If the URL is public, you could post it (or /msg it to a willing responder if you don't want to overtax the server), and you might get a worked example.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.