IPC::Open, Parallel::ForkManager, or Sockets::IO for parallelizing?
by mldvx4 (Friar) on Sep 04, 2023 at 09:55 UTC
mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
I have a task where I fetch RSS and Atom feeds from different sites. Each URL points to a different host, and there are close to a thousand hosts, so fetching in parallel would greatly speed up the overall run without burdening any single remote server. Going one site at a time, the best approach seems to be a subroutine that uses LWP to fetch each feed and XML::Feed to process each successful response.
Which direction should I look in for an efficient way of running a dozen or two such subroutines concurrently? I want to limit the number of concurrent queries to fewer than two dozen, since larger numbers seem to trigger some kind of outgoing throttling from my ISP. Should I have the main script launch LWP scripts and communicate using IPC or sockets? Or should I try something like Parallel::ForkManager or similar? Or something else entirely?
Thanks for any tips or advice.
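For reference, the Parallel::ForkManager option mentioned above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested solution for this workload: the names fetch_feed and fetch_all, the cap of 20 workers, the 30-second timeout, and passing URLs on the command line are all hypothetical choices made for illustration. Each child process fetches one feed with LWP::UserAgent, parses it with XML::Feed, and hands a plain hash back to the parent through finish().

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Parallel::ForkManager;
use XML::Feed;

# Assumption: 20 keeps the load under the "two dozen" limit from the question.
my $MAX_WORKERS = 20;

# Hypothetical worker: fetch one feed and return a plain hash ref
# that can be serialized back to the parent process.
sub fetch_feed {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = $ua->get($url);
    return { url => $url, error => $res->status_line } unless $res->is_success;

    # Leave charset decoding to the XML parser.
    my $xml  = $res->decoded_content(charset => 'none');
    my $feed = XML::Feed->parse(\$xml)
        or return { url => $url, error => XML::Feed->errstr };

    return {
        url     => $url,
        title   => $feed->title,
        entries => [ map { +{ title => $_->title, link => $_->link } } $feed->entries ],
    };
}

# Run $worker on every URL with at most $max children alive at once.
# Returns a hash ref keyed by URL.
sub fetch_all {
    my ($max, $worker, @urls) = @_;
    my %result;
    my $pm = Parallel::ForkManager->new($max);

    # Runs in the parent each time a child exits; $data is whatever the
    # child passed to finish(), or undef if the child died abnormally.
    $pm->run_on_finish(sub {
        my ($pid, $exit, $ident, $signal, $core, $data) = @_;
        $result{$ident} = $data // { url => $ident, error => "child exited with $exit" };
    });

    for my $url (@urls) {
        $pm->start($url) and next;    # parent: go on to the next URL
        my $data = eval { $worker->($url) }
            // { url => $url, error => ($@ || 'worker returned nothing') };
        $pm->finish(0, $data);        # child: send the result back and exit
    }
    $pm->wait_all_children;
    return \%result;
}

# Only when run as a script: URLs are taken from the command line.
unless (caller) {
    my $results = fetch_all($MAX_WORKERS, \&fetch_feed, @ARGV);
    for my $url (sort keys %$results) {
        my $r = $results->{$url};
        if ($r->{error}) {
            print "$url: FAILED ($r->{error})\n";
        } else {
            printf "%s: %d entries\n", $url, scalar @{ $r->{entries} };
        }
    }
}
```

The constructor argument is what caps concurrency: start() blocks in the parent once that many children are running. Only plain data is passed to finish(), since the result has to be serialized between processes; the XML::Feed object itself stays in the child.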