PerlMonks
Twenty Questions
Further to salva's reply, try experimenting separately with the number of nodes and the length of the timeout. You may discover there is a relationship. If there is any variation, try plotting the number of nodes against the timeout needed for a successful run.

Try looking for problem nodes by splitting the list into halves, or by removing 5 or 10 different nodes each time. You may discover that one or two specific nodes get hung up, but only with a large number of nodes (so it could be network traffic congestion, and poor recovery to/from certain nodes).

What happens to process memory as the node count goes up? (Perhaps there's a memory leak or retention you aren't expecting.)

What happens if you run this from different hosts? Especially hosts not behind the same edge router as the original host?

Do you have a different large pool of target nodes, other than the original? How does it perform compared to the original?

Is there anything else you can vary?

-QM

In reply to Re: ssh output is partial when using fork manager
by QM
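The split-the-list-in-halves suggestion is a binary search over the node list. A minimal Perl sketch of that search is below; the node names and the batch predicate are hypothetical stand-ins (in real use the predicate would drive ssh via Parallel::ForkManager), and it assumes the failure is caused by a single node and still reproduces in smaller batches, which, as noted above, may not hold if the trigger is sheer node count.

```perl
use strict;
use warnings;

# Binary-search a node list for a single problem node. $batch_fails is a
# coderef taking an arrayref of nodes and returning true when running
# that batch hangs or yields partial output; here it is a stand-in.
sub find_bad_node {
    my ($nodes, $batch_fails) = @_;
    my @pool = @$nodes;
    while (@pool > 1) {
        my $mid   = int(@pool / 2);
        my @left  = @pool[0 .. $mid - 1];
        my @right = @pool[$mid .. $#pool];
        # Keep whichever half still reproduces the failure.
        @pool = $batch_fails->(\@left) ? @left : @right;
    }
    return $pool[0];
}

# Example: pretend hypothetical node "web07" is the one that hangs.
my @nodes   = map { sprintf "web%02d", $_ } 1 .. 16;
my $suspect = find_bad_node(
    \@nodes,
    sub { grep { $_ eq 'web07' } @{ $_[0] } },
);
print "suspect: $suspect\n";   # prints "suspect: web07"
```

With 16 nodes this isolates the suspect in four batch runs instead of sixteen single-node runs; removing 5 or 10 nodes at a time, as also suggested, trades more runs for keeping the batch size (and thus any load-dependent behaviour) closer to the original.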