With that many parallel processes all trying for network access, I have encountered some limitation. I think it's not in the host memory or number of processes, but somewhere deeper in the network drivers on the host. You can get a similar result by, for instance, trying to ping multiple hosts in parallel -- above a certain number of hosts, the network response goes horribly sluggish.
For ethernet, complete congestion results in many retries, with each retry picking a random wait time from an ever larger window (see [no such wiki, Exponential_backoff]). So 200 parallel processes would have many ethernet collisions, and some small fraction would end up with the maximum backoff time. At some point normal ssh connections timeout due to lack of activity, and drop.
I had exactly this problem with a little script I wrote years ago, before I knew about Parallel::ForkManager and the like. At the time it didn't matter that I didn't get all of the responses, and it wasn't for any automated system, just my own whims on finding a remote host with certain conditions. (See the doc page for how to limit the number of parallel processes.)
Quantum Mechanics: The dreams stuff is made of