You can go a long way with a “simple poll() loop” architecture. The app listens to one or more sockets simultaneously and responds appropriately when activity occurs on each one, but that activity is not actually “concurrent.” Instead, the application does exactly one thing at a time, in an unpredictable sequence, and it all works just fine because I/O, including network I/O, is much slower than the CPU. (CPUs think in terms of nanoseconds; network round trips take microseconds to milliseconds.) As long as each unit of work needs only an insignificant amount of wall-clock time, this internally very simple architecture works beautifully. So I would start with that design, and fully expect to finish with it too.
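To make that concrete, here is a rough sketch of such a loop in Python, using the standard library's `select.poll()` (available on Unix-like systems). The names `poll_once`, `serve`, and `handle`, the port number, and the 4096-byte read size are my own placeholders, not anything from your app, and the "work" is just upper-casing whatever arrives. It assumes replies are small enough to fit in the socket's send buffer; a real server would probably queue output and watch for `POLLOUT` as well.

```python
import select
import socket


def poll_once(poller, listener, clients, handle, timeout_ms=1000):
    """Wait for activity, then service the ready sockets one at a time."""
    for fd, events in poller.poll(timeout_ms):
        if fd == listener.fileno():
            # New connection: accept it and start watching it for input.
            try:
                conn, _addr = listener.accept()
            except BlockingIOError:
                continue  # the peer went away before we got to it
            conn.setblocking(False)
            clients[conn.fileno()] = conn
            poller.register(conn, select.POLLIN)
            continue

        conn = clients[fd]
        try:
            data = conn.recv(4096) if events & select.POLLIN else b""
        except BlockingIOError:
            continue  # spurious wakeup, nothing to read yet
        except ConnectionError:
            data = b""  # treat a reset like an ordinary close

        if data:
            # One short unit of work, done to completion before moving on.
            # Assumption: the reply is small, so sendall() will not block.
            conn.sendall(handle(data))
        else:
            # Empty read, hangup, or error: stop watching this socket.
            poller.unregister(fd)
            del clients[fd]
            conn.close()


def serve(host="127.0.0.1", port=9000, handle=bytes.upper):
    """Run the loop forever; host, port, and handler are placeholders."""
    listener = socket.create_server((host, port))
    listener.setblocking(False)
    poller = select.poll()
    poller.register(listener, select.POLLIN)
    clients = {}
    while True:
        poll_once(poller, listener, clients, handle)
```

Calling `serve()` runs the whole thing in a single thread. The point to notice is that `handle` runs to completion before the loop looks at the next socket, so the design only holds up while `handle` stays quick.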