There's a great discussion of summation algorithms in chapter 4 of *Accuracy and Stability of Numerical Algorithms* by Nicholas Higham (2002). The focus is naturally on numerical accuracy, but it also has lots of ideas and references for recursive, parallel and distributed summation algorithms. You could also have a look at some of NVIDIA's CUDA resources on fast summation algorithms for GPUs.
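To give a flavour of what a recursive scheme looks like, here is a minimal sketch of pairwise summation, which splits the array in half, sums each half and adds the two results. The function name `pairwise_sum` and the `block` cutoff of 8 are my own choices for illustration, not taken from the book or from any library.

```python
def pairwise_sum(xs, lo=0, hi=None, block=8):
    """Sum xs[lo:hi] by recursively splitting the range in half.

    `block` is an illustrative cutoff: ranges this short are summed
    with a plain loop to limit recursion overhead.
    """
    if hi is None:
        hi = len(xs)
    n = hi - lo
    if n <= block:
        # Base case: ordinary left-to-right accumulation.
        total = 0.0
        for i in range(lo, hi):
            total += xs[i]
        return total
    mid = lo + n // 2
    # The two halves are independent, so they could be summed in parallel.
    return pairwise_sum(xs, lo, mid, block) + pairwise_sum(xs, mid, hi, block)
```

Because the two halves don't depend on each other, the same tree shape maps naturally onto parallel or distributed reductions, and the rounding error grows roughly with the depth of the tree rather than with the number of elements.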

I would say you really don't need to worry about either the speed or the accuracy of summation in "normal" circumstances. It becomes an issue only in a few special cases: 1) very large volumes of data (say, arrays of gigabytes or terabytes), typical in supercomputing and high-performance scientific or financial computing; 2) very strict numerical accuracy requirements (accuracy and speed usually pull in opposite directions); 3) a very resource-limited computing environment (like a microcontroller). Of course, if you're summing four billion floating point numbers subject to a relative error of the order of machine epsilon on a microcontroller, then you need to worry about these things...
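As a small illustration of the accuracy versus speed trade-off, here is a sketch comparing a plain loop with compensated (Kahan) summation. The function names `naive_sum` and `kahan_sum` are just mine for this example.

```python
def naive_sum(xs):
    """Plain left-to-right summation: one addition per element."""
    total = 0.0
    for x in xs:
        total += x
    return total


def kahan_sum(xs):
    """Compensated (Kahan) summation: four floating point operations
    per element instead of one, in exchange for a much smaller error."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x in xs:
        y = x - comp            # apply the correction to the next term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost
        total = t
    return total


data = [0.1] * 10**5
print(naive_sum(data), kahan_sum(data))
```

The compensated version does roughly four times the arithmetic per element, which is exactly the kind of cost you only want to pay when the accuracy requirement calls for it. In Python specifically, the standard library's `math.fsum` already gives an accurately rounded sum, so you would rarely write this by hand.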

