All you need is for scanning that string to be in an inner loop and you are screwed even on modern hardware. The questioner is asking about making split faster and has presumably already found that to be the limiting step in his program. There are already benchmarks in this discussion showing that the split builtin is about 4–5 times faster than any of the Perl equivalents the monks have suggested so far.
If this is in an inner loop, yes, even a small speedup could separate "good enough" from user complaints that "it takes too long". The solution is probably to reorganize the program so that the string is only scanned once. If that is already the case, data may be arriving too fast for a single node to handle, and we would need to scale out into a cluster. The first step in that direction is always the hardest, because the added coordination overhead means you may need two or three nodes just to match what one node could previously handle. Of course you can then add more nodes once you have paid that cost, but this is getting way into "XY Problem" territory for this discussion.
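To illustrate what "only scanned once" might look like, here is a minimal sketch. The comma-separated record format, the sample data, and the names `name_of`, `city_of`, and `@report` are all made up for the example; the questioner's actual data may look nothing like this.

```perl
use strict;
use warnings;

# Hypothetical input: comma-separated records (assumed format, not from the thread).
my @lines = ("alice,30,NYC", "bob,25,LA");

# Wasteful pattern: each accessor rescans the entire line with its own split.
sub name_of { (split /,/, $_[0])[0] }
sub city_of { (split /,/, $_[0])[2] }

# Scan-once pattern: split each record a single time and reuse the fields.
my @report;
for my $line (@lines) {
    my ($name, $age, $city) = split /,/, $line;
    push @report, "$name ($age) lives in $city";
}

print "$_\n" for @report;
```

Calling both `name_of` and `city_of` on the same line runs split twice over the same string; the loop does it once per record, which is where the saving would come from if split really is the limiting step.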