http://qs321.pair.com?node_id=11128187


in reply to regex gotcha moving from 5.8.8 to 5.30.0?

I'm assuming the test data and sample code reproduce the slowness but don't really show all the requirements.

Is that right?

Because looking at the generated test data, it seems like there should be a much faster way of parsing it. Assuming, for example, that the important bits in the file are already newline-separated, a simpler line-by-line read that watches for /^begfoo/ and /^endfoo/ and keeps track of state feels like it would be faster.
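Something like this is what I have in mind. It's only a rough sketch, since I don't know the real data: I'm assuming the markers always sit at the start of a line and that blocks don't nest. The sub name extract_blocks and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines found between /^begfoo/ and /^endfoo/
# markers, reading one line at a time instead of matching across the whole file.
# Assumes markers start a line and blocks do not nest.
sub extract_blocks {
    my ($fh) = @_;
    my @blocks;
    my $current;    # undef while outside a block, array ref while inside one

    while ( my $line = <$fh> ) {
        if ( $line =~ /^begfoo/ ) {
            $current = [];              # start a new block
        }
        elsif ( $line =~ /^endfoo/ ) {
            push @blocks, $current if defined $current;
            $current = undef;           # back outside any block
        }
        elsif ( defined $current ) {
            push @$current, $line;      # body line of the open block
        }
    }
    return \@blocks;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = "junk\nbegfoo\na\nb\nendfoo\nmore junk\nbegfoo\nc\nendfoo\n";
open my $in, '<', \$sample or die "cannot open in-memory file: $!";
my $blocks = extract_blocks($in);
close $in;

printf "%d blocks\n", scalar @$blocks;
```

Each line gets at most two cheap anchored matches, so the regex engine never has to scan or backtrack across the whole file. Whether that fits depends on what the real code needs from each block.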

Can you provide a more real-world view of what's being done, or is it just too much test data and code to present here?

Edit: Not dismissing the importance of the performance regression...that's pretty bad.