PerlMonks
Re: regex gotcha moving from 5.8.8 to 5.30.0?
by mordibity (Acolyte) on Feb 10, 2021 at 18:20 UTC ( [id://11128192] )
Thank you all for your thoughts! I guess being coy isn't really helpful; I was just trying to avoid distractions.

This (really) is part of a parser I've written for structural Verilog netlists in an IC design environment. The whole parser does much more work and handles many more nuances of the input format, but I've been happy enough (for years) with the performance without trying to speed it up further (e.g. sw1's whitespace handling tips). For example, a typical report on a typical netlist input file (200+Mb) would run in ~10 seconds. But then CAD upgraded our central Perl (from 5.8.8 to 5.30) and the same code, on the same input, took 1 hour 38 minutes! (With the same output results, so it's purely a performance issue, not a correctness issue.) So then I set about cutting it down to a very small testcase with the 5 sec to 105 sec delta.

AnomalousMonk: it's just plain ASCII text; I tried throwing "aa" on it (for 5.30) and it didn't change things.

kschwab: below I've put a better fake-data generator to get a consistent 2.5x slowdown (still not as bad as the 10x on the real data, but hopefully more representative of the problem: the majority of the names would be unique strings, etc.). And here's the un-foo-ified cut-down parser, too.

SBECK: thank you! I guess I'll start looking at 5.20-related deltas. But I don't know that I'll be able to infer anything on my own; suggestions welcome from all on how to proceed (file a GitHub issue, etc.?).

The faker & parser are used like this:
makev.pl:
testv.pl: