http://qs321.pair.com?node_id=1160643


in reply to How to optimize a regex on a large file read line by line ?

How long are the lines in your file, and how many lines is it reading in total? Maybe reading it a line at a time is not the best approach for your data set.

Re^2: How to optimize a regex on a large file read line by line ?
by John FENDER (Acolyte) on Apr 16, 2016 at 14:59 UTC
    How long? Well, it can vary depending on the extract you make and the data you want to analyze. Some logs are huge, more than 2 GB... To start with: 10000000 lines for the password log and 185866729 lines for the dictionary file. The entries are not very long, no more than 8 or 16 chars I would say.

      There's no point trying to optimize your code if you're not sure what your data looks like. However, index will usually be faster than a regex if you're only looking for a fixed string.
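
To make the index suggestion concrete, here is a minimal sketch. The sample lines and the search string are made up for illustration; in the real case the lines would come from the log file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for lines read from the log.
my @lines  = ("alpha123\n", "secret99\n", "hunter2\n");
my $needle = 'secret';   # a fixed string, not a pattern

# index returns the zero-based position of the first match, or -1 if absent.
my @hits = grep { index($_, $needle) >= 0 } @lines;

# Equivalent regex form; \Q...\E quotes any metacharacters in $needle.
my @hits_re = grep { /\Q$needle\E/ } @lines;

print "index found: ", scalar(@hits), "\n";
print "regex found: ", scalar(@hits_re), "\n";
```

Both forms select the same lines. index only helps when the thing you are looking for really is a literal substring; anything needing anchors or alternation still calls for a regex.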

      As other people have recommended, profile your code and find out where the time is going.
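
As a rough starting point before a full profile, the core Benchmark module can compare the two approaches on data shaped like yours. This is a micro-benchmark, not a profiler, and the generated entries below are an assumption standing in for the real file.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical short entries, roughly the size of the 8-16 char lines described.
my @lines  = map { sprintf("entry%08d", $_) } 1 .. 10_000;
my $needle = 'entry00009999';

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    index => sub { my $n = grep { index($_, $needle) >= 0 } @lines; },
    regex => sub { my $n = grep { /\Q$needle\E/ } @lines; },
});
```

The numbers depend heavily on the data and the Perl version, so run it against a real extract. For an actual line-by-line profile of the whole script, Devel::NYTProf from CPAN (perl -d:NYTProf yourscript.pl, then nytprofhtml) is one commonly used option; it will also show whether the time is going into I/O and not the matching at all.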