XP is just a number | |
PerlMonks |
Re: About text file parsingby bliako (Monsignor) |
on Aug 29, 2018 at 10:52 UTC ( [id://1221297]=note: print w/replies, xml ) | Need Help?? |
If you insert the file in a RAM disk The additional benefit if you go the RAM disk way is that you can keep your input files in the disk for multiple perl runs, until the next reboot or until you remove them from RAM. So the second time you run a similar script to find different patterns you will see a better time benefit because the input is already in RAM. If you go the parallel way (as Discipulus mentioned) then you are bound by the total IO bandwidth of your hard disk. And so the benefits also may be different than just multiplying by the If your Not sample CC lines are a lot then you can filter them out before running all the different regexes on each line of input or even before running that perl script: for example, via grep -v 'Not sample CC' input.txt | perl ... or with a perl one-liner filter, but I am not sure perl beats grep. Of course you need the filter-out lines to have a common regex to filter them out. And finally, if you do manage to remove all the Not sample CC lines, it is worth trying the following and see if it is faster (caveat: results in %inp will be in random order and not the order of insertion as with arrays) :
Edit: If you want to pass the output of your command above to another command for further processing then the problem of waiting for a process to finish in order to get all its output out and run it through another command and so on has been solved a long time ago, it is called a pipeline and essentially is what you see in unix style cmd1 | cmd2 | cmd3 ... . cmd1 starts outputing results as soon as it reads its input (if it is a simple program as yours above), its output is immediately read by cmd2 which then spits its output as soon as the first line is read and on to cmd3 which finally gives you an output as soon as the first line of input is read by cmd1 plus the propagation time. So you save a lot of time and you have results coming out almost immediately. The provision is that processing one line or chunk of input must be independent of the following lines of input.
In Section
Seekers of Perl Wisdom
|
|