in reply to Matching 2 unordered arrays then printing the first array highlighting the matches?

Frankly, I do not see why using an %already_seen hash would be more complicated than a grep on the content of the smaller file. When I started to learn Perl about 10 years ago, it took me just a few days to be able to set up a hash-based solution for this type of problem; I learned to use the grep function only a number of months later.

Using a hash is also much more satisfying from an algorithmic point of view. Granted, with about 1,000 records, it does not make much of a difference, and, to tell the truth, I am also sometimes lazy. Just today I found a piece of code (not in Perl, but in a programming language with no built-in sort function) that I wrote less than a year ago, in which I developed a simple 5-line bubble sort, despite its known inefficiencies, just because I knew that the lists to be sorted would usually have an average of 2 to 3 elements and never more than 6. Worse than that, in that same language I regularly use a goto instruction, because in my experience it is the best way to jump forward out of a code block when looping over the records of a database (I use it exclusively as an equivalent of the next instruction in a Perl loop). I mention these things just to show that I am really not a CS purist or bigot; I also like to keep things simple when there is no point in making them complicated. In the case in point, however, the idea of visiting 1042 records 1882 times (close to two million comparisons) when a hash lets you visit them only once (or twice, depending on what you count), at no cost in terms of code simplicity, really goes against every tenet that I have learned over my past 30 years of programming experience.
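For what it is worth, here is a minimal sketch of the kind of thing I have in mind. The two arrays and their contents are made up for illustration (they stand in for the records of the two files), and marking a match with asterisks is just my assumption about what "highlighting" could look like:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two files in the thread.
my @big   = qw(apple banana cherry date elderberry);
my @small = qw(cherry apple fig);

# One pass over the smaller list to build the lookup hash.
my %already_seen = map { $_ => 1 } @small;

# One pass over the larger list, keeping its original order.
# Each test is a single hash lookup rather than a scan of @small.
my @marked = map { $already_seen{$_} ? "*${_}*" : $_ } @big;

print "$_\n" for @marked;
```

The grep version would replace the hash lookup with something like `grep { $_ eq $item } @small` for every item of the larger list, which rescans the smaller list each time. The hash version is no longer to write, and each list is walked only once.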