|go ahead... be a heretic|
The only problem with using arrays rather than hashes is that if, for example, all your line identifiers start with '0030nnnn', then with an array you would have space allocated for 300,000 elements (indices 0 .. 299999) that would never be used but would still take up space. (This is what I meant above by "if your numbers are low and mostly sequential".)
In this case, you would be much better off using hashes as a "sparse array". The same is true for your station numbers. With just three stations numbered 1250..1252 on line 000301038, using hashes will definitely save you a lot of memory.
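To make the difference concrete, here is a minimal sketch. It assumes a nested line-to-station layout, and the names `@by_array` and `%by_hash` and the stored values are made up for illustration; they are not taken from your code. It stores the same three stations both ways and reports how many slots each structure ends up holding:

```perl
use strict;
use warnings;

# Hypothetical sample data: three stations on one line.
# The line id is kept as a string; as a bare literal, the leading
# zeros would make Perl try to parse it as octal.
my $line     = '000301038';
my @stations = ( 1250 .. 1252 );

# Array-based: indexing by the numeric value makes Perl extend each
# array up to the highest index used, whether or not the lower
# slots ever hold anything.
my @by_array;
$by_array[ $line ][ $_ ] = "station $_" for @stations;

# Hash-based "sparse array": only the keys actually stored exist.
my %by_hash;
$by_hash{ $line }{ $_ } = "station $_" for @stations;

printf "outer array slots: %d\n", scalar @by_array;                   # 301039
printf "inner array slots: %d\n", scalar @{ $by_array[ $line ] };     # 1253
printf "outer hash keys:   %d\n", scalar keys %by_hash;               # 1
printf "inner hash keys:   %d\n", scalar keys %{ $by_hash{ $line } }; # 3
```

The array version reserves a slot for every index below the highest one used, while the hash version holds only the four keys that were actually stored. A side effect worth knowing: the hash keeps the identifier as the string '000301038', leading zeros included, whereas the array index is its numeric value 301038.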
Note also that I made an error (pointed out by alexm in the post following mine) when I typed:
It should be
Or, if you go with hashes as I think you probably should having seen the real data: