So in short, you have a static 5 GB dataset that you need to search frequently.
I think your best bet would be to use a database to index the data, and let it worry about how to build an optimised index.
I would put the entire file contents into the database and discard the original file. If each line also contains lots of other data that you will not be searching on, I would still keep it in the database, but in a separate column without an index, so that it does not bloat the index.
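As a rough sketch of what I mean, here is how it could look with SQLite via Python's built-in `sqlite3` module. I am assuming each line is a search key, then a tab, then the rest of the data; your real format will differ, and the table, column and function names (`records`, `key`, `payload`, `build_index`, `lookup`) are just made up for illustration.

```python
import sqlite3


def build_index(lines, db_path=":memory:"):
    """Load 'key<TAB>payload' lines into SQLite and index only the key column."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (key TEXT NOT NULL, payload TEXT)"
    )
    # Assumed line format: search key, a tab, then everything else.
    # partition() yields (key, sep, payload); [::2] keeps key and payload.
    rows = (line.rstrip("\n").partition("\t")[::2] for line in lines)
    with conn:  # one transaction for the whole bulk load
        conn.executemany(
            "INSERT INTO records (key, payload) VALUES (?, ?)", rows
        )
        # Build the index after loading; the payload column stays unindexed.
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_records_key ON records (key)"
        )
    return conn


def lookup(conn, key):
    """Return the payloads of all rows whose key matches exactly."""
    cur = conn.execute("SELECT payload FROM records WHERE key = ?", (key,))
    return [row[0] for row in cur]
```

For the real file you would pass the open file object as `lines` and a path on disk as `db_path`, so the load only has to happen once and later searches just open the existing database.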