In a project like this I really recommend going with a SQL
database of some type. (MySQL and Postgres are both viable
free alternatives; see
this column for a good overview of each one's strengths and
weaknesses. Also, see
this for a good intro to some of the things you need to
be aware of, like data modeling and so forth.)
If you want to do cross-referencing, SQL offers a powerful
tool for that type of thing: joins between tables. SQL in
general lets you express very complex tasks in very few words.
I would recommend SQL not merely for performance reasons but
for the degree to which it will help you deal with some of
the complexity it sounds like you want to work into this
project.
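
To give a rough idea of what I mean by cross-referencing, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs without a server; the same SQL should carry over to MySQL or Postgres with little change. The table names (items, xrefs), the sample rows, and the related_to helper are all made up for illustration and are not taken from your project.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with hypothetical example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- One row per cross-reference: from_id refers to to_id.
        CREATE TABLE xrefs (
            from_id INTEGER NOT NULL REFERENCES items(id),
            to_id   INTEGER NOT NULL REFERENCES items(id),
            PRIMARY KEY (from_id, to_id)
        );
        INSERT INTO items (id, name) VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
        INSERT INTO xrefs (from_id, to_id) VALUES (1, 2), (1, 3), (2, 3);
    """)
    return conn

def related_to(conn, name):
    """Return the names of all items that the named item cross-references."""
    rows = conn.execute(
        """
        SELECT target.name
        FROM items AS source
        JOIN xrefs ON xrefs.from_id = source.id
        JOIN items AS target ON target.id = xrefs.to_id
        WHERE source.name = ?
        ORDER BY target.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

conn = build_demo_db()
print(related_to(conn, "apple"))  # ['pear', 'plum']
```

The whole lookup is one query with two joins. Doing the same thing by hand with flat files or nested hashes tends to take a good deal more code, which is the complexity I had in mind.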
Good luck,
Mark