I agree. You're heading in a direction that relational databases and SQL cover quite well.
If you choose to continue on in Perl, you might get some mileage out of mapping your data rows into hashes.
while ( <DATA2> ) {
    chomp;
    my %record;    # a fresh hash for each row
    # hash slice: assign the whitespace-separated fields to named keys
    @record{qw(ID COMP TYPE DOC REF)} = split;
    push @data2, \%record;
}
will give you an array of references to hashes, each holding one record from your second data file. (You'll need to fill in some code around this, such as opening the DATA2 filehandle and declaring @data2, but that should be obvious.) Then, to get an array of IDs, you would do something like
my @ids = map { $_->{ID} } @data2;
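Put together, the two snippets might look like the self-contained sketch below. The sample rows are made up for illustration, and I'm assuming each line of your file holds exactly five whitespace-separated fields in the order ID COMP TYPE DOC REF. It reads from an in-memory string rather than a real DATA2 file so that it runs as-is; swap in your own open call.

```perl
use strict;
use warnings;

# Hypothetical sample rows (not from the original data file). Assumed
# layout: five whitespace-separated fields per line, ID COMP TYPE DOC REF.
my $sample = <<'END';
101 acme widget doc1 ref1
102 globex gadget doc2 ref2
END

# Read from an in-memory filehandle so the sketch needs no external file.
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";

my @data2;
while ( my $line = <$fh> ) {
    chomp $line;
    next unless $line =~ /\S/;    # skip blank lines
    my %record;                   # a fresh hash for each row
    @record{qw(ID COMP TYPE DOC REF)} = split ' ', $line;
    push @data2, \%record;
}
close $fh;

# Pull one column out of the array of hash references.
my @ids = map { $_->{ID} } @data2;
print "@ids\n";    # prints "101 102"
```

Because %record is declared inside the loop, every pass gets a new hash, so each reference pushed onto @data2 points at its own record.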
I'll leave the queries as an exercise. You should be able to piece them together based on information given so far in this thread.