in reply to Re: Re: Refining a 'vector space search'.

in thread Refining a 'vector space search'.

You can get away with a sparse representation of the vector, i.e. you can keep just the positions of the elements that are non-zero. Say the vector looks like:

    (0, 1, 0, 0, 0, 1, 0, 0, 1, 0)

then you can encode it by a list containing just the (zero-based) positions of the ones:

    ( 1, 5, 8 )

This can be a huge space saver if the vector is really large and mostly zeros. For 0/1 vectors like this one it is also very easy to calculate the cosine between two vectors in that representation: just count the number of elements the two lists have in common and divide by the square root of the product of the lengths of the two lists.
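As a rough illustration of the idea, here is a minimal Python sketch. It assumes the vectors only contain 0s and 1s, that positions are zero-based, and that the cosine with an empty vector should be reported as 0. The names `to_sparse` and `sparse_cosine` and the second vector `query` are made up for this example.

```python
import math


def to_sparse(vector):
    """Return the zero-based positions of the non-zero elements."""
    return [i for i, x in enumerate(vector) if x != 0]


def sparse_cosine(a, b):
    """Cosine between two 0/1 vectors given as lists of non-zero positions."""
    if not a or not b:
        # Assumption: an empty vector has no direction, so report 0.
        return 0.0
    # The dot product of two 0/1 vectors is the number of shared positions.
    common = len(set(a) & set(b))
    # The norm of a 0/1 vector is sqrt(number of ones), so the product of
    # the two norms is sqrt(len(a) * len(b)).
    return common / math.sqrt(len(a) * len(b))


doc = to_sparse((0, 1, 0, 0, 0, 1, 0, 0, 1, 0))
query = [1, 8, 9]  # hypothetical second vector, already in sparse form

print(doc)                        # the positions 1, 5 and 8
print(sparse_cosine(doc, query))  # 2 common positions / sqrt(3 * 3)
```

Note that this shortcut only works because every non-zero element is 1. With weighted vectors (term frequencies, say) you would have to store the weights alongside the positions and compute a real dot product.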

Hope this helps, -gjb-
