note
gjb
<p>You can get away with a sparse representation of the vector, i.e. you can keep just the positions of the elements of the vector that are non-zero. Say the vector looks like:
<code>
(0, 1, 0, 0, 0, 1, 0, 0, 1, 0)
</code>
then you can encode it as a list containing just the (zero-based) positions of the non-zero elements:
<code>
( 1, 5, 8 )
</code>
This can be a huge space saver if the vector is really large and mostly zeros. For 0/1 vectors like the one above, it is very easy to calculate the cosine between two vectors in that representation: just count the number of common elements in the two lists and divide by the square root of the product of the lengths of the two lists.</p>
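<p>A minimal sketch of that calculation in Python, assuming the vectors contain only 0s and 1s and that positions are zero-based. The function names <code>to_sparse</code> and <code>sparse_cosine</code> are made up for illustration, and returning 0 for an all-zero vector is an assumption, since the cosine is undefined in that case:</p>

```python
import math


def to_sparse(vector):
    """Return the zero-based positions of the non-zero elements."""
    return [i for i, x in enumerate(vector) if x != 0]


def sparse_cosine(a, b):
    """Cosine between two 0/1 vectors given as lists of non-zero positions."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        # Assumption: the cosine with an all-zero vector is undefined,
        # so this sketch simply returns 0.0.
        return 0.0
    # The dot product of two 0/1 vectors is the number of shared positions.
    common = len(set_a & set_b)
    # The norm of a 0/1 vector is the square root of its non-zero count.
    return common / math.sqrt(len(set_a) * len(set_b))


v = to_sparse((0, 1, 0, 0, 0, 1, 0, 0, 1, 0))
w = [5, 8, 9, 12]
similarity = sparse_cosine(v, w)
```

<p>Here <code>v</code> and <code>w</code> share two positions (5 and 8), so the result is 2 divided by the square root of 3 * 4. Using sets keeps the intersection step roughly linear in the number of non-zero elements rather than in the full vector length.</p>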
<p>Hope this helps, -gjb-</p>
269898
269907