http://qs321.pair.com?node_id=269903


in reply to Refining a 'vector space search'.

And the more unique words you have in all your combined documents, the more places/values you have. In one test, I calculated that there would currently be 22,000+ values in each vector/array (most are zeroes).

Was that count before or after stemming? When I played with vector space searching, stemming reduced the number of "words" significantly. But yes, you do end up with big vectors.
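
To make the question concrete, here is a rough Python sketch of how stemming shrinks the vocabulary, and with it the vector length. Everything in it is assumed for illustration: `crude_stem` is a toy suffix stripper standing in for a real stemmer (Porter's, say), the two sample documents are made up, and the function names are hypothetical. Since most slots are zero anyway, the vectors are kept as sparse term-to-count dicts rather than full-length arrays.

```python
import re

def crude_stem(word):
    """Toy suffix stripper; a stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase the text and pull out runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

def vocabulary(docs, stem=False):
    """Sorted list of unique terms across all docs: one vector slot per term."""
    terms = set()
    for doc in docs:
        for word in tokenize(doc):
            terms.add(crude_stem(word) if stem else word)
    return sorted(terms)

def sparse_vector(doc, stem=False):
    """Term -> count for one document; terms that are absent are implicit zeroes."""
    counts = {}
    for word in tokenize(doc):
        term = crude_stem(word) if stem else word
        counts[term] = counts.get(term, 0) + 1
    return counts

# Made-up sample documents, purely for illustration.
docs = [
    "The cat jumps and the cats jumped",
    "Searching documents, he searched a document",
]

raw_vocab = vocabulary(docs)
stemmed_vocab = vocabulary(docs, stem=True)

print("vector length without stemming:", len(raw_vocab))
print("vector length with stemming:   ", len(stemmed_vocab))
print("sparse vector for doc 0:", sparse_vector(docs[0], stem=True))
```

On this tiny sample the vocabulary drops from 12 terms to 8. How much a real collection shrinks depends on the stemmer and the text, so take the ratio here as nothing more than an illustration.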