From the code, I think any string will do as a salt. All it does is pass the salt as the second argument to sha1(), and all that does is append it to the first argument before hashing. The reason it talks about salts, plural, is that instead of using different hashing algorithms, it gets its vector coordinates by reusing the same algorithm with a different addendum each time. Throw some arbitrary but consistent four-letter words at it and see what happens.
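To make that concrete, here's a sketch of salted rehashing; the salts, the key, and the filter size are all made up for illustration, and I'm using Digest::SHA's sha1 (which concatenates its list arguments before hashing):

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1);

# Derive k bit positions for one key by rehashing with k different salts.
# sha1($salt, $key) is just sha1($salt . $key) -- the "different hash
# functions" are one function with different prefixes. The salts and
# filter size here are arbitrary.
my @salts       = qw(aaaa bbbb cccc dddd);
my $filter_size = 2**20;    # bits; pick via the optimise() math below

sub bit_positions {
    my ($key) = @_;
    # Take the first 32 bits of each salted digest, mod the filter size.
    return map { unpack( 'N', sha1( $_, $key ) ) % $filter_size } @salts;
}

my @pos = bit_positions('some_id_12345');
print "@pos\n";    # four positions, same every run for the same key
```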
More generally, it seems like a neat approach, but it will require at least two passes to really work. The second pass has to weed out the false positives by rescanning the data set with a simple hash-based tally, restricted to the keys the first pass flagged. That way you can choose the accuracy based on how many false positives you can afford to screen for; even 0.1% would leave you a fairly manageable hash of some 30,000 keys.
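A toy sketch of the two passes, with a deliberately lossy 16-bit signature standing in for the real Bloom filter (small enough that collisions, i.e. false positives, can actually occur) and a tiny made-up data set:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1);

# Stand-in for the Bloom filter: a 16-bit truncated digest, so two
# different IDs can collide and produce a false positive.
sub lossy_sig { unpack( 'n', sha1( $_[0] ) ) }

sub find_duplicates {
    my @ids = @_;

    # Pass 1: flag candidate duplicates with the lossy membership test.
    my ( %seen_sig, @candidates );
    for my $id (@ids) {
        my $sig = lossy_sig($id);
        push @candidates, $id if $seen_sig{$sig}++;
    }

    # Pass 2: rescan with an exact hash-based tally, restricted to the
    # pass-1 candidates, to weed out the false positives.
    my %candidate = map { $_ => 1 } @candidates;
    my %tally;
    for my $id (@ids) {
        $tally{$id}++ if $candidate{$id};
    }
    return grep { $tally{$_} > 1 } keys %tally;
}

my @dupes = sort( find_duplicates(qw(a b c a d b)) );
print "@dupes\n";    # prints "a b"
```

Even if pass 1 flags, say, c through a signature collision, pass 2 tallies it at 1 and drops it.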
Incidentally, I'm no pack() leader, but if I were you I would think about rolling your own using only some bits of Bloom::Filter. It's biased towards working with email addresses, which are very heterogeneous; there may well be regularities in your data set (they are IDs, after all) that make other approaches much more efficient and discriminating than just applying and reapplying the same hashing routine.
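Rolling your own doesn't take much: a vec() bit string plus the salted-sha1 trick is the essence of it. This is only a sketch, not Bloom::Filter's actual internals, and the class name, salts, and sizes are mine:

```perl
use strict;
use warnings;
use Digest::SHA ();

# A bare-bones Bloom filter: a vec() bit string plus salted sha1.
# Sizes and salts are arbitrary for the sketch.
package TinyBloom;

sub new {
    my ( $class, $bits, @salts ) = @_;
    my $self = { bits => $bits, salts => [@salts], vector => '' };
    vec( $self->{vector}, $bits - 1, 1 ) = 0;    # pre-extend the bit string
    return bless $self, $class;
}

sub _positions {
    my ( $self, $key ) = @_;
    return map { unpack( 'N', Digest::SHA::sha1( $_, $key ) ) % $self->{bits} }
        @{ $self->{salts} };
}

sub add {
    my ( $self, $key ) = @_;
    vec( $self->{vector}, $_, 1 ) = 1 for $self->_positions($key);
}

sub check {
    my ( $self, $key ) = @_;
    for my $pos ( $self->_positions($key) ) {
        return 0 unless vec( $self->{vector}, $pos, 1 );
    }
    return 1;    # "maybe" -- false positives are possible
}

package main;

my $filter = TinyBloom->new( 2**16, qw(aaaa bbbb cccc dddd) );
$filter->add($_) for qw(id001 id002 id003);
print $filter->check('id002') ? "maybe\n" : "definitely not\n";    # prints "maybe"
```

Swapping _positions() for something tuned to the shape of your IDs is then a one-sub change.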
update: according to the perl.com article, with 30 million keys and a 0.1% error rate, you need 10 hash functions to get the most memory-efficient filter. For 0.01% it's 13 functions and about 30% more memory, and so on:
my ( $m, $k ) = optimise( 30_000_000, 0.0001 );

print <<"";
memory usage   = $m bits
hash functions = $k

sub optimise {
    my ( $num_keys, $error_rate ) = @_;
    my $lowest_m;
    my $best_k = 1;
    foreach my $k ( 1 .. 100 ) {
        my $m = ( -1 * $k * $num_keys ) /
                log( 1 - ( $error_rate**( 1 / $k ) ) );
        if ( !defined $lowest_m or $m < $lowest_m ) {
            $lowest_m = $m;
            $best_k   = $k;
        }
    }
    return ( $lowest_m, $best_k );
}