halley has asked for the wisdom of the Perl Monks concerning the following question:
%items = (
    z => [ qw/one six/ ],
    y => [ qw/two three five/ ],
    x => [ qw/one two five/ ],
    ...
);
Each item's keywords are unique. The arrays could just as easily be keys of hashes, but in this case, they just happen to be stored in arrays.
I can pretty easily scan the set of items for those with the most keywords, or scan for the set of keywords which appear in the most items.
What I would like to do is figure out which pairs of keywords, or triples of keywords, or n-tuples of keywords are shared by the most items. For example, if keyword A is found in ten items, and keyword B is found in twenty items, but only one item has both A and B, then it's a weak candidate. If nine items have both A and B, then it's a great candidate.
In Google terms, think of it this way: from the database of websites, what would be the best four-word query to return the most result links? And then the second-best four-word query? And the third-best four-word query? And so on...
Ideas? Does this align with some obscure or obvious algorithm I couldn't recognize?
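To make the pair case concrete, here is a minimal brute-force sketch (illustration only, using the toy %items above) that tallies how many items contain each pair of keywords:

use strict;
use warnings;

my %items = (
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# For every item, bump a counter for each pair of keywords it contains.
my %pair_count;
for my $item (keys %items) {
    my @kw = sort @{ $items{$item} };
    for my $i (0 .. $#kw - 1) {
        for my $j ($i + 1 .. $#kw) {
            $pair_count{"$kw[$i] $kw[$j]"}++;
        }
    }
}

# Pairs shared by the most items come out first.
for my $pair (sort { $pair_count{$b} <=> $pair_count{$a} } keys %pair_count) {
    print "$pair_count{$pair}\t$pair\n";
}

Extending this beyond pairs is exactly where the combinatorics bite, which is what the replies below wrestle with.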
Update: A test-case-generator function has been added below. Use this data if you'd like to do some benchmarking on your own.
Update 2: A sample output from an early timeline scanner has been added below.
--
[ e d @ h a l l e y . c c ]
Replies are listed 'Best First'.
Re: algorithm for 'best subsets'
by kvale (Monsignor) on Mar 03, 2005 at 01:14 UTC
As for code to implement this measure, it is simple to use HoHs:
Update: Note that this isn't a full correlation calculation, but it implements the OP's desired numbers for a 2-way measure. As tall_man pointed out, I blew it :) Here is the corrected code:
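The corrected code is behind the 'Read more' link; a rough sketch of the hash-of-hashes tally kvale describes (my illustration, not his code, using the root node's toy data) might look like:

use strict;
use warnings;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# $share{$a}{$b} = number of items containing both keyword $a and keyword $b.
my %share;
for my $item (keys %items) {
    my @kw = @{ $items{$item} };
    for my $a (@kw) {
        for my $b (@kw) {
            $share{$a}{$b}++ unless $a eq $b;
        }
    }
}

# e.g. $share{two}{five} is 2 for the toy data: items x and y contain both.
print "$share{two}{five}\n";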
-Mark
by tall_man (Parson) on Mar 03, 2005 at 01:19 UTC
Re: algorithm for 'best subsets'
by BrowserUk (Patriarch) on Mar 03, 2005 at 03:26 UTC
This does all the pairs, triples, quads, quins and sextuplets in 6 seconds, producing 60,000+ lines in the process. The 615,000 lines for all the 2 .. 10 keyword combinations take around 2 1/2 minutes. My simple combinations generator chews a fair bit of memory, though; an iterator would be preferable. It relies on each item being representable by a single byte--which is okay up to 255 items if you map the real names to chars. The output format says that the 6 keywords 'two', 'four', etc. all appeared in each of items a, e, k, r, s, and y. Read more... (6 kB)
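BrowserUk's actual generator and byte-per-item packing are behind the 'Read more' link; the sketch below is mine, using the root node's toy data and plain hash sets instead of his byte strings, and only shows the core step of intersecting the item sets of one keyword combination:

use strict;
use warnings;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# Invert: for each keyword, the set of items that contain it.
my %by_keyword;
for my $item (keys %items) {
    $by_keyword{$_}{$item} = 1 for @{ $items{$item} };
}

# Intersect the item sets of a given combination of keywords.
sub items_sharing {
    my @kw   = @_;
    my %seen = %{ $by_keyword{ shift @kw } || {} };
    for my $k (@kw) {
        %seen = map { $_ => 1 } grep { $by_keyword{$k}{$_} } keys %seen;
    }
    return sort keys %seen;
}

print join(',', items_sharing(qw/two five/)), "\n";   # x,y for the toy data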
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
by Roy Johnson (Monsignor) on Mar 03, 2005 at 18:11 UTC
Jumping on tall_man's idea to use Bit::Vector::Overload (and shamelessly stealing your data generator), here's a new solution. It's reasonably quick (about 15x faster than yours on my slow machine, though a chunk of the difference is printing time) to generate all the tuples and spit them out, nicely ordered by cardinality. There is much less output, because only tuples that actually represent the intersection of some pair of elements are included. When such a tuple is found, the rest of the elements are checked to see if they should be included with it, so that the list for the tuple is complete. Read more... (3 kB)
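Roy's code is behind the 'Read more' link; as a rough illustration of the Bit::Vector::Overload idea -- one vector of keyword bits per item, pairwise AND to get the shared tuple -- here is a sketch of mine, not his solution, on the root node's toy data:

use strict;
use warnings;
use Bit::Vector::Overload;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# Give every keyword a bit position.
my @keywords = do { my %k; $k{$_} = 1 for map { @$_ } values %items; sort keys %k };
my %kpos     = map { $keywords[$_] => $_ } 0 .. $#keywords;

# One bit vector per item, with a bit set for each keyword the item contains.
my %vec;
for my $item (keys %items) {
    my $v = Bit::Vector->new(scalar @keywords);
    $v->Bit_On($kpos{$_}) for @{ $items{$item} };
    $vec{$item} = $v;
}

# The keywords shared by a pair of items are just the AND of their vectors.
my $shared = $vec{x} & $vec{y};
my @tuple  = grep { $shared->bit_test($kpos{$_}) } @keywords;
print "x and y share: @tuple\n";   # five two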
Caution: Contents may have been coded under pressure.
by halley (Prior) on Mar 03, 2005 at 14:41 UTC
I'll be adding more information and test cases shortly.
by fizbin (Chaplain) on Mar 03, 2005 at 15:30 UTC
This problem just exploded. Really. Can you give an estimate on:
Of course, if L is sufficiently large, you're screwed too, since I don't think there's a way to avoid spending at least O(L!/(N!(L-N)!)) time -- that is, on the order of 'L choose N'.
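A throwaway sketch, just to get a feel for how fast that count grows; the 1000-keyword figure is an arbitrary example, not from the thread:

# Number of N-keyword combinations among L distinct keywords: L choose N.
sub choose {
    my ($l, $n) = @_;
    my $c = 1;
    $c = $c * ($l - $_ + 1) / $_ for 1 .. $n;
    return $c;
}
printf "%.0f four-keyword combinations for 1000 keywords\n", choose(1000, 4);  # about 4.1e10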
by BrowserUk (Patriarch) on Mar 03, 2005 at 16:23 UTC
As long as you can represent each keyword by an integer, say its line number in a dictionary file (unsorted, so that you can add new words to the end without remapping everything) or similar, then representing the items as bitvectors--permanently, so that you do not have to regenerate the mappings each time--and ANDing them is going to be the quickest way to do the intersection. In effect, every document (item) would be indexed by a string that is a bitvector representing the (significant) words it contains. If you exclude the usual stop words--those less than 3 chars, plus 'then', 'those', 'some', etc.--then the length of your bitvectors should be reasonable.
The problem comes with the combinatorics. Building an index of keywords to items, so that when you get a set of search terms you only consider the items containing those terms, will reduce the overall problem. You could also try building a secondary index that maps pairs of keywords to the items that contain them; that would rapidly reduce the size of the problem by limiting the combinations that need to be tested (a sketch of such a pair index follows at the end of this reply). But with 2**(words in the English language + proper nouns) possible keyword sets, you would need a huge database.
If the problem were the usual "Given these keywords, find the (topN) documents that contain the most of them", it wouldn't be so bad, but you appear to want to generate some sort of "best keyword sets", which I find confusing. I hate posts that ask me "What is it that you are ultimately trying to achieve?", but given the sheer scale of the numbers you are talking about, I think this may be a good time to ask such a question. Sorry :)
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
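Here is a small sketch of the secondary pair index mentioned above (mine, not BrowserUk's code), again using the root node's toy %items; each keyword pair maps straight to the items that contain both:

use strict;
use warnings;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# Map every keyword pair to the items containing both, built once up front.
my %pair_index;
for my $item (keys %items) {
    my @kw = sort @{ $items{$item} };
    for my $i (0 .. $#kw - 1) {
        for my $j ($i + 1 .. $#kw) {
            push @{ $pair_index{"$kw[$i] $kw[$j]"} }, $item;
        }
    }
}

# Later queries only need to consider items already known to share the pair.
my @hits = @{ $pair_index{'five two'} || [] };
print "items sharing 'five' and 'two': @hits\n";   # x and y, in some order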
Re: algorithm for 'best subsets'
by Roy Johnson (Monsignor) on Mar 03, 2005 at 04:14 UTC
Here's my entry: Read more... (2 kB)
Caution: Contents may have been coded under pressure.
Re: algorithm for 'best subsets'
by Limbic~Region (Chancellor) on Mar 03, 2005 at 03:19 UTC
In between the commercials of Lost and Alias (literally), I came up with the following: Read more... (2 kB)
I think it does what you want but it is quite rough around the edges. I will see about cleaning it up tomorrow. Do you happen to have a bigger sample so that I can benchmark?
Cheers - L~R
Update: Made minor cleanups to code
Re: algorithm for 'best subsets'
by fizbin (Chaplain) on Mar 03, 2005 at 01:32 UTC
I'm a bit curious where this data is coming from.
by fizbin (Chaplain) on Mar 03, 2005 at 19:50 UTC
Unfortunately, it does have a tendency to die of out-of-memory errors if you up either the number of keywords or the average length of a set, and tying %totals to a db file doesn't seem to prevent it, so there must be some other sort of memory leak going on. (I suppose that it could also be the sort exploding, but working around that should be relatively straightforward.)
Re: algorithm for 'best subsets'
by dimar (Curate) on Mar 03, 2005 at 06:42 UTC
Ideas? Does this align with some obscure or obvious algorithm I couldn't recognize?
Hi halley, you may have already seen this one before, but since you asked for an algorithm (either obvious or obscure, but not both?), here is a link to the entry identified as 1.5.1 Clique in The Stony Brook Algorithm Repository. It immediately came to mind upon reading your question. To see how it applies to your scenario, simply picture each 'keyword' as a node in an undirected graph, and each n-tuple combination as a (multi)edge. The clusters with the most connections then represent the 'popular cliques' that you want to identify and single out for special treatment, abuse, retribution, or whatever (just like in high school ;-). Sorry, I've no ready-to-run Perl implementation like those other remarkable chaps, but the link has some useful background for anyone curious.
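There's no Perl in the post above, but the graph formulation is easy to sketch with the Graph module (my illustration, with the root node's toy data; finding the actual maximal cliques is the hard part the Stony Brook link covers):

use strict;
use warnings;
use Graph::Undirected;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# Keywords are vertices; an edge's weight counts the items containing both endpoints.
my $g = Graph::Undirected->new;
for my $item (keys %items) {
    my @kw = sort @{ $items{$item} };
    for my $i (0 .. $#kw - 1) {
        for my $j ($i + 1 .. $#kw) {
            my $w = $g->get_edge_weight($kw[$i], $kw[$j]) || 0;
            $g->set_edge_weight($kw[$i], $kw[$j], $w + 1);
        }
    }
}

# The heaviest edges are the most promising pairs to grow cliques from.
for my $e (sort { $g->get_edge_weight(@$b) <=> $g->get_edge_weight(@$a) } $g->edges) {
    printf "%-6s %-6s shared by %d item(s)\n", @$e, $g->get_edge_weight(@$e);
}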
=oQDlNWYsBHI5JXZ2VGIulGIlJXYgQkUPxEIlhGdgY2bgMXZ5VGIlhGV
Re: algorithm for 'best subsets'
by tall_man (Parson) on Mar 03, 2005 at 09:01 UTC
Update: Revised to collect the groups much better. I believe it will now do what Roy Johnson suggested. I didn't change any hashes to arrays, though. Timing on my machine for a 676-item test case generated by the benchmark program halley provided:
Update2: I also tried a 17,576-item example (with 'kaaa' .. 'kzzz' and 'iaaa' .. 'izzz'). It ran for one hour to find all groups from 2 up to 4 (the maximum available in this case). The timing is consistent with O(I^2 * log K), where I is the item count and K is the keyword count.
Update3: Inner loop optimization -- better ways to test for empty sets (is_empty) and count bits in sets (Norm). Went from one hour to 54 minutes on the biggest case. Read more... (3 kB)
by Roy Johnson (Monsignor) on Mar 03, 2005 at 16:27 UTC
However, I think you figured combinations that weren't what the OP was looking for. You tell us what the intersection is of each tuple of keys: for example, b and c have two and five in common. I think what the OP wanted was the intersection of a tuple of values: two and five appear together in b, c, e, g, and h. If I can figure out what's what in your code, I will see if it can be made to do that. Also, you can use arrays instead of hashes for revipos and revkpos; then you can use @revipos everywhere you use sort keys %items (because that's all it is).
Caution: Contents may have been coded under pressure.
Re: algorithm for 'best subsets'
by halley (Prior) on Mar 03, 2005 at 15:37 UTC
Update: Here's a useful results format:
by fizbin (Chaplain) on Mar 03, 2005 at 16:36 UTC
Of course, when I did that my method blew through all available memory when trying to compute 5-at-a-time. It did manage 4-at-a-time, though, and in under five minutes.
by halley (Prior) on Mar 03, 2005 at 21:19 UTC
In practice, I can slice the %Items and the %Keywords up a bit, and do smaller overlapping datasets. I can also farm the work out to multiple machines on these slices. More on what that is tonight; I've been stuck in meetings today so haven't had the chance to really explain what all this is about.
by Limbic~Region (Chancellor) on Mar 03, 2005 at 21:27 UTC
by fizbin (Chaplain) on Mar 03, 2005 at 22:57 UTC
by BrowserUk (Patriarch) on Mar 03, 2005 at 22:04 UTC
Re: algorithm for 'best subsets'
by halley (Prior) on Mar 04, 2005 at 02:17 UTC
I'm working on reviving a personal project I started over twenty years ago, back in high school. I like to read and study timelines. That is, graphical maps which give some sort of contextual meaning to a set of events, by their ordering and relative pacing. So, as a quick proof of concept test, I have scraped a century's worth of Wikipedia pages which are organized by date. I make a node for each event that I scrape up. The event's keywords are naively assembled from the words that appear in the one-or-two-sentence summary of the event.
So, now that I have a huge database of events, I'd like to find out the historical context. For example, if I found that [ 'fyodor', 'dostoevsky' ] happens to be found in a useful number of events, I might want to make a sub-line that includes all his events. With a rich enough database, someone reading the resulting timeline might connect his trial to his most recent publications.
It's relatively simple to look for pairs that are always together. The goal is to cover the inevitable holes by looking at constellations of keywords. For example, "Fyodor Dostoevsky" may appear together many times, but should I only map out events that explicitly mention his first name? What if "Dostoevsky" also appears with "author" and "Russian" on a regular basis? Then the database can hint to me that "Fyodor" and "poet" may also be an appropriate association. I can then investigate, hand-tune, and save the interesting queries as important sub-timelines.
What's even more interesting is to show multiple, seemingly unconnected sub-timelines. Fyodor was tried in Russia. Who was the Czar during that period? Who was head of state in Poland and France? Was this before or after Harriet Beecher Stowe's "Uncle Tom's Cabin"?
Given the potentially huge database, and the nature of historical influences, I know that I will have to work with only a few years at a time, or a few words at a time, or both. Solutions which use big memory are going to crash. Solutions which use little memory and can handle larger datasets, even if they run for days, are clearly winners in this application.
I've been quite pleased with the involvement of the community here. I tossed out the question on a whim, and have been too busy today to reply to all your helpful responses as fast as I'd like. I'll definitely be toying with multiple approaches here to find the best balance and data analysis capabilities. An early positive result from my timeline query mechanism (you may need to adjust for wider output):
Eventually, I'll want to approach the Wikimedia folks with some results from their database, but the capability works for any sort of event timeline, from minute-by-minute tracking of space mission procedures, to the astronomic ages which explain how the Earth formed from star stuff.
by tall_man (Parson) on Mar 04, 2005 at 18:04 UTC
The module Graph::UnionFind implements it. Here is an example.
Update: Fixed a bug in the code, and added code to separate out the partitions and count the length of each. Read more... (3 kB)
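tall_man's actual example is behind the 'Read more' link; what follows is my own minimal sketch of the union-find idea, not his code: partition keywords that ever co-occur in an item, using the root node's toy data.

use strict;
use warnings;
use Graph::UnionFind;

my %items = (   # toy data from the root node
    z => [qw/one six/],
    y => [qw/two three five/],
    x => [qw/one two five/],
);

# Union every keyword with the other keywords of each item it appears in.
my $uf = Graph::UnionFind->new;
for my $item (keys %items) {
    my @kw = @{ $items{$item} };
    $uf->add($_) for @kw;
    $uf->union($kw[0], $_) for @kw[1 .. $#kw];
}

# Collect the partitions: keywords grouped under their representative element.
my %partition;
my %all_kw = map { $_ => 1 } map { @$_ } values %items;
push @{ $partition{ $uf->find($_) } }, $_ for sort keys %all_kw;

for my $rep (sort keys %partition) {
    my @members = @{ $partition{$rep} };
    print scalar @members, " keyword(s): @members\n";
}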
by BrowserUk (Patriarch) on Mar 04, 2005 at 19:59 UTC
Could you explain the format of %union_data?
by tall_man (Parson) on Mar 04, 2005 at 20:30 UTC
by BrowserUk (Patriarch) on Mar 04, 2005 at 21:11 UTC
by halley (Prior) on Mar 05, 2005 at 04:16 UTC
I understand Bit::Vector, and making vectors that are the width of the keyword table, and marking the keyword bits for a given item. You don't seem to need the first gang of bit vectors at all; you are marking item bits in item-wide vectors, then never using those vectors. I understand what I read in Graph::UnionFind, though not its internal algorithm, but I might not need to. I think I understand the output it should give. What I don't understand is the way you're trying to combine these methods, and second-guessing Graph::UnionFind's results on each loop. It seems to me that I can use G::UF without bit vectors at all, adding edges between correlated keywords, and then scanning each partition to see which keywords it contains. Can you speak more to your reasoning?
by tall_man (Parson) on Mar 05, 2005 at 15:12 UTC
by halley (Prior) on Mar 05, 2005 at 15:57 UTC
Re: algorithm for 'best subsets'
by BrowserUk (Patriarch) on Mar 05, 2005 at 15:19 UTC
Update: Indeed, the same holds true for 50 other settings of srand.
Halley, I think you will have to improve your dataset generator. Currently, with srand = 12345 and the rand function on my platform, it doesn't produce a single pair of items that share more than one keyword. As you can see below, with the exception of two partitions which contain only a single item each, the maximum number of keywords shared by items in each partition is 1: (Output right-truncated for posting)
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.