Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re: decomposing binary matrices

by BrowserUk (Patriarch)
on Feb 16, 2007 at 20:39 UTC ( [id://600519]=note: print w/replies, xml ) Need Help??


in reply to decomposing binary matrices

At the risk of wasting your time a second time, I'm pretty sure that this method works. The method is based upon a half remembered technique for simplifying complex boolean conditions that I learnt at college and have never used since. You draw up a truth table for states and then swap whole columns or whole rows until the true values group together. I seem to recall that it didn't matter how many times you swapped things around so long as you always swapped whole columns or whole rows at a time

My POC below basically consists of sorting the matrix both horizontally and vertically, and then picking out the sets with a equal number of 'leading zeros'. The problem I had was tracking the positions the 1's came from. The proper way would be to use a matrix of anonymous arrays (or objects), each containing the row, column and boolean value. For simplicity, I used arrays of strings, replacing each '1' by the letter of it's starting column, and appending the row numbers (starting at 1) to each string. I then sort these strings, then tranforms the 'matrix' into another set of strings and sort again. AT this point you extract and retain the 'row numbers' item from the array.

What you end up with is the smallest group (least 1s/most 0s) sorted to the top. You count the number of leading zeros in group (min of the first two items) and select all the items with at least that number of leading zeros and place them into the first group. Then get the min leading zeros of the first two of the remaining items, and repeat. The final group will be the 'leftovers'.

The following shows the process step by step for the OP example.

c:\test>600418-2.pl This input 00CD01 A0CDE2 0B0D03 AB00E4 0BCD05 sorted 00CD01 0B0D03 0BCD05 A0CDE2 AB00E4 Tranformed looks like this 000AA 0BB0B C0CC0 DDDD0 000EE 13524 sorted 000AA 000EE 0BB0B 13524 C0CC0 DDDD0 These are the sets where the letters denote the columns of the origina +l matrix (0 mean column not used in this set). And the numbers above, the rows they came from. 13524 ## Column numbers from the original data. Trim leading zeros. 000AA ## Read this as; Group one contains 000EE ## columns A & E from rows 2 & 4. 13524 ## Leftovers group. Trim trailing columns used by earlier group +s. 0BB0B ## Columns B, C, & D from Rows 1, 3 & 5. C0CC0 DDDD0

This works as is for the other examples you've supplied (see commented out data blocks).

I do remember from those far off college days that for some complex examples, it was possible for the grouping to 'wrap over the edges'. That is, if you draw the matrix on paper, and wrap it into a cylinder to bring the left and right edges together, a group could cross the boundary. The same was possible for the top & bottom edges.

I think that this would be catered for by repeating the sort/transform/sort and select process on the smaller groups, but I haven't taken it that far yet. Unless your matrices are huge, repeating the process to subdivide the smaller groups shouldn't be a problem.

#! perl -slw use strict; sub xform { my @out; for my $in ( @_ ) { $out[ $_ ] .= substr $in, $_, 1 for 0 .. length( $in ) -1; } return @out; } my @grid = ( "00CD01", "A0CDE2", "0B0D03", "AB00E4", "0BCD05", ); #my @grid=( "000DEF1", "000DEF2", "000DE03", "ABCDEF4", "ABCDEF5", "AB +0DEF6", ); #my @grid=( "000DE01", "000D0F2", "0000EF3", "AB00004", "A0C0005", "AB +00006", ); #my @grid=( "0B0D001", "000DE02", "0B00E03", "A0000F4", "A0C0005", "A0 +000F6", ); print "This input\n"; print "\t$_" for @grid; @grid = sort @grid; print "sorted\n"; print "\t$_" for @grid; my @xformed = xform @grid; print "\nTranformed looks like this\n"; print "\t$_" for @xformed; @xformed = sort @xformed; print "sorted\n"; print "\t$_" for @xformed; ## extra column label item. my( $label ) = grep /1/, @xformed; @xformed = grep !/1/, @xformed; my @subsets; while( @xformed ) { my $mask = $xformed[ 0 ] | $xformed[ 1 ]; if( $mask =~ m[(^0+)] ) { my $count = length( $1 ); ## Length of common zero prefix push @{ $subsets[ @subsets ] }, shift( @xformed ), shift( @xfo +rmed ); while( @xformed and $xformed[ 0 ] =~ m[(^0+)] and length( $1 ) == $count ) { push @{ $subsets[ -1 ] }, shift @xformed; } } else { push @subsets, []; push @{ $subsets[ -1 ] }, shift @xformed while @xformed; } } print <<'EOS'; These are the sets where the letters denote the columns of the origina +l matrix (0 mean column not used in this set). And the numbers above, the rows they came from. EOS print join "\n", $label, @$_, "\n" for @subsets;

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: decomposing binary matrices
by hv (Prior) on Feb 16, 2007 at 22:56 UTC

    I think that approach has possibilities, though it might need some modification - there is no guarantee that the row with fewest bits is separable. Consider, say:

    A: 1 1 0 0 0 B: 1 0 1 1 0 C: 1 0 1 1 0 D: 1 0 1 1 0 E: 1 1 1 1 1
    .. from which {B, C, D} is a separable submatrix. I'd have to think further whether I can easily construct an example that defeats the approach on both rows and columns simultaneously - it'd certainly need a larger matrix (at least 7x7 I think), but I think the same general idea could throw up a proper counterexample.

    At the risk of wasting your time ...

    Not a waste by any means - many of the suggestions so far could maybe be adaptable to a full solution. It's just a question of whether the adapted version would end up any quicker than going brute force in the first place. :)

    Hugo

      I just ran that example through the code I posted (Note: My code uses letters for columns not rows as you have). The groupings produced:

      23415 0000E 000BB 23415 AAAAA CCC0C DDD0D

      gives rows 1 & 5 (your A & E), as a group which when subtracted from the remainer leaves rows 2, 3 & 4 (your {BCD}) as a group which is what you are after?

      Full output from the unchanged program (except for inserting the new data (first line below)):

      # my @grid = ( "AB0001","A0CD02","A0CD03","A0CD04","ABCDE5" ); c:\test>600418-2.pl This input AB0001 A0CD02 A0CD03 A0CD04 ABCDE5 sorted A0CD02 A0CD03 A0CD04 AB0001 ABCDE5 Tranformed looks like this AAAAA 000BB CCC0C DDD0D 0000E 23415 sorted 0000E 000BB 23415 AAAAA CCC0C DDD0D These are the sets where the letters denote the columns of the origina +l matrix (0 mean column not used in this set). And the numbers above, the rows they came from. 23415 0000E 000BB 23415 AAAAA CCC0C DDD0D

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Sorry, I transformed the data already, and included more than necessary. Try this instead (in which, again, {B, C, D} yield a separable submatrix):

        my @grid = ( "ABCD1", "A0002", "0BCD3", "0BCD4");

        Hugo

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://600519]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2024-04-20 05:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found