Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re: Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix

by gone2015 (Deacon)
on Aug 25, 2009 at 09:06 UTC ( [id://791019] : note . print w/replies, xml ) Need Help??


in reply to Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix

I'm guessing the array is held in rows. I suggest reading your the non-zeros of the matrix into an array of arrays and sorting by column & row number, along these lines (untried, untested):

my @sparse = () ; while (...whatever...) { my $value = ... ; # next value my $col = ... ; # in this column my $row = ... ; # and this row if ($value) { # keep only the non-zeros push @sparse, [$col, $row, $value] ; } ; } ; @sparse = sort { ($$a[0] <=> $$b[0]) || ($$a[1] <=> $$b[1]) || die } + @sparse ;
The array of arrays can now be scanned to get the cumulative column count Ap, and is in the order you require for Ai and Ax.

If there are too many non-zero values to be held in memory, then you have a serious problem -- but you could partition it by reading the input 'n' times, processing a batch of columns each time.

If 'n' is big, then you'll need to consider ways to reduce the size of the array of arrays... but that's going to require a deeper understanding of the original data.

Update: corrected the sort line and added proof of concept code:

use strict ; use warnings ; my @sparse = () ; my $col = 0 ; # emerges from loop set to # of columns my $row = 0 ; # emerges from loop set to # of rows while (<DATA>) { s/\s+/ /g ; s/^ // ; s/ $// ; $col = 0 ; foreach my $val (split / /) { if ($val) { push @sparse, [$col, $row, $val] ; } ; ++$col ; } ; ++$row ; } ; @sparse = sort { ($$a[0] <=> $$b[0]) || ($$a[1] <=> $$b[1]) || die } @ +sparse ; my $Ap = 0 ; print "Ap = $Ap" ; my $c = 0 ; foreach my $a (@sparse) { while ($c < $$a[0]) { print " $Ap" ; ++$c ; } ; ++$Ap ; } ; while ($c++ < $col) { print " $Ap" ; } ; print "\n" ; print "Ai =" ; foreach my $a (@sparse) { print " $$a[1]" ; } ; print "\n" ; print "Ax =" ; foreach my $a (@sparse) { print " $$a[2]" ; } ; print "\n" ; __DATA__ 2 3 0 0 0 3 0 4 0 6 0 -1 -3 2 0 0 0 1 0 0 0 4 2 0 1
output:
Ap = 0 2 5 9 10 12
Ai = 0 1 0 2 4 1 2 3 4 2 1 4
Ax = 2 3 3 -1 4 4 -3 1 2 2 6 1