I'm binning data into segments and then want to retrieve a subset of the data from a run of sequential bins. The data goes into the system fine, but not every bin has multiple values, so when I iterate through the results after setting the cursor with get_dup, the cursor returns only keys that have multiple values.
The code below illustrates the problem. Is this a limitation (or feature) of DB_File/BDB 1.x that would be better addressed with the BerkeleyDB interface and its more robust cursors?
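use DB_File;
use Fcntl;

# allow duplicate keys; _compare is my key-comparison sub (definition not shown)
$DB_BTREE->{'flags'} = R_DUP;
$DB_BTREE->{'compare'} = \&_compare;
my %btree;
my $bhandle = tie %btree, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_BTREE;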
my $len = 26;
my @array = ( 'a'..'z' );
foreach ( 1..$len ) {
    $btree{$_} = shift @array;
}
# add a second value to each so that each key has duplicate values
@array = ( 'A'..'Z' );
foreach ( 1..$len ) {
    $btree{$_} = shift @array;
}
# test to see that each key is printed from 20 to the end
my ($k, $v);
my @v = $bhandle->get_dup(20);
print "v is @v 20\n";
while( $bhandle->seq($k, $v, R_NEXT) == 0 ) {
    my @v = $bhandle->get_dup($k);
    print "$k @v\n";
}
# now associate a single value with a key
$btree{22.5} = 'HHI';
# test to see that each key is printed from 20 to the end
@v = $bhandle->get_dup(20);
print "v is @v 20\n";
while( $bhandle->seq($k, $v, R_NEXT) == 0 ) {
    my @v = $bhandle->get_dup($k);
    print "$k @v\n";
}
# 22.5 does not show up
# add a second value for 22.5
$btree{22.5} = 'JKL';
# test to see that each key is printed from 20 to the end
@v = $bhandle->get_dup(20);
print "v is @v 20\n";
while( $bhandle->seq($k, $v, R_NEXT) == 0 ) {
    my @v = $bhandle->get_dup($k);
    print "$k @v\n";
}
# now 22.5 is in the list
The best workaround I have thought of is to dump all the keys, find the bin closest to where I want to start in O(log n) (since the list will be sorted), and walk through the list until reaching the end boundary condition, calling get_dup on each key in the subset (which still works if only one value is stored for the key).
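A rough sketch of that workaround, to make the idea concrete. It assumes numeric bin keys; the helper names `lower_bound` and `keys_in_range` are made up for illustration, and the key list here is a plain array standing in for the dump of the tied hash, so nothing in it touches DB_File itself.

```perl
use strict;
use warnings;

# Index of the first element in @$sorted that is >= $target, or
# scalar(@$sorted) if there is none. Plain binary search, O(log n).
sub lower_bound {
    my ($sorted, $target) = @_;
    my ($lo, $hi) = (0, scalar @$sorted);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($sorted->[$mid] < $target) { $lo = $mid + 1 }
        else                           { $hi = $mid }
    }
    return $lo;
}

# Distinct keys k with $start <= k <= $end, in numeric order.
# Duplicates are dropped because a key with several values can show up
# once per stored value in a key dump. The sort is only a safety net;
# it is redundant if the keys already arrive in numeric order.
sub keys_in_range {
    my ($keys, $start, $end) = @_;
    my %seen;
    my @sorted = sort { $a <=> $b } grep { !$seen{$_}++ } @$keys;
    my @subset;
    for (my $i = lower_bound(\@sorted, $start);
         $i < @sorted && $sorted[$i] <= $end; $i++) {
        push @subset, $sorted[$i];
    }
    return @subset;
}

# Stand-in key dump: 1..26 twice (two values each) plus a single 22.5.
my @all_keys = (1 .. 26, 1 .. 26, 22.5);
my @bins = keys_in_range(\@all_keys, 20, 24);
print "@bins\n";
```

In the real script the key list would presumably come from `keys %btree`, and each key returned by `keys_in_range` would then be passed to `$bhandle->get_dup($key)` to collect its values, whether the bin holds one value or several.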