Re: tying a hash from a big dictionary

by repellent (Priest)
on Nov 01, 2011 at 09:23 UTC


in reply to tying a hash from a big dictionary

You can try accessing the dictionary file directly using the Search::Dict core module, assuming your dictionary is sorted. It performs a binary search through the file. Here, I've wrapped its functionality into an OO-module for convenience:
    use Data::Dumper;
    use Search::Dict::Object;

    my $d = Search::Dict::Object->new(
        file        => "/tmp/dict.txt",
        keyval_xfrm => sub { split /\t/ },
        comp        => sub { $_[0] cmp $_[1] },  # should correspond to file sort order
    );

    print Dumper {
        aaa => $d->get('aaa'),
        foo => $d->get('foo'),
        bar => $d->get('bar'),
        baz => $d->get('baz'),
        zzz => $d->get('zzz'),
    };

    __END__
    $VAR1 = {
              'bar' => '789',
              'baz' => '456',
              'aaa' => undef,
              'foo' => '123',
              'zzz' => undef
            };

The dictionary file:
    $ cat /tmp/dict.txt
    aho	234
    bar	789
    bat	567
    baz	456
    cut	678
    foo	123
    yyy	000

The Search::Dict::Object package:
    package Search::Dict::Object;

    use warnings;
    use strict;

    use Search::Dict ();

    sub new {
        my $class = shift;
        my $self = bless { }, $class;
        $self->_init(@_);
        return $self;
    }

    sub _init {
        my $self = shift;
        %{ $self } = @_;

        unless (defined($self->{FH})) {
            my $file = $self->{file} || "<unspecified file>";
            open($self->{FH}, "<", $file)
                or die("Cannot open: $file", "\n ", $!);
        }
    }

    sub get {
        my ($self, $key) = @_;
        return undef unless defined($key);

        my $FH = $self->{FH};
        my $comp = $self->{comp} || sub { $_[0] cmp $_[1] };
        my $keyval_xfrm = $self->{keyval_xfrm} || sub { $_ => $_ };

        my $opts = {
            comp => $comp,
            $self->{keyval_xfrm}
                ? (xfrm => sub {
                      chomp($_[0]);
                      (map $keyval_xfrm->(), $_[0])[0]
                  })
                : (),
        };

        if (Search::Dict::look($FH, $key, $opts) != -1) {
            my $entry = <$FH>;
            return undef unless defined($entry);
            chomp($entry);

            my ($k, $v) = map $keyval_xfrm->(), $entry;
            return $v if $comp->($k, $key) == 0;
        }

        return undef;
    }

    sub DESTROY {
        my $self = shift;
        close($self->{FH}) if $self->{FH};
    }

    1;
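Since the original question was about tying a hash, here is a rough, untested-at-scale sketch of how the same lookup might sit behind a read-only tied hash. The package name Tie::SortedDictFile is made up for illustration (it is not a CPAN module), and the sketch assumes a file of tab-separated key/value lines sorted in plain string order. It relies on the same hash-ref options form of Search::Dict::look (the xfrm key) used above.

```perl
use strict;
use warnings;

{
    package Tie::SortedDictFile;    # hypothetical name, not a CPAN module
    use Search::Dict ();

    # tie my %h, 'Tie::SortedDictFile', $path;
    sub TIEHASH {
        my ($class, $file) = @_;
        open(my $fh, "<", $file) or die "Cannot open $file: $!";
        return bless { fh => $fh }, $class;
    }

    # Binary-search the file, then confirm the line found is an exact match.
    sub FETCH {
        my ($self, $key) = @_;
        return undef unless defined $key;
        my $fh = $self->{fh};

        # Compare only the key field (the text before the first tab).
        my $key_of = sub { defined $_[0] ? (split /\t/, $_[0], 2)[0] : undef };
        return undef
            if Search::Dict::look($fh, $key, { xfrm => $key_of }) == -1;

        my $entry = <$fh>;
        return undef unless defined $entry;    # key sorts after every line
        chomp $entry;
        my ($k, $v) = split /\t/, $entry, 2;
        return (defined $k && $k eq $key) ? $v : undef;
    }

    sub EXISTS { defined $_[0]->FETCH($_[1]) }
    sub STORE  { die "dictionary is read-only\n" }
    sub DELETE { die "dictionary is read-only\n" }
}

# Usage: write a small sorted dictionary to a temp file and tie it.
use File::Temp qw(tempfile);
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "$_\n"
    for "aho\t234", "bar\t789", "bat\t567", "baz\t456",
        "cut\t678", "foo\t123", "yyy\t000";
close $out;

tie my %dict, 'Tie::SortedDictFile', $path;
print "foo => $dict{foo}\n";
print "aaa is ", (exists $dict{aaa} ? "present" : "absent"), "\n";
```

Iteration (FIRSTKEY/NEXTKEY) is deliberately left out, so keys %dict will not work with this sketch; only single-key lookups are covered.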

Replies are listed 'Best First'.
Re^2: tying a hash from a big dictionary
by BrowserUk (Patriarch) on Nov 01, 2011 at 09:39 UTC
    I've wrapped its functionality into an OO-module for convenience:

    "I've wrapped your bicycle in tissue paper and a nice bow." -- but it sure ain't for "convenience" :)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I find it convenient to have a single transform sub that produces the key-value pair for the object to search+parse a hash-like dict file. Handling/closing of filehandle is really just cake icing.

      Search::Dict sets the filehandle position to the first line greater than or equal to $key. That seems pretty raw to me (read: I should probably write a wrapper that takes care of the edge cases). The OO stick is not always the first thing I reach for, in case you're wondering.
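To make the "raw" behaviour concrete, here is a minimal sketch of calling Search::Dict::look directly. The helper name first_line_at_or_after and the temp-file setup are my own for illustration; the sample data assumes the tab-separated dictionary from earlier in the thread. Note that look only positions the filehandle, so the caller still has to read the line and decide whether it is actually a match.

```perl
use strict;
use warnings;
use Search::Dict ();
use File::Temp qw(tempfile);

# Hypothetical helper: return the first line at or after $key, or undef.
sub first_line_at_or_after {
    my ($fh, $key) = @_;

    # look() returns -1 on error; otherwise it leaves $fh positioned at
    # the first line that sorts greater than or equal to $key.
    return undef if Search::Dict::look($fh, $key) == -1;

    my $line = <$fh>;
    return undef unless defined $line;    # $key sorts after every line
    chomp $line;
    return $line;
}

# Build a small sorted, tab-separated dictionary to search.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "$_\n"
    for "aho\t234", "bar\t789", "bat\t567", "baz\t456",
        "cut\t678", "foo\t123", "yyy\t000";
close $out;

open(my $fh, "<", $path) or die "Cannot open $path: $!";

# 'bar' is present, 'bb' is absent (lands on the next entry),
# 'zzz' sorts past the end of the file.
for my $key (qw(bar bb zzz)) {
    my $line = first_line_at_or_after($fh, $key);
    printf "%s => %s\n", $key, defined $line ? $line : "(past end of file)";
}
```

The two edge cases are the ones a wrapper has to handle: a missing key silently yields the next entry in sort order, and a key beyond the last line yields nothing at all.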
