Re: RFC on Inline::C hack: Hash_Iterator

Re^2: RFC on Inline::C hack: Hash_Iterator by tlm (Prior) on Aug 01, 2005 at 12:55 UTC
"...and where can I find the documentation to HeNEXT?"
`HeNEXT` is a macro (surprise!), defined in hv.h as `#define HeNEXT(he) (he)->hent_next` where `hent_next` is a field in the following struct (also from hv.h): `typedef struct he HE; /* ... */ /* entry in hash value chain */ struct he { HE *hent_next; /* next entry in chain */ HEK *hent_hek; /* hash key */ SV *hent_val; /* scalar value that was hashed */ };` So basically `HeNEXT` scoots along a linked list of hash entries (those belonging to a given bucket, to be precise). As you can see, the data structure only provides for travel in one direction. Therefore, I don't see an easy way to implement a `prev` method. If I had to have it, I think I'd just accept the memory hit and use the `keys` array-based implementation I described briefly in my OP. Then `prev` reduces to decrementing a cursor variable.
the lowliest monk
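To make the one-way constraint concrete, here is a minimal standalone sketch in plain C. It deliberately does not include Perl's headers: `entry_t`, `ENTRY_NEXT`, `chain_length`, and `key_cursor_t` are hypothetical names standing in for `HE`, `HeNEXT`, and the `keys`-array idea, so treat it as an illustration of the shape of the problem rather than code from the OP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Perl's HE: a singly linked bucket entry. */
typedef struct entry entry_t;
struct entry {
    entry_t    *next;  /* next entry in the same bucket, like hent_next */
    const char *key;
};

/* Same shape as HeNEXT: it can only step forward. */
#define ENTRY_NEXT(e) ((e)->next)

/* Count the entries in one chain by walking it front to back. */
static size_t chain_length(const entry_t *e)
{
    size_t n = 0;
    for (; e != NULL; e = ENTRY_NEXT(e))
        n++;
    return n;
}

/* Array-backed cursor: costs one pointer per key up front,
   but stepping backwards is just a decrement. */
typedef struct {
    const char **keys;
    size_t       count;
    size_t       pos;   /* index of the key that cursor_next would return */
} key_cursor_t;

/* Return the key at the cursor and advance, or NULL at the end. */
static const char *cursor_next(key_cursor_t *c)
{
    assert(c != NULL);
    return c->pos < c->count ? c->keys[c->pos++] : NULL;
}

/* Step back one slot and return that key, or NULL at the start. */
static const char *cursor_prev(key_cursor_t *c)
{
    assert(c != NULL);
    return c->pos > 0 ? c->keys[--c->pos] : NULL;
}
```

With this convention, calling `cursor_prev` right after `cursor_next` hands back the same key again, which is how most bidirectional cursors behave. Nothing comparable can be written on top of `ENTRY_NEXT` alone without rescanning the chain from its head.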
Re^3: RFC on Inline::C hack: Hash_Iterator by Tanalis (Curate) on Aug 02, 2005 at 07:19 UTC
"So basically HeNEXT scoots along a linked list of hash entries (those belonging to a given bucket, to be precise)."
Well.. you could always make use of a doubly-linked list to get around this. For the sake of the storage of a second pointer in your structure, this would seem a reasonable trade-off to give a more flexible iteration method. Just on a scalability note, and to raise awareness of a possible stumbling block if this code were to be used as anything other than an iterator, I'm always wary of a hash bucket containing a linked list. In this example, there's little to no issue - it's designed to be an iterator, and hence the list will always be traversed from start to end. For (most) implementations, however, where more random access of hashed elements is required, an iterative method is really quite inefficient, requiring up to time `const + N` to look up any entry. In these cases, it's often better to use something akin to a binary tree, offering at worst time `const + log N` to find any entry. For true flexibility, a linked list threaded through a binary tree has been my structure of choice for a while - offering a sane method of iterating through the structure while retaining a reasonable random-element-access time. Nice work - ++tlm.
-- Foxcub `#include www.liquidfusion.org.uk`
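As a rough illustration of the "linked list threaded through a binary tree" idea, the sketch below keeps two sets of links in each node: tree links for lookup and a `next` link for iteration. All names (`tnode_t`, `threaded_tree_t`, `tt_insert`, `tt_find`) are hypothetical, integer keys are assumed for brevity, and the tree is left unbalanced, so the `log N` bound only holds if a balancing scheme is added on top.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tnode tnode_t;
struct tnode {
    int      key;
    tnode_t *left, *right;  /* binary search tree links, used for lookup */
    tnode_t *next;          /* insertion-order list link, used for iteration */
};

typedef struct {
    tnode_t *root;          /* entry point for lookups */
    tnode_t *head, *tail;   /* ends of the threaded list */
} threaded_tree_t;

/* Link a caller-owned node into both the tree and the list.
   Returns 1 on insertion, 0 if the key is already present. */
static int tt_insert(threaded_tree_t *t, tnode_t *n, int key)
{
    assert(t != NULL && n != NULL);
    tnode_t **slot = &t->root;
    while (*slot != NULL) {
        if (key == (*slot)->key)
            return 0;
        slot = key < (*slot)->key ? &(*slot)->left : &(*slot)->right;
    }
    n->key = key;
    n->left = n->right = n->next = NULL;
    *slot = n;              /* hang the node off the tree */
    if (t->tail != NULL)    /* and append it to the list */
        t->tail->next = n;
    else
        t->head = n;
    t->tail = n;
    return 1;
}

/* Look a key up through the tree links only. */
static tnode_t *tt_find(const threaded_tree_t *t, int key)
{
    assert(t != NULL);
    tnode_t *n = t->root;
    while (n != NULL && n->key != key)
        n = key < n->key ? n->left : n->right;
    return n;
}
```

Iteration is then a plain walk from `head` along `next`, independent of the tree's shape. The cost is three pointers per node, which is the trade-off being weighed against a single chain pointer.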
Re^4: RFC on Inline::C hack: Hash_Iterator by demerphq (Chancellor) on Aug 02, 2005 at 11:03 UTC
"Well.. you could always make use of a doubly-linked list to get around this. For the sake of the storage of a second pointer in your structure, this would seem a reasonable trade-off to give a more flexible iteration method."
The structures you are talking about are defined by Perl itself. It's not a design decision available to anybody but the pumpkings, and it's unlikely they would appreciate the cost given the minimal benefits it would provide.
"For (most) implementations, however, where more random access of hashed elements is required, an iterative method is really quite inefficient, requiring up to time `const + N` to look up any entry. In these cases, it's often better to use something akin to a binary tree, offering at worst time `const + log N` to find any entry."
I'm not sure I agree with this analysis. The linked lists used for buckets in Perl's hashes are intended to be extremely small, i.e., generally they should hold only one element, and except for degenerate cases should not really exceed two elements. With this in mind a binary tree approach makes less sense, as in most cases you will derive no benefit from it at all. Perl's hashes don't allow for duplicate keys, which means that buckets only contain multiple entries when there are hash-key collisions. Such collisions should be unusual, and overly long bucket chains will be redistributed to other buckets when a resize event occurs, and iirc overly long bucket chains are precisely the determinant for such resize events.
--- $world=~s/war/peace/g
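The argument that chains stay short as long as the table resizes can be sketched with a toy chained hash table. This is not Perl's hv.c: the names (`htable_t`, `ht_insert`, `ht_grow`, `ht_longest_chain`), the multiplicative hash, and the growth rule (double the buckets once keys outnumber them) are all assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct hnode { struct hnode *next; unsigned key; } hnode_t;

typedef struct {
    hnode_t **buckets;
    size_t    nbuckets;  /* always a power of two */
    size_t    nkeys;
} htable_t;

/* Multiplicative hash; any well-mixing function would do here. */
static size_t ht_index(unsigned key, size_t nbuckets)
{
    assert((nbuckets & (nbuckets - 1)) == 0);
    return (size_t)((key * 2654435761u) >> 7) & (nbuckets - 1);
}

static int ht_init(htable_t *h)
{
    h->nbuckets = 8;
    h->nkeys = 0;
    h->buckets = calloc(h->nbuckets, sizeof *h->buckets);
    return h->buckets != NULL;
}

/* Double the bucket array and redistribute every entry. */
static int ht_grow(htable_t *h)
{
    size_t newn = h->nbuckets * 2;
    hnode_t **nb = calloc(newn, sizeof *nb);
    if (nb == NULL)
        return 0;
    for (size_t i = 0; i < h->nbuckets; i++) {
        hnode_t *e = h->buckets[i];
        while (e != NULL) {
            hnode_t *next = e->next;
            size_t j = ht_index(e->key, newn);
            e->next = nb[j];   /* push onto the new chain */
            nb[j] = e;
            e = next;
        }
    }
    free(h->buckets);
    h->buckets = nb;
    h->nbuckets = newn;
    return 1;
}

static hnode_t *ht_find(const htable_t *h, unsigned key)
{
    hnode_t *e = h->buckets[ht_index(key, h->nbuckets)];
    while (e != NULL && e->key != key)
        e = e->next;
    return e;
}

/* Insert a key if absent. Returns 0 only on allocation failure. */
static int ht_insert(htable_t *h, unsigned key)
{
    if (ht_find(h, key) != NULL)
        return 1;              /* no duplicate keys */
    hnode_t *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    size_t i = ht_index(key, h->nbuckets);
    e->key = key;
    e->next = h->buckets[i];
    h->buckets[i] = e;
    h->nkeys++;
    if (h->nkeys > h->nbuckets)
        ht_grow(h);            /* if growing fails, chains just get longer */
    return 1;
}

/* Length of the longest chain, to check how short buckets really are. */
static size_t ht_longest_chain(const htable_t *h)
{
    size_t longest = 0;
    for (size_t i = 0; i < h->nbuckets; i++) {
        size_t n = 0;
        for (const hnode_t *e = h->buckets[i]; e != NULL; e = e->next)
            n++;
        if (n > longest)
            longest = n;
    }
    return longest;
}
```

Because the average chain length is held at or below one entry per bucket, the linear scan in `ht_find` usually touches very few nodes, which is why swapping each chain for a tree would rarely pay for its extra pointers. Calling `ht_longest_chain` after a bulk insert is an easy way to see how a given key set actually distributes.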
Re^5: RFC on Inline::C hack: Hash_Iterator by Tanalis (Curate) on Aug 02, 2005 at 11:51 UTC
Re^6: RFC on Inline::C hack: Hash_Iterator by demerphq (Chancellor) on Aug 02, 2005 at 12:58 UTC


	PerlMonks
