%hash (@array && use constant) in Modules

by abaxaba (Hermit)
on Apr 23, 2002 at 21:23 UTC

abaxaba has asked for the wisdom of the Perl Monks concerning the following question:

We got into a discussion during a recent 'mongers meeting about hashes, arrays, constants, speed, and autovivification. While this has been touched on recently, I continue the discussion. When constructing modules, I prefer to:
    package Foo;
    use constant BAR      => 1;
    use constant BAZ      => 2;
    use constant SO_ON    => 3;
    use constant SO_FORTH => 4;

    sub new {
        return bless([], shift);
    }

    sub getBar {
        my $self = shift;
        $self->[BAR];
    }

I picked up this style from Damian's OOP book, and like it perfectly well, except for the non-automagic exportation of the use constant subs, especially with @Bar::ISA = qw(Foo).

Part of it is laziness: typing "[ ]" as opposed to "{ }", which requires a shift key. Also, I was taught early in my career that array lookups are faster than perl's hashing and table-lookups. And I don't need to get into the autoviv. of hashes.

So I'm curious: What are your preferences? Is the speed issue that much of an issue? And, more importantly, by using this approach, what sort of pitfalls should I be on the lookout for? Thanks!

Replies are listed 'Best First'.
Re: %hash (@array && use constant) in Modules
by Juerd (Abbot) on Apr 23, 2002 at 22:10 UTC

    except for the non-automagic exportation of the use constant subs

    Why would you want them exported? It's easier to write accessors, or use hash keys. To export all constants (well, to export all all-uppercase-subs), you could use:

        use Devel::GetSymbols;
        @EXPORT = grep /^[A-Z0-9_]+$/, Devel::GetSymbols::Subs;

    Part of it is laziness: typing "[ ]" as opposed to "{ }", which requires a shift key.

    I still find $foo{bar} easier to type than $foo[BAR]. Hashes are better for lazy people: you don't have to define a new constant every time you want to store more, and if you don't care much about style, you can skip some accessor methods too. Hashes dump better too, in case you need to debug.
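The remark about dumping can be seen with the core Data::Dumper module. This is a minimal sketch; the field names and values are made up for illustration and are not taken from any real class:

```perl
use strict;
use warnings;
use Data::Dumper;

# Field indices for the array-based layout (hypothetical names).
use constant BAR => 0;
use constant BAZ => 1;

$Data::Dumper::Sortkeys = 1;   # stable key order for the hash dump
$Data::Dumper::Indent   = 1;   # compact indentation

my $array_obj = [];
$array_obj->[BAR] = 'red';
$array_obj->[BAZ] = 'green';

my $hash_obj = { bar => 'red', baz => 'green' };

my $array_dump = Dumper($array_obj);   # values only, no field names
my $hash_dump  = Dumper($hash_obj);    # field names appear next to the values

print $array_dump, $hash_dump;
```

In the array dump you have to remember which index holds which field; in the hash dump the keys label the values.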

    Also, I was taught early in my career that array lookups are faster than perl's hashing and table-lookups.

    It is true, but I still like saving minutes on coding better than saving a few nanoseconds on execution.

    And I don't need to get into the autoviv. of hashes.

    Arrays autovivify in exactly the same way, and hashes are less inefficient if you have a lot of keys. If you set $foo[100], you'll have 100 undef values plus the one you assigned to.
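A small sketch of the difference, using throwaway variables that are not part of the original example:

```perl
use strict;
use warnings;

# Assigning to a high index extends the array up to that index.
my @foo;
$foo[100] = 'x';
my $array_len = scalar @foo;        # indices 0..100

# Assigning to a single hash key creates only that key.
my %foo;
$foo{100} = 'x';
my $hash_keys = scalar keys %foo;

# Both containers autovivify nested structures when written through.
my (@a, %h);
$a[2]{name} = 'auto';               # $a[2] becomes a hash reference
$h{list}[3] = 'auto';               # $h{list} becomes an array reference

print "array length: $array_len, hash keys: $hash_keys\n";
print 'element 2 is a ', ref($a[2]), ' ref; list is a ', ref($h{list}), " ref\n";
```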

    What are your preferences?

    Hashes, unless I need to save only one, two or three things, in which cases I use scalar or array references.

    Is the speed issue that much of an issue?

    Not to me. If speed were the issue, I wouldn't be coding OO anyway.

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

Re: %hash (@array && use constant) in Modules
by Elian (Parson) on Apr 23, 2002 at 21:45 UTC
    While hashes are slower to look up than arrays are (~30% at one point), you'll probably find the cost of the method lookups by far swamps the speed loss of using a hash instead of an array for the object itself.
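One way to check this on your own machine is the core Benchmark module. The sketch below is only an assumption about how one might set up the comparison: the two classes are stand-ins written for this example, and the numbers will vary with perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

{
    package ArrayObj;               # hypothetical array-based class
    use constant BAR => 0;
    sub new    { return bless ['value'], shift }
    sub getBar { my $self = shift; return $self->[BAR] }
}

{
    package HashObj;                # hypothetical hash-based class
    sub new    { return bless { bar => 'value' }, shift }
    sub getBar { my $self = shift; return $self->{bar} }
}

my $array_obj = ArrayObj->new;
my $hash_obj  = HashObj->new;

my %cases = (
    # Direct element access, no method call involved.
    array_direct => sub { my $v = $array_obj->[ArrayObj::BAR()] },
    hash_direct  => sub { my $v = $hash_obj->{bar} },
    # The same lookups behind an accessor method.
    array_method => sub { my $v = $array_obj->getBar },
    hash_method  => sub { my $v = $hash_obj->getBar },
);

# Run each case for at least one CPU second and print a comparison table.
cmpthese(-1, \%cases);
```

Comparing the two "direct" rows with the two "method" rows shows how much of the cost comes from the container and how much from the method call.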
Re: %hash (@array && use constant) in Modules
by tadman (Prior) on Apr 23, 2002 at 21:50 UTC
    Array versus hash isn't an issue if you ask me. The penalty you pay for constants is probably larger than those for a hash lookup. Constants are really subroutines, and subroutine calls use the stack.

    The advantage of array-based objects, as far as I can tell, is reduced memory usage. You are only storing data and not hash-keys. They are faster, in theory, it would seem, but not in practice. If you used Filter instead of constant, then I might agree. A small compile-time penalty for a large run-time improvement. Another alternative, although a bit bold, is to use pseudo-hashes.

    Hash-based structures are fully extensible, and can be sub-classed easily. I don't think the same holds true for the array-based ones. How do you know where the parent left off, index wise? Do you have to keep track with another constant?

    All of this is speculation, of course, until some testing is done to see if it is true.

    One thing I would like to do is take something like Autoload::Hash and extend it into a module which could fully realize the benefits of the array-based object (speed, memory) without requiring more code maintenance. Automatic code generation means referential integrity.
      Array versus hash isn't an issue if you ask me. The penalty you pay for constants is probably larger than those for a hash lookup. Constants are really subroutines, and subroutine calls use the stack.
      Constant subs get (or at least should get) their values inlined at compile time. No runtime penalty there.
        "Should get" versus "do get" is an important distinction. Have a look inside constant and see:
            {
                no strict 'refs';
                my $full_name = "${pkg}::$name";
                $declared{$full_name}++;
                if (@_ == 1) {
                    my $scalar = $_[0];
                    *$full_name = sub () { $scalar };
                } elsif (@_) {
                    my @list = @_;
                    *$full_name = sub () { @list };
                } else {
                    *$full_name = sub () { };
                }
            }
        Looks like a regular subroutine for me. No inline. Maybe in Perl 6.
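Whether a constant is inlined can be checked rather than argued. The perlsub documentation ("Constant Functions") describes subs with an empty prototype as candidates for inlining, and the core B::Deparse module shows what the compiler actually produced. A minimal sketch, with an accessor invented for this check:

```perl
use strict;
use warnings;
use B::Deparse;
use constant BAR => 1;

# An accessor body that indexes the object with the constant.
my $accessor = sub { my $self = shift; return $self->[BAR] };

# Turn the compiled code back into source text.
my $source = B::Deparse->new->coderef2text($accessor);
print $source, "\n";

# If the name BAR no longer appears, the call was folded away at compile time.
my $inlined = ($source !~ /\bBAR\b/) ? 1 : 0;
print $inlined
    ? "BAR was replaced by its value at compile time\n"
    : "BAR is still a subroutine call at run time\n";
```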
Re: %hash (@array && use constant) in Modules
by Anonymous Monk on Apr 23, 2002 at 22:00 UTC

    ...I'm with the prevailing opinion on this issue.

    There are really two types of efficiency: runtime efficiency and development-time efficiency. And you don't have to guess to know which one is worth more in all but a very few cases.

    So except for high-repetition operations, and high volume operations, I opt for the data structures that support maximum exposure of the program logic. I usually find that a hash does this for me. (IMHO)

    ---v

Re: %hash (@array && use constant) in Modules
by Anonymous Monk on Apr 24, 2002 at 10:30 UTC
    Nobody seems to have touched on inheritability yet, which is more than a little important to my way of thinking. Admittedly, we're already on dangerous ground in Perl because descendent classes need to be aware of the basic representation of the data (unless they're not going to be adding any instance variables of their own and can just use the parent accessor methods).

    However, arrays are much harder to subclass. For a start, when you make a subclass you have to know what the 'last' number used by the parent class was.

        package FooChild;
        use base 'Foo';
        use constant SO_FIFTH => 5;

        sub getFifth {
            my $self = shift;
            $self->[SO_FIFTH];
        }
    Which wouldn't be a problem, but what happens when you discover that you need to add an instance variable in the parent class. How do you pick a new number? You can just use the next number in the sequence, but then you have to go and modify every single child class to take this into account. Or you can pick a number that no child class has used, but then you have to remember that this number is off limits to the child class. And it all gets horrible, dependencies proliferate like bunnies and you end up with code that you don't dare change for fear of breaking something else way over there. Which is one of the problems that OO is supposed to fix.
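One possible way to soften this is to have each class publish the first index it leaves free, so a child derives its indices instead of hard-coding them. This is only a sketch: the NEXT_INDEX constant and the setFifth method are conventions invented for this example, it assumes single inheritance, and it does nothing for multiple inheritance.

```perl
use strict;
use warnings;

{
    package Foo;
    use constant BAR      => 1;
    use constant BAZ      => 2;
    use constant SO_ON    => 3;
    use constant SO_FORTH => 4;
    # Hypothetical convention: the first index a subclass may use.
    use constant NEXT_INDEX => 5;

    sub new    { return bless [], shift }
    sub getBar { my $self = shift; return $self->[BAR] }
}

{
    package FooChild;
    our @ISA = ('Foo');
    # Derived from the parent instead of hard-coding 5.
    use constant SO_FIFTH   => Foo::NEXT_INDEX();
    use constant NEXT_INDEX => Foo::NEXT_INDEX() + 1;

    sub getFifth { my $self = shift; return $self->[SO_FIFTH] }
    sub setFifth { my ($self, $value) = @_; $self->[SO_FIFTH] = $value; return $self }
}

my $child = FooChild->new;
$child->setFifth('fifth');
print $child->getFifth, "\n";
```

If Foo later gains a field and bumps its NEXT_INDEX, the child's indices shift with it the next time the code is compiled, but every class in the chain still has to follow the convention.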

    With a hash, the workaround is relatively simple. Just pick a unique name for your hash key. Most of the time you can get away with just being careful. But if you're being paranoid you might want to always choose keys of the form 'Package::key', or use a multidimensional hash with accessors of the form

        sub getFoo {
            my $self = shift;
            # The unary plus stops __PACKAGE__ from being quoted as the literal key '__PACKAGE__'.
            $self->{+__PACKAGE__}{Foo};
        }
    (If you do end up doing that it's generally a good idea to write a tool for rolling accessor methods)
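A tool for rolling such accessors can be quite small. The sketch below is one possible shape, not an existing module: make_accessors, the get/set naming, and the field names are all invented for illustration.

```perl
use strict;
use warnings;

{
    package Foo;

    # Hypothetical helper: installs getX/setX methods into the calling
    # package, storing the data under that package's own sub-hash.
    sub make_accessors {
        my ($class, @fields) = @_;
        for my $field (@fields) {
            no strict 'refs';
            *{"${class}::get$field"} = sub { $_[0]->{$class}{$field} };
            *{"${class}::set$field"} = sub { $_[0]->{$class}{$field} = $_[1]; $_[0] };
        }
        return;
    }

    sub new { return bless {}, shift }

    __PACKAGE__->make_accessors(qw(Foo Bar));
}

{
    package FooChild;
    our @ISA = ('Foo');

    # The child's field is stored under the 'FooChild' key, not 'Foo'.
    __PACKAGE__->make_accessors(qw(Baz));
}

my $obj = FooChild->new;
$obj->setBar('parent data')->setBaz('child data');
print join(', ', $obj->getBar, $obj->getBaz), "\n";
```

Because each closure captures the package name it was generated for, a parent and a child can use the same key name without overwriting each other's data.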
      Don't you just hate it when you realise, after hitting submit that you weren't logged on?
Re: %hash (@array && use constant) in Modules
by moodster (Hermit) on Apr 24, 2002 at 07:30 UTC
    Hashes may not be the fastest way to access data, but they are fast enough for most purposes. I used to be very concerned about performance and speed hits when I started doing perl (it probably comes from when I was trying to do graphics programming for the 286 and every clock cycle counted and certain operations, like division, were feared like the plague). Nowadays I'm more relaxed.

    Perl hashes usually get the job done with the least amount of effort. I use them where I'd use C structs, Java Objects, Vectors, Sets or whatever. They are extremely versatile beasts. It's only when I have ordered data that I have the need for arrays, really. Which leads me to the point I'm trying to make:

    For most classes, the members aren't ordered among themselves. In your example, the foo and bar member variables have no real relation to each other and aren't ordered. Since they only exist to hold named pieces of data it makes perfect sense to me to store them in a hash. Of course, there are classes where it'd be more intuitive to use an array but I can't think of a good example right now. And anyway, if I needed a class to store ordered data, I'd still use a hash and let one of the member variables be a reference to an anonymous array.

    Cheers,
    --Moodster

Re: %hash (@array && use constant) in Modules
by Hrunting (Pilgrim) on Apr 24, 2002 at 17:56 UTC
    Everyone's comments are right. If you want to eke out every possible ounce of performance, arrays are faster, but for 99.99% of the jobs, hashes will do more than what you need, and they have other benefits in terms of object inheritance.

    What I'm surprised no one has touched upon is:
    Part of it is laziness: typing "[ ]" as opposed to "{ }", which requires a shift key.

    Can you explain to me exactly how you're typing '[BAR]' without using the shift key? I suppose you could hit the caps lock key, but again, no decrease in the amount of work done.

    As an aside, I did implement a class using pseudo-hashes (still use it in fact, very robust). While the whole idea may be very bad, it's worked out well for us. Our performance bottlenecks were elsewhere, though.

      I suppose, Hunter, that you are correct, regarding the shift keys. Typing BAR does require using shift. I was thinking more in terms of {bar} as opposed to [bar], I guess.

      Maybe I just hate typing curly brackets! :)
