
Of course, behind the scenes, it is all hashes and arrays and hashes of hashes and ... .

And in one sentence, you've summed up the crux of my object(sic)ion to what you are proposing.

You're (simply and unnecessarily) wrapping a data structure in (a nest of) objects.

There are no methods beyond getters and setters, and getting and setting is something hashes and arrays already know how to do. In the process, you've hidden all the other things they know how to do. Like:

For the hashes: each, exists, defined, keys (in scalar, list and lvalue contexts), values (in scalar, list and lvalue contexts), delete, undef.
For the arrays: pop, push, shift, splice, unshift, $#array, delete, undef; plus their various contextual variations.
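
To make that list concrete, here is a minimal sketch of a few of those built-ins applied directly to a nested structure. The line IDs, station IDs and coordinate values are invented for illustration; they are not the OP's data.

```perl
use strict;
use warnings;

# Hypothetical sample data: line ID => station ID => [ Easting, Northing, Elevation ]
my %grid = (
    257 => { '257_5' => [ 1000, 2000, 30 ], '257_6' => [ 1010, 2005, 31 ] },
    258 => { '258_1' => [ 1100, 2100, 40 ] },
);

# keys in scalar context: the number of lines
my $line_count = keys %grid;

# exists: test for a station, checking the line first to avoid autovivifying it
my $has_stn = exists $grid{257} && exists $grid{257}{'257_6'};

# values: visit every station of a line and update it in place
$_->[2] += 1 for values %{ $grid{257} };

# delete returns the value it removed
my $removed = delete $grid{258}{'258_1'};

# push and pop work on the innermost arrays
push @{ $grid{257}{'257_5'} }, 'extra';
my $popped = pop @{ $grid{257}{'257_5'} };

print "$line_count lines; 257_5 elevation now $grid{257}{'257_5'}[2]\n";
```

None of this needs a generated method or a lookup in module documentation.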

Some of these can be exposed through the addition of generated methods, but all of them? And at what cost? The cost is different names from the ones every half-experienced Perl programmer is already familiar with: get_size() and put_size() instead of $#array as an lvalue.

Or will that be get_length() and put_length() for this module? Or maybe get_count() and put_count(). Or ...
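
For anyone who hasn't used it that way, a short sketch of $#array as both rvalue and lvalue. The array contents are invented.

```perl
use strict;
use warnings;

my @stations = ( 'a' .. 'j' );    # 10 invented elements

my $last_index = $#stations;      # read the highest index (9)

$#stations = 4;                   # assign to it: truncate to 5 elements
my @after_truncate = @stations;   # a b c d e

$#stations = 7;                   # assign again: extend to 8 elements
my $size = @stations;             # new slots are undef

print "last index was $last_index, size is now $size\n";
```

One piece of syntax covers what would otherwise be a getter, a setter, and a naming decision.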

Throwing away the familiar, in order to substitute unfamiliar (and variable) naming conventions to do exactly the same things means that instead of the maintenance programmer being able to use the knowledge he has, he has to run away to the documentation--and even the source code--in order to understand the code he is reading.

I have no idea which of the 20 or so Moose modules I would need to look at in order to find out what the naming convention of the array length attribute of array based collections is, if it is exposed as standard and documented at all. Besides which, if I'm reading things correctly, the Moose user (Moose-based class writer) has the option of changing those names anyway. So that means I have to look it up for every class I use. And every method for every class. The cost of just the documentation lookup time in maintenance is amazing.

And of course it is much slower than programming these data structures directly,

And more memory expensive also. In my tests, it was an order of magnitude slower and required 50 times more memory. But for the OP's current needs of 1000 lines, that is quite possibly insignificant, so let's not dwell on that. Let's look at what you consider this wrapping of native functionality buys you:

but then you don't get these nice accessors, mutators, type-checking, default values, ...

Hm. Let's compare syntax. You suggest that this is "nice":

$grid->get_line(257)->get_station('257_5')->Elevation;

The equivalent using native structures:

$grid->{ 257 }{ '257_5' }[ Elevation ];

And that's better because? Anticipating your answer to be along the lines of: "Because 'line' and 'station' are explicitly mentioned.", I'll counter with the fact that most accesses will not be in terms of constants, but rather variables for the indexing. So then you have to compare these two versions:

$grid->get_line($lineID)->get_station($stationID)->Elevation;

And

$grid->{ $lineID }{ $stationID }[ Elevation ];

Don't you find the verbosity and repetition of 'get_line'/$lineID & 'get_station'/$stationID distracting? Pointless? It doesn't look too bad with a single access expression as above, but what about when you come to do some real work with these things?

Syntax in use

In the OP's data structure there is another field between Northing and Elevation (which I called 'Other' because I have no idea what it is; see later). Just for the sake of example, let's assume that it is wholly or partially derived from the 3D point (Easting, Northing and Elevation), and that due to, say, continental drift or more accurate GPS or some such, it is necessary to recalculate these values.

The Eastings have to be moved 0.001% West, the Northings 0.0002% South, and the Elevations raised by 1 unit. The Other field is then recalculated according to some formula.

The code using native data structures:

my $seismic = Seismic->new( 'seismic.dat' );

for my $lineID ( keys %{ $seismic } ) {

    for my $stn ( values %{ $seismic->{ $lineID } } ) {

        $stn->[ Easting   ] -= $stn->[ Easting  ] * 0.00001;
        $stn->[ Northing  ] -= $stn->[ Northing ] * 0.000002;
        $stn->[ Elevation ] += 1;
        
        $stn->[ Other ] = int(
            ( $stn->[ Easting   ]
            * $stn->[ Northing  ]
            * $stn->[ Elevation ]
            ) / 3.0
        );
    }
}
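
For readers who want to run that loop without the data file, here is a self-contained sketch on a small in-memory structure of the same shape. The coordinates are invented, and the index constants are declared inline here rather than imported from the Seismic package.

```perl
use strict;
use warnings;

# Indices into each station's array (declared inline for this sketch)
use constant { Easting => 0, Northing => 1, Other => 2, Elevation => 3 };

# Invented sample data: line => station => [ Easting, Northing, Other, Elevation ]
my $seismic = {
    257 => { '257_5' => [ 100000, 500000, 0, 9 ] },
};

for my $lineID ( keys %{ $seismic } ) {
    for my $stn ( values %{ $seismic->{ $lineID } } ) {

        $stn->[ Easting   ] -= $stn->[ Easting  ] * 0.00001;   # 0.001% West
        $stn->[ Northing  ] -= $stn->[ Northing ] * 0.000002;  # 0.0002% South
        $stn->[ Elevation ] += 1;

        # Placeholder formula, as in the example above
        $stn->[ Other ] = int(
            ( $stn->[ Easting ] * $stn->[ Northing ] * $stn->[ Elevation ] ) / 3.0
        );
    }
}

my $stn = $seismic->{ 257 }{ '257_5' };
printf "E=%.3f N=%.3f Elev=%d\n", @{ $stn }[ Easting, Northing, Elevation ];
```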

Using your objects

## You didn't provide a constructor from a file
## But you could have.
my $grid = Seismic::Grid->new( 'seismic.dat' );

for my $line_id ( $grid->line_ids ) {
    my $line = $grid->get_line( $line_id );

    for my $stn_id ( $line->station_ids ) {
        my $station = $line->get_station( $stn_id );

        $station->set_Easting(
            $station->get_Easting
            - ( $station->get_Easting * 0.00001 )
        );
        $station->set_Northing(
            $station->get_Northing
            - ( $station->get_Northing * 0.000002 )
        );
        $station->set_Elevation(
            $station->get_Elevation() + 1
        );

        ## You didn't provide this attribute, presumably
        ## cos like me you didn't know what it was
        ## but you could have
        $station->set_Other(
            int(
                ( $station->get_Easting()
                * $station->get_Northing()
                * $station->get_Elevation()
                ) / 3.0
            )
        );
    }
}

So, what did OO buy you other than verbosity and complexity?

And before you answer, every time you go to start your reply with 'if', remember that any code written to cater for possibilities or eventualities that aren't in evidence from the OP's stated requirements, as well as being a potential solution to a potential problem, is also effort expended (money) that may never be used.

But it will still have to be tested and maintained. And when the future requirements of the application are in evidence, it may complicate the actual code needed to satisfy those actual requirements. Or worse, have to be thrown away completely because it is totally incompatible with them.

That's wasted development time, and testing time, and documentation effort, and interim maintenance effort on the basis of guesses about "what the future may hold".

For comparison

Here's my equivalent of your Seismic::Grid package posted above, i.e. the package named Seismic in my examples in this post.

package Seismic;
use strict;
use warnings;
use Exporter;

## Indices into each station's array
use constant {
    Easting     => 0,
    Northing    => 1,
    Other       => 2,
    Elevation   => 3
};

our @ISA = qw[ Exporter ];
our @EXPORT = qw[ Easting Northing Other Elevation ];

## Construct from a fixed-width data file.
## Returns a blessed hash: line => station => [ Easting, Northing, Other, Elevation ]
sub new {
    my( $class, $filename ) = @_;
    my %lines;
    open my $in, '<', $filename or die "$filename : $!";
    while( <$in> ) {
        my( $line, $stn, $x, $y, $other, $z ) =
            unpack 'A8xA8xA8xA8xxA15xa4', $_;

        $lines{ $line }{ $stn } = [ $x, $y, $other, $z ];
    }
    close $in;
    return bless \%lines, $class;
}

1;

Note that this implements the only 'method' actually required by the OP's description--a from-file constructor. Notice how much less code there is than in yours above, whilst remembering that since the dawn of time (okay, of the software industry), there has been a direct (extra-linear) relationship between lines of code written and bugs found/maintenance required.
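
The fixed-width template is the one part of that constructor that is easy to get wrong, so here is a minimal sketch of what it extracts. The record below is invented to fit the template's widths; it is an assumption for illustration, not the OP's actual file layout, and the field values are placeholders.

```perl
use strict;
use warnings;

# Build one invented 57-character record matching the template's widths:
# four 8-char fields, a 15-char field and a 4-char field, with 1- or 2-char gaps.
my $record = sprintf '%-8s %-8s %-8s %-8s  %-15s %4s',
    '257', '257_5', '123456.7', '765432.1', '42', '1021';

# 'A' strips trailing spaces, 'x' skips one byte, 'a' keeps the bytes as-is.
my( $line, $stn, $x, $y, $other, $z ) =
    unpack 'A8xA8xA8xA8xxA15xa4', $record;

print join( '|', $line, $stn, $x, $y, $other, $z ), "\n";
```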

And note also that this is just the number of lines you wrote. It doesn't include the whole of Moose::* and its dependencies: Class::MOP (and its dependencies: Sub::Name, MRO::Compat (and its: Class::C3 (and its: Algorithm::C3)), Sub::Exporter (and its: Params::Util, Sub::Install, Data::OptList)).

And I've left out all those that can generally be expected to be a part of the standard distribution (despite the fact that it requires the latest CPAN versions of most of them). Including Filter::Simple, which means source filters! (Though I had no luck in working out where in the pile of modules this is actually used.)

Now the typical reaction to this is "So what? It's all code I didn't have to write and don't have to maintain myself." But when something goes wrong you'll wish you could maintain it yourself.

Because it may turn out that the ref-count manipulations in lines 458 & 459 of Class::C3::XS are causing a memory leak, which manifests itself as coming from an anonymous sub generated by the string eval in Class::MOP::Method::Accessor, while you're urgently trying to get your application back on-line, and the authors of the dozen or so packages involved are arguing about who needs to change what.

At that point, you'll wonder about the wisdom of replacing Perl's reliable built-ins with such complexity, all for the sake of a little syntactic sugar. Especially as it means you have to write more code to start with, the complexity of the code you write is increased, and the resultant code is, if anything, less readable than standard Perl.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^6: Data Structures by BrowserUk
in thread Data Structures by YYCseismic
