Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

seek harsh critique

by coyocanid (Sexton)
on Dec 08, 2017 at 16:00 UTC ( #1205181=perlquestion: print w/replies, xml ) Need Help??

coyocanid has asked for the wisdom of the Perl Monks concerning the following question:

I've been working on a project for some years and want to release it to CPAN and I'm not yet happy with the POD I've written.

I'm looking for a harsh critique of the POD below. I want to make sure it communicates the purpose and use of the module. Critique of the module is also welcome.

NAME Data::ObjectStore - Fast and efficient store for structured data. DESCRIPTION Data::ObjectStore stores structured data on the file system in a d +irectory the programmer supplies. Any configuration of interconnected hashe +s, arrays and Data::ObjectStore::Container objects may be placed in t +he store. The store keeps track of changes to any data structure reference p +laced within it. A single save operation writes all changes to the data structure or its fields that were made since the last save operati +on. Items load from the store on an as needed basis. Very large hashes + and arrays are transparently chunked behind a tied interface. A single + root container is the point of entry to add items. If a path of referen +ce can trace from this root container to the itemn, the item can be refer +enced from the store. LIMITATIONS This is not thread safe. It is designed currently for a single process/thread. Locking can be done on a level above Data::ObjectS +tore. SYNOPSIS # SETTING DATA use Data::ObjectStore; my $store = Data::ObjectStore::open_store( '/path/to/data-directo +ry' ); my $root_container = $store->load_root_container; my $current_data = [ 123,123,345,45,3752,346,326,43745,76 ]; my $experiment = { project_name => "Cool project", data_collected => { 20170401 => [ 34234,22456,345,34,123,26,4576 ], 20170402 => [ 67,53,3,435,63,5624,23,543276 ], 20170403 => $current_data, }, today_data => $current_data, notes => "It looks like theory X fits the data the best", }; # this adds the data to the store and only needs to be done # once. Any further changes to the data structure are monitored # by the store and saved upon call to save. $root_container->set( "experiment", $experiment ); # save _all_ data connected to the root container $store->save; # add a data point to the current data. # the reference to current_data is already in the store and # does not need to be re-added. push @$current_data, 234; # saves current_data with the new point; it is the only thing cha +nged # since the last save. $store->save; .... # GETTING DATA my $store = Data::ObjectStore::open_store( '/path/to/data-directo +ry' ); # loads the root container ( but not yet referenced contents ) my $root = $store->load_root_container; # loads the content referenced by the experiment key. my $experiment_again = $root->get( "experiment" ); # loads the data collected hash and a particular day my $day_values = $experiment_again->{data_collected}->{20170403}; my $same_ref = $experiment_again->{today_data}; # $day_values and $same_ref both reference the same array as per # the SETTING DATA example. print scalar( @$same_ref ); # prints '10' # add one more point to day values. push @$day_values, 345345; print scalar( @$same_ref ); # prints '11' # saves $day_values with the new point; its the only thing change +d # since the last save. $store->save; .... # CONTAINERS AND SYNTACTIC SUGAR my $store = Data::ObjectStore::open_store( '/path/to/data-directo +ry' ); my $root = $store->load_root_container; # generates a new container. # this must connect to the root in order to be accessible. my $user = $store->create_container( { name => "Edna", hashedPass => "weroiusafda89", title => "director", experiments => [], } ); # Makes an array called 'users' (if not already there), attached +it to # the root and adds $user to it as long as the user hasn't been a +dded. # Now $user is connected to the root via the users array. $root->add_once_to_users( $user ); # $root->get_experiment is the same as $root->get( 'experiment' ) +; my $experiments = $root->get_experiment; $user->add_to_exeriments( $experiments ); # more syntactic sugar for lists attached to a container. # $user->set_foods is the same as $user->set( 'foods' ); my $foods = $user->set_foods( [] ); push @$foods, "PEAS", "PEAS", "PEAS", "ORANGES"; $user->remove_from_foods( "PEAS" ); # $foods now "PEAS", "PEAS", "ORANGES" $user->remove_all_from_foods( "PEAS" ); # $foods now "ORANGES" $val = $user->get_status; # $val is undef $val = $user->get_status( "default" ); # $val eq 'default' $val = $user->get_status( "Something Else" ); # $val eq 'default' (old value not overridden by a new default va +lue) $val = $user->set_status( "A NEW STATUS" ); # $val eq 'A NEW STATUS' .... # EXTENDING CONTAINERS package Mad::Science::User; use Data::ObjectStore; use base 'Data::ObjectStore::Container'; # called when an object is newly created sub _init { my $self = shift; $self->set_status( "NEWLY CREATED" ); $self->set_experiments([]); } # called when the object is loaded from the store sub _load { my $self = shift; print "Loaded " . $self->get_name . " from store"; if( @{$self->get_experiments} > 0 ) { $self->set_status( "DOING EXPERIMENTS" ); } } sub evacuate { my $self = shift; $self->set_status( "HEADING FOR THE HILLS" ); } package main; my $store = Data::ObjectStore::open_store( '/path/to/data-directo +ry' ); # creates Mad::Science::User object # calls Mad::Science::User::_init on it. my $user = $store->create_container( 'Mad::Science::User', { name => 'Larry' } ); $store->set_larry( $user ); print $user->get_status; # prints 'NEWLY CREATED' $user->add_to_experiments( "NEW EXPERIMENT" ); $reopened_store = Data::ObjectStore::open_store( '/path/to/data-d +irectory' ); # calls Mad::Science::User::_load my $larry = $store->get_larry; print $larry->get_status; #prints 'DOING EXPERIMENTS' SIMPLE USE CASE The simple use case is using structured data containing only scala +rs, array refs and hash refs. The arrays and hashes may contain refere +nces to other arrays and hashes or scalar data. Circular references are al +lowed. Blessed or tied or array and hash references are not allowed. When the data structures are attached to the root container of the + store, they become tied to the store but continue to otherwise behave as +before. To attach : $my_data_structure = { lol => [ [...], [...] ], loh => [ {...}, {...} ], hol => { a => [...], b => [...] }, hoh => { a => {...}, b => {...} }, text => "Here is some text", numbers => 34535, }; $root = $store->load_root_container; $root->set( "my key", $my_data_structure ); To load : $root = $store->load_root_container; $my_data_structure = $root->get( "my key" ); When accessing the data structures from the root container, they a +re loaded into memory on as as needed basis. The arrays and hashes ma +y be of any size; huge ones are internally sharded and loaded piecemeal in +to memory as needed. This is all transparent to the programmer; these + arrays and hashes behave as normal arrays and hashes. The data is saved with a call to the save method on the store. The + data structure only needs to be added once to the store. CONTAINERS Containers are made by calling the create_container method of the +store. This takes an optional class name and an optional hash reference. +If a hash reference is given, the container data is populated from the +hash. If a class name is given and is a child class of Data::ObjectStore::C +ontainer that will be used. PUBLIC METHODS store returns the object store instance that created this object. set( field, value ) takes a field name and a value, sets the field of the container to + that value, and returns the value. get( field, defaultvalue ) takes a field name and an optional default value. If the field is undefined, it sets it to the default value. It returns the value associated with the field. SYNTACTIC SUGAR AUTOLOAD is used to provide convenience methods for the container +objects. get_foo( default ) same as get( 'foo', default ) set_foo( value ) same as set( 'foo', value ) add_to_bar( @list_to_add ) If the bar field does not exist in the container, it is created as + a list. If the bar field exists and is not a list, this causes an exceptio +n. Adds the given items to the end of the list. add_once_to_bar( @list_to_add ) Same as add_to_bar, except that it will not add to the list if the + item is already in the list. remove_from_bar( @things_to_remove_the_first_instances_of ) If the bar field does not exist in the container, it is created as + a list. If the bar field exists and is not a list, this causes an exceptio +n. Removes the first instance of the given items from the list. remove_all_from_bar( @things_to_remove_entirely ) Same as from_from_bar, except it removes all instances of the give +n items from the list. EXTENDING CONTAINERS Data::ObjectStore::Container is useful as is, but is also designed + to be extended. The ObjectStore tracks what reference types the objects +are that are stored inside it and blesses them appropriately. Two methods can be overridden : _init and _load. _init is called w +hen the object is created with create_container. _load is called each time + the object is loaded from the object store. EXAMPLE For example, for a hearts game, we can create a Hearts::Game class + and start to flesh it out. This implements a few methods that would be + needed for the hearts game. package Hearts::Game; use Data::ObjectStore; use base 'Data::ObjectStore::Container'; # called the first time this Hearts::Game object is created. sub _init { my $self = shift; my( @cards ); for $suit ('Spades','Clubs','Hearts','Diamonds' ) { push @cards, map { "$_ of $suit" } ('Ace',2..10,'Jack','Quee +n','King'); } $self->set_deck( [ sort { rand + 0*$a <=> rand + 0*$b } @cards +] ), # shuffled deck $self->set_players([]); $self->set_state( "waiting" ); } # called every time this is loaded from the data store. sub _load { my $self = shift; print "game loaded at state : " . $self->get_state; } sub deal { my $self = shift; die "Game not started" if $self->get_state eq 'waiting'; my $players = $self->get_players my $deck = $self->get_deck; #each player gets 13 cards for( 1..13 ) { for my $i ( 0..3 ) { my $card = shift @$deck; $players->[$i]->add_to_hand( $card ); } } } sub add_player { my($self,$player) = @_; $self->add_to_players( $player ); if( @{$self->get_players} == 4 ) { $self->set_state('playing'); $self->deal; } } PUBLIC METHODS open_store( '/path/to/directory' ) Starts up a persistance engine and returns it. load_root_container Fetches the user facing root node. This node is attached to the st +ore info node as 'root'. info Returns a hash of info about this opened data store. Updating the +hash has no effect. * db_version * ObjectStore_version * created_time * last_update_time get_db_version Returns the version of Data::RecordStore that this was created und +er. get_store_version Returns the version of Data::ObjectStore that this was created und +er. get_created_time Returns the time this store was created. get_last_update_time Returns the time this store was last updated. create_container( optionalClass, { data } ) Returns a new Data::ObjectStore::Container container object or a s +ubclass, depending if the optional class parameter is supplied. If provided + with data, the object is initialized with the data. If the object is attached to the root or a container that is ultim +ately attached to the root, it will be saved when save is called. run_recycler This checks for objects that are not traceable to the root object +and removes them, recycling thier ids. save When called, this stores all objects that have been changed since +the last time save was called. NAME Data::ObjectStore::Container - Persistent Perl container object. SYNOPSIS $obj_A = $store->create_container; $obj_B = $store->create_container( myfoo => "This foo is mine", mylist => [ "A", "B", "C" ], myhash => { peanut => "Butter" } ); $obj_C = $store->create_container; # # get operations # print $obj_B->get_myfoo; # prints "this foo is mine" print $obj_B->get( "myfoo" ); # prints "this foo is mine" print $obj_B->get_myhash->{peanut}; # prints 'Butter' $val = $obj_A->get_unsetVal; # $val is now undef $val = $obj_A->get_unsetVal("default"); # $val is now 'default' $val = $obj_A->get_unsetVal("otherdefault"); # $val is still 'def +ault' $val = $obj_A->set_arbitraryfield( "SOMEVALUE" ); # $val is 'SOME +VALUE' # # set operations # $obj_C->set( "MYSET", "MYVAL" ); $val = $obj_C->get_MYSET; # $val is 'MYVAL' $obj_B->set_A( $obj_A ); $root = $store->load_root_container; $root->set_B( $obj_B ); # # list operations # $mylist = $obj_B->add_to_mylist( "D" ); #mylist now 'A','B','C',' +D' $newlist = $obj_B->add_to_newlist( 1, 2, 3, 3, 3 ); print join(",", $newlist); # prints 1,2,3,3,3 $obj_B->remove_from_newlist( 3 ); print join(",", $newlist); # prints 1,2,3,3 $obj_B->remove_all_from_newlist( 3 ); print join(",", $newlist); # prints 1,2 # yes, the $newlist reference is changed when the object is opera +ted on with list operations DESCRIPTION This is a container object that can be used to store key value dat +a where the keys are strings and the values can be hashes, arrays or Data::ObjectStore::Container objects. Any instances that can trace + a reference path to the store's root node are reachable upon reload. This class is designed to be overridden. Two methods are provided +for convenience. _init is run the first time the object is created. _l +oad is run each time the object is loaded from the data store. These meth +ods are no-ops in the base class. set( field, value ) Sets the field to the given value and returns the value. The value + may be a Data::ObjectStore::Container or subclass, or a hash or array ref +erence. get( field, default_value ) Returns the value associated with the given field. If the value is + not defined and a default value is given, this sets the value to the g +iven default value and returns it. The value may be a Data::ObjectStore::Container or subclass, or a +hash or array reference. store Returns the Data::ObjectStore that created this object. _init This is called the first time an object is created. It is not called when the object is loaded from storage. This can be use +d to set up defaults. This is meant to be overridden. _load This is called each time the object is loaded from the data st +ore. This is meant to be overridden. AUTHOR Eric Wolf COPYRIGHT AND LICENSE Copyright (c) 2017 Eric Wolf. All rights reserved. This program +is free software; you can redistribute it and/or modify it under the +same terms as Perl itself. VERSION Version 2.03 (September, 2017)

Replies are listed 'Best First'.
Re: seek harsh critique
by hippo (Bishop) on Dec 08, 2017 at 17:02 UTC

    You have called this "Data::ObjectStore" but it doesn't appear to store objects in the normal sense (instantiated class members against which methods can be called) but rather just data structures. I think this is potentially misleading.

    The other issue I have is that there is no rationale. What particular problem does this solve which isn't already met by other modules like Sereal or Storable? You say it is fast and efficient but compared to what?

      Thanks for the feedback. I have to rewrite and speak more as to what lead to the development of this.
Re: seek harsh critique
by toolic (Bishop) on Dec 08, 2017 at 16:24 UTC
    • You should enclose your code in "code" tags. This allows us to easily use the "download" link to copy-n-paste your code. It also prevents your code from being mangled: [$i] was linkified. Please edit your post.
    • You should enclose long code segments in "readmore" tags.
    • I find it difficult to follow because you seem to have interspersed rendered POD with code snippets. Or is the code part of the rendered POD? It would be easier to follow if you posted actual POD (=head1, etc.). What are the "...." for? If you post actual POD, we can run podchecker and see how it renders for us.
    • I ran spell checker: exeriments, persistance, sharded, thier
      Thanks for the info. Will run thru the podchecker and spellchecker.

        I have a pre-commit hook that runs (apart from normal test suite) some pod coverage, syntax, and spelling checks. The checks themselves live under xt/ directory so they don't get called when someone else installs the module, only when I'm committing. Quite annoying at first, but it's generally worth it.

Re: seek harsh critique
by Laurent_R (Canon) on Dec 08, 2017 at 17:33 UTC
    Hi coyocanid,

    I am not quite sure where the synopsis actually ends (perhaps because of the missing section corresponding to the ellipsis ...), but I have the feeling it might be too long for a synopsis, which is supposed to be a brief summary.

      I agree with the SYNOPSIS here.

      I prefer a nice short synopsis that has a couple of very short scenarios (although I am guilty of a few longer ones myself for various reasons). I would remove most of the #SECTION pieces out of synopsis and put them into an EXAMPLES section. If you do the POD correctly, your EXAMPLES section will be listed at the top of the page at MetaCPAN, and the sub-sections will be visibly listed, and a link generated to take you directly to that section:

      =head1 EXAMPLES =head2 Setting data ... =head2 Getting data ...
        Good ideas, thanks for the feedback.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1205181]
Approved by marto
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (1)
As of 2022-05-19 05:12 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (71 votes). Check out past polls.