Data Structure Design

Tuna has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
(Ovid - plan for the future) Re: Data Structure Design by Ovid (Cardinal) on Aug 17, 2001 at 05:03 UTC
Since you need to do these in order, I'd probably start with an array. Further, I like to try to make things flexible so that I can easily adjust the behavior in the future. The following quick hack will iterate over the clusters. It defines the total number of machines per cluster (in case they change), and if any machine in the cluster has a different behavior, allows you to assign a subref to that machine.

In the future, you can change the number of machines and change the behavior for any machine for any cluster to anything you want. Further, the data structure is very intuitive. This should be simple to maintain. Note how easy it is to change the behavior for the fourth cluster.

    use strict;
    use warnings;

    my @clusters = (
        { total => 7,  7 => \&two },
        { total => 7,  7 => \&two },
        { total => 7,  7 => \&two },
        { total => 16, 7 => \&two, 4 => \&three },
        { total => 7,  7 => \&two },
        { total => 7,  7 => \&two },
        { total => 7,  7 => \&two }
    );

    foreach my $cluster ( @clusters ) {
        # decrement count by one to account for arrays starting at zero
        my $machine_count = $cluster->{ total } - 1;
        for my $machine ( 0 .. $machine_count ) {
            if ( ! exists $cluster->{ $machine + 1 } ) {
                &one;
            }
            else {
                &{ $cluster->{ $machine + 1 } };
            }
        }
    }

    sub one   { print "First sub\n";  }
    sub two   { print "Second sub\n"; }
    sub three { print "Third sub\n";  }

You'll probably want to play with that for loop as the increments and decrements look weird, but it's a start.

Cheers,
Ovid
Re: Data Structure Design by kjherron (Pilgrim) on Aug 17, 2001 at 08:11 UTC
First of all, let me commend you for being concerned with your data structures. Good data structures can make a program much simpler to understand and extend. Anyway, I'd be inclined to do something like this:

    my @clusters = (
        {
            Name    => 'cluster1',
            Workers => [ qw(h1 h2 h3 h4 h5 h6) ],
            Queen   => 'h7'
        },
        { ... }
    );

    foreach my $cluster (@clusters) {
        foreach my $worker (@{$cluster->{Workers}}) {
            handle_worker($worker, ...);
        }
        handle_queen($cluster->{Queen}, ...);
    }

This is obviously very abstract. If you want to do all the workers in parallel, you could. I also don't know if "workers" and "queen" correctly describe the relationship between these servers; you could probably think of better terms. But this kind of arrangement gives you a lot of flexibility. For example, if another kind of server gets added to the mix later, it's trivial to add processing for it.
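As a rough sketch of the "another kind of server gets added later" point, one option is to key the handlers by role and walk the roles in a fixed order. Everything beyond the Workers and Queen keys is an assumption here: the Monitor role, the handler bodies, the hostnames, and storing the queen as a one-element list so that every role can be looped over the same way.

```perl
use strict;
use warnings;

# Hypothetical handlers; only the Workers and Queen roles appear in the post above.
my %handler = (
    Workers => sub { print "worker $_[0]\n" },
    Queen   => sub { print "queen $_[0]\n" },
    Monitor => sub { print "monitor $_[0]\n" },    # an invented later addition
);

# Roles are processed in this order within each cluster.
my @role_order = qw(Workers Monitor Queen);

# Assumption: every role holds an array ref, even when it has a single host.
my @clusters = (
    { Name => 'cluster1', Workers => [qw(h1 h2 h3)], Monitor => ['m1'], Queen => ['h7'] },
);

my @processed;
for my $cluster (@clusters) {
    for my $role (@role_order) {
        # A cluster without a given role simply contributes no hosts for it.
        for my $host ( @{ $cluster->{$role} || [] } ) {
            $handler{$role}->($host);
            push @processed, "$role:$host";
        }
    }
}
```

Adding a server type then means one new entry in the handler table and one new name in the role order; the loops stay untouched.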
Re: Data Structure Design by jepri (Parson) on Aug 17, 2001 at 04:54 UTC
    foreach my $num (1..5) {
        foreach my $letters ( ('TS','SA') ) {
            my $name = "cluster".$num."_".$letters;
            foreach ( @{$cluster{$name}} ) {
                do_action($_);
            }
        }
    }

I feel that's missing something, but I can't think what...

Update: It's missing a 2-D array.

Double Update: Looking at Ovid's solution, another idea presents itself: if all the clusters have the same number of computers, just do a modulo on the array index to find the one you have to treat specially. Consider:

    my @clusters = (
        ['h1',  'h2',  ... ],
        ['h8',  'h9',  ... ],
        ['h16', 'h17', ... ],
        ['h24', 'h25', ... ],
        ['h32', 'h33', ... ],
    );

and now iterate over the array. Or even better:

    my @clusters = (
        'h1',  'h2',  ... ,
        'h8',  'h9',  ... ,
        'h16', 'h17', ... ,
        'h24', 'h25', ... ,
        'h32', 'h33', ... ,
    );

Of course you have to go to the effort of making each of the 'h's do something. It depends on the rest of your code how you want to attack it.

____________________
Jeremy
I didn't believe in evil until I dated it.
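The modulo idea might look something like the following minimal sketch. It assumes five clusters of exactly seven hosts each, with the special host last in every cluster; the hostnames and the two handler subs are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: five clusters of seven hosts each, flattened into one list.
# The hostnames (c1h1 .. c5h7) are invented for this example.
my $per_cluster = 7;
my @hosts = map { my $c = $_; map { "c${c}h$_" } 1 .. $per_cluster } 1 .. 5;

# Hypothetical handlers standing in for the real per-host actions.
my @log;
sub process_normal  { push @log, "normal $_[0]" }
sub process_special { push @log, "special $_[0]" }

for my $i ( 0 .. $#hosts ) {
    # The last slot of each cluster satisfies ($i + 1) % $per_cluster == 0.
    if ( ( $i + 1 ) % $per_cluster == 0 ) {
        process_special( $hosts[$i] );
    }
    else {
        process_normal( $hosts[$i] );
    }
}

print "$_\n" for @log;
```

This only holds while every cluster has the same size; as soon as one cluster grows, the index arithmetic no longer lines up and a nested structure is the safer choice.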
Re: Data Structure Design by Tuna (Friar) on Aug 17, 2001 at 05:24 UTC
Follow up: Here's an example of the data, and routines that I am working with:

    $config_files{ $cluster } = [ $config_file ];

    This contains:
    Key   = Cluster_1
    Value:  remap.config, origin.db, local_cluster.db
            remap.config.hosts, origin.db.hosts, local_cluster.db.hosts

I get my hostnames by parsing the above ".hosts" files. "TS" hostnames are contained in remap.config.hosts. "SA" hostnames (zone files) are contained in the two other ".hosts" files. TS servers must be reactivated first, then we attempt to reactivate the SA server. Then, we move on to the next cluster, assuming that the first succeeds. So, ultimately, what I need to do is, foreach host, in order:

    #####################
    #                   #
    # PROCESS SEQUENCES #
    #                   #
    #####################

    @taskSequence = (
        'establishSession~~connecting to $clusterHost',
        'local_checksum~~calculating local checksums',
        'transferFilesToStage~~transferring files to $clusterHost: $clusterStageDirectory',
        'verifyPerms~~verifying file permissions',
    );

    @activateSequence = (
        'activateTS~~$clusterHost: restarting traffic server with new configuration',
        'activateSADNS~~$clusterHost: restarting SADNS with new configuration'
    );

    @rollBackSequence = (
        'revertToExistingFiles~~$clusterHost: restarting using existing config files',
        'terminateSession~~$clusterHost: disconnecting from host'
    );

    @offLineSequence = (
        'logBogusFiles~~$clusterHost: encountered corrupted files moved to $errorLogDir',
        'transferFilesToWorking~~$clusterHost: transferring previous files to $clusterWorkingDirectory',
        'takeClusterOffLine~~ $clusterHost is offline.'    # < nsctl stop >
    );

    @transferFile = (
        'copyFile~~copying file from remote host'
    );

    ####################################
    #
    # MAIN
    #
    # "foreach cluster" "foreach host, beginning with TS, then SA"

    @cleanUpTaskSequence = ('terminateSession~~disconnecting from host');

    $executionStatus  = &executeTaskSequence($clusterHost, @taskSequence);
    $activationStatus = &executeTaskSequence($clusterHost, @activateSequence);
    $cleanUpStatus    = &executeTaskSequence($clusterHost, @cleanUpTaskSequence);
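Since executeTaskSequence itself isn't shown, here is only a guess at how entries of the form 'name~~description' could be run in order, stopping at the first failure. The sub name run_sequence, the handler table, the host name, and the return convention are all assumptions for illustration, not the code from the post.

```perl
use strict;
use warnings;

# Hypothetical task handlers; the real ones are not shown in the thread.
my %task = (
    establishSession => sub { 1 },
    local_checksum   => sub { 1 },
    verifyPerms      => sub { 0 },    # simulate a failure
);

# Run each 'name~~description' entry in order and stop at the first failure.
# Returns the name of the failed task, or an empty string if all succeeded.
sub run_sequence {
    my ( $host, @sequence ) = @_;
    for my $entry (@sequence) {
        my ( $name, $description ) = split /~~/, $entry, 2;

        # The entries are single-quoted, so $clusterHost is still literal text here.
        $description =~ s/\$clusterHost/$host/g;
        print "$host: $description\n";

        my $handler = $task{$name} or return $name;    # unknown task counts as failure
        $handler->($host) or return $name;
    }
    return '';
}

my $failed = run_sequence(
    'ts1.example',
    'establishSession~~connecting to $clusterHost',
    'local_checksum~~calculating local checksums',
    'verifyPerms~~verifying file permissions',
);
print $failed ? "stopped at $failed\n" : "all tasks succeeded\n";
```

Returning the failed task's name gives the caller enough to decide whether to run a rollback sequence for that host before moving on.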
Re: Re: Data Structure Design by tachyon (Chancellor) on Aug 17, 2001 at 08:44 UTC
Using your existing data structure (a hash of array refs) and assuming that an alphabetic key sort gives the desired activation order (OK for cluster1-9 but breaks at cluster 10), this is quick and dirty:

    my %config_files = (
        cluster1 => ['1h1', 'h2', 'h3', 'h4', 'h5', 'h6', '1h7'],
        cluster2 => ['2h1', 'h2', 'h3', 'h4', 'h5', 'h6', '2h7'],
        cluster3 => ['3h1', 'h2', 'h3', 'h4', 'h5', 'h6', '3h7'],
        cluster4 => ['4h1', 'h2', 'h3', 'h4', 'h5', 'h6', '4h7'],
        cluster5 => ['5h1', 'h2', 'h3', 'h4', 'h5', 'h6', '5h7'],
    );

    my @order = sort keys %config_files;
    print "Order will be @order\n";

    for my $cluster (@order) {
        print "Cluster is $cluster\n";
        my @machines = @{$config_files{$cluster}};
        my $last = pop @machines;
        for my $box (@machines) {
            &start_up($box) or &failed($cluster, $box);
        }
        &special($last);
    }
    print "Done\n";
    exit;

    sub start_up {
        my $box = shift;
        return 0 if $box eq 'h5';    # test failed routine
        # do stuff, return 1 for success 0 or undef for failure
        print "\tStarted $box\n";
        1;
    }

    sub special {
        # do box 7 stuff
        my $box = shift;
        print "\tSpecial $box\n";
        1;
    }

    sub failed {
        my ($cluster, $box) = @_;
        # do whatever you want like retry until success
        # then you return to loop and continue
        my $tries = 5;
        my $delay = 2;
        while ($tries) {
            $tries--;
            sleep $delay;
            print "Retrying $box in $cluster\n";
            return if &start_up($box);
        }
        die "Unable to start $box in $cluster\n";
    }

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
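For the cluster 10 caveat, one possible workaround is to sort on the numeric suffix instead of the whole key. This is a small sketch assuming every key looks like "cluster" followed by digits; the helper name cluster_number and the twelve empty clusters are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical keys; only the naming pattern "cluster<N>" comes from the post above.
my %config_files = map { ( "cluster$_" => [] ) } 1 .. 12;

# Extract the trailing digits from a key such as "cluster10".
# Keys without trailing digits sort first, as number 0.
sub cluster_number {
    my ($name) = @_;
    my ($n) = $name =~ /(\d+)\z/;
    return defined $n ? $n : 0;
}

# Compare numerically, so cluster10 comes after cluster9 rather than after cluster1.
my @order = sort { cluster_number($a) <=> cluster_number($b) } keys %config_files;

print "Order will be @order\n";
```

A plain `sort keys` would put cluster10, cluster11 and cluster12 between cluster1 and cluster2, which is exactly the breakage mentioned above.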
Re: Data Structure Design by MZSanford (Curate) on Aug 17, 2001 at 13:04 UTC
in the spirit of KISS, here is my easy solution:

    my %clusters = (
        cluster1 => ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'h7'],
        cluster2 => ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'h7'],
        # ...
    );

    foreach my $clust (sort keys %clusters) {
        foreach my $ind (0..5) {
            &process_1_6($clusters{$clust}[$ind]);
        }
        &process_7($clusters{$clust}[6]);
    }

always more than one way to skin an amoebae
-- MZS
Re: Data Structure Design by jwest (Friar) on Aug 17, 2001 at 17:50 UTC
Why not:

    my @machines = qw(c1h1 c1h2 ... c5h7);

    for my $machine (@machines) {
        if ($machine =~ /h7$/) {
            process_special($machine)
        }
        else {
            process_normal($machine)
        }
    }

This way, you know without even the slightest doubt precisely what order every machine will be processed in. And adding a new set of elements is no more (and arguably less) complicated than expanding the hash structure.

Hope this helps!

--jwest

-><- -><- -><- -><- -><-
All things are Perfect
To every last Flaw
And bound in accord
With Eris's Law
- HBT; The Book of Advice, 1:7
Here's what I did: by Tuna (Friar) on Aug 18, 2001 at 22:38 UTC
First, I can't say how amazing it is that all of you made time to answer my questions. Without you guys/gals, I probably wouldn't have a job right now. Here's how I solved it:

- Each config file has an accompanying ".hosts" file, whose contents are the hostnames of each machine in the cluster.
- Hosts 1-6 are contained in config_file.hosts
- Host 7 is contained in dnsfile_file.hosts

I have two hashes that I refer to in the code.

- %config_files - Keys = cluster directory, Value = an array ref to a list of each config file associated with that cluster.
- %config_params - Keys = config_file, Value = an array ref to a list containing config_file, config_file.hosts, user|preprocess.pl, user|postprocess.pl, and a file "weight"

    foreach $componentDir ( keys %config_files ) {
        $abs_cluster_dir    = join("/", "$configdir", "$componentDir");
        @cluster_files      = @{ $config_files{ $componentDir } };
        $href_cluster_files = \@cluster_files;
        &process_cluster( $abs_cluster_dir, \@cluster_files, \%config_params, \$err_msg );
    }

    sub process_cluster {
        # subroutine parameters
        my $cluster_dir        = $_[ 0 ];
        my $aref_cluster_files = $_[ 1 ];
        my $href_config_params = $_[ 2 ];
        my $sref_err_msg       = $_[ 3 ];

        my $clusterHost;
        my $config_file;
        my $counter;
        my $file;
        my @files_to_process;
        my $hosts_file;
        my @target_hosts;

        $counter = 0;

        # slot each known file into the array at the index given by its weight
        foreach $file ( @{ $aref_cluster_files } ) {
            if (defined ( $href_config_params->{ $file } )) {
                $files_to_process[ $href_config_params->{ $file }[ 3 ]] = $file;
            }
        }

        foreach $config_file ( @files_to_process ) {
            # use the directory passed in rather than the caller's global
            $hosts_file = new IO::File ( "$cluster_dir/$config_file.hosts" );
            print "Hosts file = $hosts_file\n";
            unless ( defined ($hosts_file )) {
                $$sref_err_msg = $!;
                return 0;
            }
            $counter = 0;
            while ( ! $hosts_file->eof() ) {
                $target_hosts[ $counter ] = ( $hosts_file->getline( ) );
                chomp @target_hosts;
                $counter++;
            }
            $hosts_file->close( );
            foreach $clusterHost ( @target_hosts ) {
                ####DO STUFF####
            }
        }
        return 1;
    }

The trick here is to assign a numerical weight to each file that I need to process, so that I can populate an array of config files in the order that I need to process them!!!

Again, thanks to you all, I began to consider alternatives.

Steve
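A variation on the weight trick, sketched under assumptions: instead of using the weight as an array index, sort the file names by weight. The file names are taken from the earlier post, but the weight values, the flat %weight hash, and the extra unknown.cfg entry are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical weights; the numbers are invented for this example.
my %weight = (
    'remap.config'     => 1,
    'origin.db'        => 2,
    'local_cluster.db' => 3,
);

# Files found in a cluster directory, in no particular order.
my @cluster_files = ( 'local_cluster.db', 'remap.config', 'unknown.cfg', 'origin.db' );

# Keep only files that have a weight, then order them by it.
my @files_to_process =
    sort { $weight{$a} <=> $weight{$b} }
    grep { exists $weight{$_} } @cluster_files;

print "$_\n" for @files_to_process;
```

Sorting means the weights don't have to be contiguous (no undefined holes in the array), and two files that share a weight are both kept rather than one overwriting the other.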
Re: Data Structure Design by Nitsuj (Hermit) on Aug 18, 2001 at 20:48 UTC
Make cluster a class with 7 elements (which could themselves be classes containing info like hostnames and so forth), or perhaps an array of the first 6 plus a marker for the 7th, or something similar. Then instantiate this class for each of the clusters, perhaps in an array of some sort. You could then just walk down this structure, executing your actions in a separate script that takes the parameters specific to each node.

Just Another Perl Backpacker
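One way the class idea could be sketched is below. The package name Cluster, its fields, the process method, and the callback interface are all hypothetical, and it assumes the last host in each list is the one needing special treatment.

```perl
use strict;
use warnings;

# Hypothetical class; the names Cluster, workers, special and process
# are illustrative and do not come from the thread.
package Cluster;

sub new {
    my ( $class, %args ) = @_;
    my @hosts   = @{ $args{hosts} };
    my $special = pop @hosts;    # assumption: the last host is the special one
    return bless { name => $args{name}, workers => \@hosts, special => $special }, $class;
}

# Call one callback per worker, in order, then the other for the special host.
sub process {
    my ( $self, $on_worker, $on_special ) = @_;
    $on_worker->($_) for @{ $self->{workers} };
    $on_special->( $self->{special} );
    return;
}

package main;

# Two made-up clusters of seven hosts each, kept in an array to preserve order.
my @clusters = map {
    Cluster->new( name => "cluster$_", hosts => [ map { "h$_" } 1 .. 7 ] )
} 1 .. 2;

my @seen;
for my $cluster (@clusters) {
    $cluster->process(
        sub { push @seen, "$cluster->{name} worker $_[0]" },
        sub { push @seen, "$cluster->{name} special $_[0]" },
    );
}
print "$_\n" for @seen;
```

The array of objects keeps the cluster order explicit, and the rule about which host is special lives in one place, the constructor.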

