http://qs321.pair.com?node_id=382174

For Lady_Aleena: Working example based on my understanding of what you were trying to do. Adding tied hashes, opening files, etc is left to you. Movies can be done the same was as for tv. The sub could be tightened up but I don't see a need for it. Hope this helps.

use strict; use warnings; use Data::Dumper; my @television_headings = qw{ title alt_title start_year end_year wikipedia allmovie imdb tvcom genre }; my @csv_television; # creating dummy data for the example foreach my $n ( 1 .. 3 ) { push( @csv_television, join( '|', map { $_ . $n } @television_head +ings ) ); } my $tv_hash = {}; load_h( $tv_hash, \@television_headings, \@csv_television ); print Dumper $tv_hash; sub load_h { my ( $h, $headings, $csv_data ) = @_; foreach my $line ( @$csv_data ) { my @vals = split( /\Q|\E/, $line ); # might want a real CSV pa +rser instead my %data; @data{ @$headings } = @vals; # hash slice $h->{ $vals[0] } = \%data; } }

The latest example I found of Really Ugly Code:

open(SHELL,">shell.sh"); print SHELL "cd $dir","\n"; print SHELL "cp $file $configParams{BASEDIR}"; system ("bash shell.sh") ; close(SHELL);


Quick links:

PerlMonks::Mechanized - revision ideas for davido
PerlMonks::Mechanized - comments to davido
PerlMonks::Mechanized test script
PerlMonks::Mechanized test script for retrieving all private msgs - for atcroft
PerlMonks::Mechanized::private_message
PerlMonks::Mechanized::Data - proposed
Sudoku solver


Revision of PerlMonks::Mechanized

Just a bunch of ideas rolling around my head - please let me know what you think. Thanks!

# my original idea was to have a getter/setter for each "extended info +" item: sub pass_ticker_id{ my( $self, $value ) = @_; if( defined $value ){ $self->{pass_ticker_id} = $value ? 1 : 0; } return $self->{pass_ticker_id}; } # we could lump them together: sub pass_extended_info{ my( $self, $value ) = @_; if( defined $value ){ $self->{pass_ticker_id} = $value ? 1 : 0; $self->{kitchen_sink} = $value ? 1 : 0; } return $value; } # the ticker methods could then be something like: sub threaded_ids { my( $self, $base ) = @_; $self->{agent}->get( $self->{site} . "?node_id=180684;id=$base" ); $self->{agent}->success() or die "Unable to fetch thread ticker for id = $base.\n"; my $struct = XMLin( $self->{agent}->content(), ForceArray => 1, KeepRoot => 1 ); my $ref = $self->pass_extended_info ? { 'ticker_id' => 180684, 'data' => $struct } : $struct; return $ref; } # if you wanted to lump the extended info into a single hash: my $ref = $self->pass_extended_info ? { 'info' => { 'ticker_id' => 180684, 'kitchen_sink' => $self->{kitchen_sink} }, 'data' => $struct } : $struct; # if the extended_info part was sub-ified: sub threaded_ids { my( $self, $base ) = @_; $self->{agent}->get( $self->{site} . "?node_id=180684;id=$base" ); $self->{agent}->success() or die "Unable to fetch thread ticker for id = $base.\n"; my $struct = XMLin( $self->{agent}->content(), ForceArray => 1, KeepRoot => 1 ); return $self->_format_struct( '180684', $struct ); } sub _format_struct{ my( $self, $tickerid, $struct ) = @_; my $ref = $self->pass_extended_info ? { 'ticker_id' => $tickerid, 'data' => $struct } : $struct; return $ref; } # or, to really make a mess of things: sub threaded_ids { my( $self, $base ) = @_; my $url = $self->{site} . "?node_id=180684;id=$base"; $self->_get_ticker( $url ) or return; return $self->_parse_struct( 180684 ); } sub _get_ticker{ my( $self, $url ) = @_; $self->{agent}->get( $url ); $self->{agent}->success() or do{ warn "Unable to fetch ticker for $url\n"; return; }; } sub _parse_struct{ my( $self, $tickerid ) = @_; my $struct = XMLin( $self->{agent}->content(), ForceArray => 1, KeepRoot => 1 ); return $self->_format_struct( $tickerid, $struct ); }


PerlMonks::Mechanized (PerlMonks::Mechanized (beta))

davido:

I added this method to PM::Mech to obtain a list of all nodes created by a given user. It is based on the user_stats method.

# Grab info (title, date, reputation, etc) for a user's nodes. # Uses the user node info XML Ticker (id://32704). If the # "reputation" field is not required, foruser=userNameOrID can # be used, which avoids requiring a login. Otherwise returns # information about nodes by the logged-in user (or by Anonymous # Monk if no login). sub user_nodes { my( $self, %params ) = @_; my $parameters = ''; foreach( ( 'for_user', 'for_userid' ) ) { $parameters .= exists( $params{$_} ) ? ";$_=$params{$_}" : ''; } $self->{agent}->get( $self->{site} . "?node_id=32704;" . $parameters . $self->_login_URI() ); $self->{agent}->success() or die "Unable to fetch user nodes ticker.\n"; return XMLin( $self->{agent}->content(), ForceArray => 1 ); }
In addition, since the methods are so similar to each other I started pulling out common elements. I ended up with something that is probably much less readable, but I'd still like your thoughts on it as it gives me a better feel for what is Good Design. (I don't think the code below is any easier to read than the original, so I question if it was the right move.) Using the new method from above as an example (untested):
sub user_nodes { my( $self, %params ) = @_; my @allowed_params = ( 'for_user', 'for_userid' ); my $url = join( '', '?node_id=32704;', get_param_list( \@allowed_params, %params ), $self->_login_URI() ); get_ticker( $self, $url, 'user nodes' ); return XMLin( $self->{agent}->content(), ForceArray => 1 ); } sub get_param_list { my ( $ref2allowed, %params ) = @_; my $parameters = ''; foreach( @$ref2allowed ) { $parameters .= exists( $params{$_} ) ? ";$_=$params{$_}" : ''; } return $parameters; } sub get_ticker { my ( $self, $url, $tickertype ) = @_; $self->{agent}->get( $self->{site} . $url ); $self->{agent}->success() or die "Unable to fetch $tickertype ticker.\n"; }

I'm also kicking around the idea of adding PerlMonks::Mechanized::Data (or something similar). The idea would be to provide an OO interface to the data elements returned by Mech so the user wouldn't have to know anything about the data structure that is returned, and to insulate Mech users from future changes to the XML generators (either in format or content).

I thought one WTDI would be to change Mech to return ::Data objects, then call a parse method on them (which could be in the Data parent class and, at least initially, the parser could just pass the data to XML::Simple). Each ticker could have subclassed methods, if necessary, but some of them (e.g., those that deal with node writeups) could utilize the same methods. In the case of node writeups, a class object/datastructure could be used to store the data for future use. For a given node data object, accessor methods could include 'title', 'rep', 'author_id', 'author_name', 'date_created', etc. I'm very new to OO, so this design is probably not optimal.

Finally, thanks for writing PM::Mech. I tried it out (after upgrading WWW::Mech and friends) and it works great!


PerlMonks::Mechanized test script

use strict; use warnings; use Data::Dumper; use PerlMonks::Mechanized; # pass username and password to 'new' my $pm_obj = PerlMonks::Mechanized->new( 'xxxx', 'xxxx' ); my $root_node_id = 466016; # node used for example my @node_ids = ( $root_node_id, $root_node_id + 1 ); #*********************************************************** print "PM object:\n"; print Dumper( $pm_obj ); { print "user_stats:\n"; my $data = $pm_obj->user_stats( showall => 1 ); print Dumper( $data ); } print "PM object:\n"; print Dumper( $pm_obj ); foreach my $method qw( threaded_ids thread_list node_info node_content ) { print "$method:\n"; my $data = $pm_obj->$method( $root_node_id ); print Dumper( $data ); } foreach my $method qw( node_info node_titles ) { print "$method:\n"; my $data = $pm_obj->$method( @node_ids ); print Dumper( $data ); }

PerlMonks::Mechanized test script for retrieving all private msgs

atcroft:

use strict; use warnings; use PerlMonks::Mechanized; my $timeformat = '%Y-%m-%d %H:%M:%S'; # format for strftime # should put username and password into ENV vars instead my $pm_obj = PerlMonks::Mechanized->new( 'username', 'passwd' ); #*********************************************************** my @msgs; my $since_id = 0; # get all messages my $delay = 10; # num secs between page requests (init value) my $max_recs = 100; # max # records to return at a time (init value) while( 1 ) { my ( $data, $info ) = $pm_obj->private_message( archived => 'both', xmlstyle => 'clean', since_id => $since_id, max_recs => $max_recs, min_poll_seconds => $delay ); last if( not defined $data ); # update the max_recs and min_poll_seconds params # based on the values in INFO $delay = $info->[0]->{min_poll_seconds} || $delay; $max_recs = $info->[0]->{max_recs} || $max_recs; # save the msgs for processing (could just print now instead) push( @msgs, @{ $data } ); # msgs are returned in ascending id order, so the # last msg is the most recent $since_id = $data->[-1]->{message_id}; print 'retrieved ', scalar @{ $data }, ' messages'; if( scalar @{ $data } <= $max_recs ) { print "\n"; last; } print ", sleeping $delay secs\n"; sleep( $delay ); } print "\nprivate messages:\n"; foreach my $msg ( @msgs ) { my $datetime = format_datetime_string( $msg->{time}, $timeformat ); print "$datetime - $msg->{author}: $msg->{content}\n\n"; } #*********************************************************** sub format_datetime_string { my ( $string, $format ) = @_; # $string is of the format: YYYYMMDDhhmmss # YYYY = 4 digit year # MM = month, 1-12 # DD = day, 1-31 # hh = hour (24 hr scale, EST) # mm = min # ss = sec my $year = substr( $string, 0, 4 ); my $month = substr( $string, 4, 2 ); my $day = substr( $string, 6, 2 ); my $hour = substr( $string, 8, 2 ); my $min = substr( $string, 10, 2 ); my $sec = substr( $string, 12, 2 ); # strftime expects $month to be 0..11 and # $year to be num yrs since 1900 return POSIX::strftime( $format, $sec, $min, $hour, $day, $month-1, $year-1900 ); }

...and the updated code for PerlMonks::Mechanized::private_message

sub private_message { my( $self, %params ) = @_; my $parameters = ''; foreach( ( 'max_recs', 'since_id', 'prior_to', 'archived' ) ) { $parameters .= exists( $params{$_} ) ? ";$_=$params{$_}" : ''; } $self->{agent}->get( $self->{site} . "?node_id=15848;xmlstyle=clean" . $parameters . $self->_login_URI() ); $self->{agent}->success() or die "Unable to fetch Private Message ticker.\n"; # Return values modified by bobf 6-20-05. # Was just the {message} block, now is ( {message}, {info} ). # This allows the caller to obtain params from the INFO section, # including the max_recs and min_poll_seconds params set by PM. # This change is not necessary if the while() loop to get all # msgs is included in this method rather than in the caller. my $data = XMLin( $self->{agent}->content(), ForceArray => 1 ); return( $data->{message}, $data->{INFO} ); }

PerlMonks::Mechanized::Data - ideas and rough code

davido:

The following module was written as a first bash at standardizing the data returned by the PM::Mech methods. It is still very rough, poorly documented, and minimally tested, but it serves as an example for what I was starting to think about. First, some comments:

I envision being able to do something like this:

my $pm_obj = PerlMonks::Mechanized->new(); my $node_data = $pm_obj->node_info( @node_ids ); my $title = $node_data->title( $node_ids[3] );

With PM::Mech::Data, we can get this far:

my $pm_obj = PerlMonks::Mechanized->new(); my $node_data = $pm_obj->node_info( @node_ids ); add_node_data( $node_data, \%alldata ); my $title = $alldata{ $node_ids[3] }{title};

Example for PerlMonks::Mechanized::Data

my %nodedata; my $pm_obj = PerlMonks::Mechanized->new(); my $data = $pm_obj->node_info( @node_ids ); add_node_data( $data, \%nodedata );

And now for the code:

package PerlMonks::Mechanized::Data; # PerlMonks::Mechanized::Data standardizes the data structure returned # by the PM::Mech user_nodes, node_info, node_content, and node_titles # methods. The add_node_data routine takes as input the ref returned # from those methods (output from XML::Simple) and a ref # to a master hash. # The structure of the master hash is shown below. #$alldata_ref: # $node_id => # { # 'node_id' => '466017', # 'root_node' => '466016', # 'parent_node' => '466016', # # 'author_user' => '333489', # 'author_name' => 'muba', # # 'title' => 'Re: regex for word puzzle', # 'content' => (node text), # 'reputation' => '17', # 'nodetype' => 'note', # # 'createtime' => '20050612204931', # 'created' => '2005-06-12 20:49:31', # 'lastupdate' => '', # 'lastedit' => '20050407145724' # } use strict; use warnings; use Carp qw( carp ); use Data::Dumper; use Exporter; our @ISA = ("Exporter"); #our @EXPORT = (); our @EXPORT_OK = qw( add_node_data ); our $VERSION = 0.01; #********************************************************************* my %integrate = ( user_nodes => \&_add_user_nodes, node_info => \&_add_node_info, node_content => \&_add_node_content, node_titles => \&_add_node_titles ); sub add_node_data { my ( $newdata_ref, $alldata_ref ) = @_; # $newdata_ref = ref to the output from the PM::Mech methods # $alldata_ref = ref to the master hash containing all node data my $datatype = _determine_data_type( $newdata_ref ); if( not defined $datatype ) { return 1; } $integrate{$datatype}->( $newdata_ref, $alldata_ref ); return 0; } #********************************************************************* sub _determine_data_type { my ( $newdata_ref ) = @_; if( ref( $newdata_ref ) eq 'HASH' && exists $newdata_ref->{INFO} && exists $newdata_ref->{NODE} ) { return 'user_nodes'; } elsif( ref( $newdata_ref ) eq 'HASH' && exists $newdata_ref->{title} && exists $newdata_ref->{author} ) { return 'node_content'; } elsif( ref( $newdata_ref ) eq 'ARRAY' && ref( $newdata_ref->[0] ) eq 'HASH' ) { return 'node_info'; } elsif( ref( $newdata_ref ) eq 'ARRAY' && ref( $newdata_ref->[0] ) eq 'ARRAY' ) { return 'node_titles'; } else { Carp::carp "\nUnrecognized data type"; print "\n"; return undef; } } sub _add_node_info { my ( $newdata_ref, $alldata_ref ) = @_; foreach my $src_ref ( @{ $newdata_ref } ) { my $node_id = $src_ref->{node_id}; my $dest_ref = \%{ $alldata_ref->{$node_id} }; my %data = ( lastupdate => $src_ref->{lastupdate}, nodetype => $src_ref->{nodetype}, root_node => $src_ref->{root_node}, title => $src_ref->{content}, createtime => $src_ref->{createtime}, node_id => $src_ref->{node_id}, author_user => $src_ref->{author_user}, author_name => $src_ref->{author_name}, parent_node => $src_ref->{parent_node} ); foreach my $key qw( root_node parent_node ) { if( not defined $data{$key} ) { # The root and parent nodes are undef if this node is # not a reply, so skip them. # We could set root and parent = $node_id in %data if # undef, instead (but it may not be expected behavior) delete $data{$key}; } } foreach my $key ( keys %data ) { if( exists $dest_ref->{$key} && $data{$key} ne $dest_ref->{$key} ) { _print_warning( $node_id, $key, $dest_ref->{$key}, $data{$key} ); } $dest_ref->{$key} = $data{$key}; } } } sub _add_node_titles { my ( $newdata_ref, $alldata_ref ) = @_; foreach my $src_ref ( @{ $newdata_ref } ) { my $node_id = $src_ref->[0]; my $title = $src_ref->[1]; my $dest_ref = \%{ $alldata_ref->{$node_id} }; if( exists $dest_ref->{title} && $title ne $dest_ref->{title} ) { _print_warning( $node_id, 'title', $dest_ref->{title}, $title ); } $dest_ref->{title} = $title; } } sub _add_node_content { my ( $newdata_ref, $alldata_ref ) = @_; # is the data in 'updated' in the same format as for 'lastupdate'? # retain the 'created' key (in a diff format than 'createtime') my $node_id = $newdata_ref->{id}; my %data = ( title => $newdata_ref->{title}, lastupdate => $newdata_ref->{updated}, created => $newdata_ref->{created}, content => $newdata_ref->{doctext}{content}, nodetype => $newdata_ref->{type}{content}, author_name => $newdata_ref->{author}{content}, author_user => $newdata_ref->{author}{id} ); my $dest_ref = \%{ $alldata_ref->{$node_id} }; foreach my $key ( keys %data ) { if( exists $dest_ref->{$key} && $data{$key} ne $dest_ref->{$key} ) { _print_warning( $node_id, $key, $dest_ref->{$key}, $data{$key} ); } $dest_ref->{$key} = $data{$key}; } } sub _add_user_nodes { my ( $newdata_ref, $alldata_ref ) = @_; my $author_name = $newdata_ref->{INFO}->[0]->{foruser}; while( my( $node_id, $noderef ) = each %{ $newdata_ref->{NODE} } ) { my %data = ( reputation => $noderef->{reputation}, created => $noderef->{createtime}, title => $noderef->{content}, lastupdate => $noderef->{lastupdate}, lastedit => $noderef->{lastedit} ); my $dest_ref = \%{ $alldata_ref->{$node_id} }; foreach my $key ( keys %data ) { if( exists $dest_ref->{$key} && $data{$key} ne $dest_ref->{$key} ) { _print_warning( $node_id, $key, $dest_ref->{$key}, $data{$key} ); } $dest_ref->{$key} = $data{$key}; } } } sub _print_warning { my ( $node_id, $key, $dest_val, $src_val ) = @_; print "Warning - data discrepancy for node ID $node_id:\n"; print " current $key = $dest_val\n"; print " new $key = $src_val\n"; print " The new data will replace the current data\n"; }

Sudoku solver (my own feeble attempt)

# Each row must contain the digits 1 through 9 in any order. # Each column must contain the digits 1 through 9 in any order. # The 9x9 grid holds nine 3x3 grids. Each of those 3x3 grids must # contain the digits 1 through 9 in any order. use strict; use warnings; use Data::Dumper; use POSIX qw( ceil ); use List::Compare; # $solution[$row][$col], 0 = unknown # a dummy row and col will be added to @solution to allow indeces 1..9 my @solution = ( [ qw( 8 5 0 1 0 2 7 0 3 ) ], [ qw( 3 0 0 0 4 0 0 0 0 ) ], [ qw( 0 0 4 7 3 0 0 0 0 ) ], [ qw( 4 0 0 0 0 0 8 5 0 ) ], [ qw( 0 2 0 0 0 0 0 1 0 ) ], [ qw( 0 1 5 0 0 0 0 0 4 ) ], [ qw( 0 0 0 0 1 7 4 0 5 ) ], [ qw( 0 0 0 0 2 0 0 0 1 ) ], [ qw( 7 0 0 9 0 5 0 2 6 ) ] ); unshift( @solution, [] ); foreach my $row ( 1..9 ) { unshift( @{ $solution[$row] }, 0 ); } # the 3x3 grids are arranged into a 9x9 cell table as follows # 1 2 3 # 4 5 6 # 7 8 9 # the 9x9 cell table has rows 1..9 and cols 1..9 # translate row and col indeces (concatenated) of the upper left cell +in each 3x3 grid into an index my %gridnum = ( 11 => 1, 12 => 2, 13 => 3, 21 => 4, 22 => 5, 23 => 6, 31 => 7, 32 => 8, 33 => 9 ); # $grid[1..9] = [ options left for this 3x3 grid ] my @grids; initialize_grids(); # $rows[1..9] = [ options left for this row ] # $cols[1..9] = [ options left for this col ] my @rows; my @cols; initialize_rows(); initialize_cols(); my $unsolved = ( 9 * 9 ) - num_hints(); while( $unsolved ) { foreach my $row ( 1..9 ) { foreach my $col ( 1..9 ) { next if $solution[$row][$col]; # find intersection of @row and @col for this cell, and in +tersection with @grid = options left for this 3x3 grid # if only 1 left, assign to @solution and subtract from @r +ow and @col and @grid, $unsolved-- my $gridkey = join( '', POSIX::ceil( $row/3 ), POSIX::ceil +( $col/3 ) ); my $gridnum = $gridnum{$gridkey}; my $lc_obj = List::Compare->new( $rows[$row], $cols[$col], + $grids[$gridnum] ); my @options = $lc_obj->get_intersection(); if( scalar @options == 1 ) { $solution[$row][$col] = $options[0]; foreach my $a_ref ( $rows[$row], $cols[$col], $grids[$ +gridnum] ) { @{ $a_ref } = grep{ $_ != $options[0] } @{ $a_ref +}; } $unsolved--; } } } } foreach my $row ( 1..9 ) { print join( ' ', @{ $solution[$row] }[1..9] ), "\n"; } sub initialize_grids { # determine what numbers are available as options for each 3x3 gri +d # initialize each 3x3 grid foreach my $gridnum ( 1..9 ) { @{ $grids[$gridnum] } = ( 1..9 ); } # filter out hints already in @solution foreach my $row ( 1..9 ) { foreach my $col ( 1..9 ) { my $gridkey = join( '', POSIX::ceil( $row/3 ), POSIX::ceil +( $col/3 ) ); my $gridnum = $gridnum{$gridkey}; @{ $grids[$gridnum] } = grep{ $_ != $solution[$row][$col] +} @{ $grids[$gridnum] }; } } } sub initialize_rows { # determine what numbers are available as options for each row foreach my $row ( 1..9 ) { @{ $rows[$row] } = ( 1..9 ); foreach my $col ( 1..9 ) { @{ $rows[$row] } = grep{ $_ != $solution[$row][$col] } @{ +$rows[$row] }; } } } sub initialize_cols { # determine what numbers are available as options for each col foreach my $col ( 1..9 ) { @{ $cols[$col] } = ( 1..9 ); foreach my $row ( 1..9 ) { @{ $cols[$col] } = grep{ $_ != $solution[$row][$col] } @{ +$cols[$col] }; } } } sub num_hints { my $hints = 0; foreach my $row( 1..9 ) { $hints += scalar grep{ $_ != 0 } @{ $solution[$row] }; } return $hints; }