PerlMonks  

Scraping Ebay

by lv211 (Beadle)
on Aug 31, 2006 at 05:34 UTC ( [id://570524]=perlquestion )

lv211 has asked for the wisdom of the Perl Monks concerning the following question:

I got this script off the internet for scraping ebay and sending the results to an email addy. Every time I run the script I get the error "use of uninitialized hash value".

This is in reference to the %data hash. I've tried playing around with it, but I can't seem to figure out what's wrong.

Here's the code:

use strict;
use LWP 5.64;
use URI;
use HTML::LinkExtor;
use HTML::HeadParser;
use Net::SMTP;
use MIME::Lite;

# Set to your country e.g. ebay.com.au
my $country = ".com";
my $base = "http://search.ebay".$country."/ws/search/SaleSearch";
# Title to search for
my $title = "learning perl";
# Category to search, get from http://listings.ebay.co.uk
my $cat = "267"; #Books
#your email address
my $email = qw/emailaddy/;
#your mail server
my $mailsrv = qw/mailserver/;
# File to keep item numbers already seen
my $localfile = "listing.txt";
# declare some vars
my ($a,$b,$line, $itemnumber,@title,$results,%data,%olditems,$key);
#Set hash to nothing
%data = ();
my $browser = LWP::UserAgent->new;
# Uncomment if you need to use a proxy - replace with real address and port
#$browser->proxy(['http', 'ftp'], 'http://10.111.10.11:8080/');
my $url =URI->new($base);
$url->query_form(
    'satitle'=> $title,
    'sacat'=> $cat
);
# set up the link handler sub
my $link_extor = HTML::LinkExtor->new(\&handle_links);
#get search results
my $response = $browser->get($url);
#get the links
$link_extor->parse($response->content);
#get items already seen in hash %olditems
%olditems=();
if (-s $localfile) {
    open (INFILE,"$localfile");
    while (<INFILE> ) {
        chomp;
        next if $_ eq "";
        $olditems{$_}=1;
    }
    close (INFILE);
}
# delete items from %data hash already seen
foreach $key (keys %olditems) {
    if (exists($data{$key})) {
        delete $data{$key};
    }
}
# *** save any remaining new entries to file ***
open (OUTFILE,">>$localfile");
my $mailbody="";
foreach $itemnumber (keys %data) {
    my $line=&get_title($data{$itemnumber});
    print OUTFILE $itemnumber."\n";
    #print "Line=".$line."\n";
    $mailbody=$mailbody.$line;
}
close (OUTFILE);
#send mail
my $msg = MIME::Lite->new (
    To => $email,
    From => $email,
    Subject =>"Ebay Search for [".$title."]",
    Type =>'multipart/related'
);
$msg->attach(Type => 'text/html',
    Data => qq{ $mailbody }
);
MIME::Lite->send('smtp', $mailsrv, Timeout=>60);
$msg->send if $mailbody ne "";
######################################
sub handle_links {
    my ($tag, %links)=@_;
    my $key;
    if ($tag eq 'a') {
        foreach $key (keys %links) {
            #search for links with Viewitem
            if ($key eq 'href') {
                if ( $links{$key} =~ m/ViewItem/) {
                    #get the item number from the link
                    $links{$key} =~ m/item=(\d+)/;
                    $data{$1}=$links{$key};
                }
            }
        }
    }
}
sub get_title($) {
    my ($page)=@_;
    my $itempage = LWP::UserAgent->new;
    my $item_contents=$itempage->get($page);
    my $p = HTML::HeadParser->new;
    $p->parse($item_contents->content);
    my $link="<p><a href=\"$page\">".$p->header('Title')."</p>";
    return $link;
}

Replies are listed 'Best First'.
Re: Scraping Ebay
by imp (Priest) on Aug 31, 2006 at 06:01 UTC
    In the future please only post the relevant code. Trim your examples down to the minimal case that still demonstrates the problem. See How (not) to ask a question - Only Post Relevant Code

    Here is a somewhat trimmed down version:

    use warnings;
    use LWP 5.64;
    use URI;
    use HTML::LinkExtor;

    my $base = "http://search.ebay.com/ws/search/SaleSearch";
    my $browser = LWP::UserAgent->new;
    my $url = URI->new($base);
    $url->query_form(
        'satitle'=> 'learning perl',
        'sacat'=> 267,
    );
    my %data = ();
    # set up the link handler sub
    my $link_extor = HTML::LinkExtor->new(\&handle_links);
    #get search results
    my $response = $browser->get($url);
    #get the links
    $link_extor->parse($response->content);
    exit;
    ######################################
    sub handle_links {
        my ($tag, %links)=@_;
        if ($tag eq 'a') {
            foreach my $key (keys %links) {
                #search for links with Viewitem
                if ($key eq 'href') {
                    if ( $links{$key} =~ m/ViewItem/) {
                        #get the item number from the link
                        $links{$key} =~ m/item=(\d+)/;
                        printf "links{%s} = %s\n",$key,defined $links{$key} ? $links{$key} : 'undef';
                        if (!defined $1) {
                            die "\$1 is undefined.";
                        }
                        $data{$1}=$links{$key};
                    }
                }
            }
        }
    }
    Output:
    links{href} = http://cgi.ebay.com/Learning-Perl-by-Randal-L-Schwartz-1997_W0QQitemZ220021417901QQihZ012QQcategoryZ2228QQssPageNameZWDVWQQrdZ1QQcmdZViewItem
    $1 is undefined. at test.pl line 43.
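    The warning is printed because, while $links{$key} =~ m/ViewItem/ matches, $links{$key} =~ m/item=(\d+)/ does not: the link in the output contains itemZ220021417901, not item=. The failed match leaves $1 undefined, so $data{$1} uses an undefined hash key.

    To avoid warnings like this, always check that the regex was successful before using $1.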
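
    A minimal sketch of that check, not a drop-in fix for the script. The helper name extract_item_number is hypothetical, and the pattern accepting both "item=" and "itemZ" is an assumption based only on the single link shown in the output above. The sample URLs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the item number from a link,
# or undef when the link does not contain one.
sub extract_item_number {
    my ($href) = @_;
    # "item=" is what the original script expects; "itemZ" is the form seen
    # in the sample output. Both are assumptions about eBay's URL layout.
    if ($href =~ m/item(?:=|Z)(\d+)/) {
        return $1;    # only read $1 after a successful match
    }
    return undef;
}

my %data;
for my $href (
    'http://cgi.ebay.com/Learning-Perl_W0QQitemZ220021417901QQcmdZViewItem',
    'http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=123456',
    'http://cgi.ebay.com/help/ViewItem-faq.html',    # no item number
) {
    my $item = extract_item_number($href);
    if (defined $item) {
        $data{$item} = $href;
    }
    else {
        warn "no item number in $href\n";
    }
}

print "$_ => $data{$_}\n" for sort keys %data;
```

    Links without an item number are skipped with a message, so the hash never receives an undefined key.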

Re^2: Scraping Ebay
by merlyn (Sage) on Aug 31, 2006 at 11:49 UTC
    Not commenting on the rest of your code, my eyes were particularly drawn to this line:
    my ($a,$b,$line, $itemnumber,@title,$results,%data,%olditems,$key);
    First, I've already spoken out about my distaste for "my my my" as a warning signal (although super search is not letting me figure out how to search for it at the moment).

    Second, saying "my $a/$b" will break sort blocks, so in general, you should stay away from those.
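
    A small sketch of the $a/$b problem, offered as an illustration under assumptions rather than a definitive test. The array @prices and its values are made up. The exact failure depends on the Perl version: as far as I know, some versions refuse to compile the sort block and others only warn and compare the lexicals, so the second half runs inside a string eval and reports whichever happens.

```perl
use strict;
use warnings;

my @prices = (30, 4, 15);    # made-up sample data

# Safe: $a and $b are the package variables that sort sets for each comparison.
my @sorted = sort { $a <=> $b } @prices;
print "sorted: @sorted\n";

# Risky: the same sort block compiled where lexical $a and $b are in scope.
# The block now refers to the lexicals, which sort never sets.
my $result = eval q{
    no warnings;
    my ($a, $b);
    [ sort { $a <=> $b } @prices ];
};

if (defined $result) {
    print "with lexical \$a/\$b (order not trustworthy): @$result\n";
}
else {
    print "sort block did not compile: $@";
}
```

    Renaming the lexicals (for example to $first and $second) avoids the clash entirely.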

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Scraping Ebay
by zshzn (Hermit) on Aug 31, 2006 at 05:53 UTC
    Please be more descriptive in your issues, including error lines. As well, look on the site for formatting guides. That issue is because you haven't initialized a hash (who would have thought?).
Re: Scraping Ebay
by Asim (Hermit) on Sep 01, 2006 at 16:12 UTC

    Not to be a party-pooper, yet it seems what you're looking to do is already a part of any number of eBay-API-aware modules. A quick search on CPAN pulls up a few, and you might want to look to them for a more stable interface; eBay won't change APIs too often, but can and will change the web pages without a moment's notice.

    If you're new to Perl (I notice you mentioned getting this script from elsewhere), it might be a bit daunting, so forgive me if it's so.

    ----Asim, known to some as Woodrow.

Re: Scraping Ebay
by lv211 (Beadle) on Sep 01, 2006 at 01:56 UTC
    Thanks for your input guys.

    I'll try to work through this code and hopefully get the effect I want.

    I will also clean up my code a little bit more next time I post. ;)
