Don't worry about a present lack of experience - only Larry Wall was born with knowledge of Perl. The rest of us are acolytes.
The error being reported means that some code in the Mechanize module is attempting to call a method named url on a variable with an undefined value. Since a bug in this code is more likely than a bug in WWW::Mechanize, that implies you are either passing it bad values or calling it incorrectly. My best guess is that the Excel file is misformatted, but replicating a parsing issue without the file in question is difficult. Try running the following code and see if the output gives you any indication of which lines in the file may be problematic.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use Win32::OLE qw(in with);
use Win32::OLE::Const 'Microsoft Excel';
$Win32::OLE::Warn = 3; # die on errors...
# get already active Excel application or open new
my $Excel = Win32::OLE->GetActiveObject('Excel.Application')
|| Win32::OLE->new('Excel.Application', 'Quit');
# open Excel file
my $Book = $Excel->Workbooks->Open("C:/Documents and Settings/rto5u/My Documents/CV.xls");
# select worksheet number 1 (you can also select a worksheet by name)
my $Sheet = $Book->Worksheets(1);
foreach my $row (2..4)
{
    foreach my $col (1..1)
    {
        # skip empty cells
        next unless defined $Sheet->Cells($row,$col)->{'Value'};
        my $URL = 'http://scholar.google.com/advanced_scholar_search';
        my $FORM_NAME = 'f';
        #print "Author Name: ";
        #chomp ($AUTHOR = <>);
        my $AUTHOR = "MD Li";
        print "Author Name: $AUTHOR\n";
        #print "Paper Title: ";
        #chomp ($TITLE = <>);
        my $TITLE = $Sheet->Cells($row,$col)->{'Value'};
        print "Paper Title: $TITLE\n";
        #print "$TITLE";
        #my $TITLE = "Region-specific transcriptional response to chronic nicotine in rat brain";
        my $mech = WWW::Mechanize->new(stack_depth=>10);
        $mech->get($URL) || die ("Could not connect to $URL.\n");
        my $res = $mech->submit_form(
            form_name => $FORM_NAME,
            fields => {
                'num' => 100,
                'as_epq' => $TITLE,
                'as_occt' => 'title',
                'as_sauthors' => $AUTHOR,
                'as_allsubj' => 'all',
            },
        );
        while ($res && $res->is_success()){
            my $content = $res->content;
            #print $content;
            while ($content =~ /<p class=g>(.*?)<\/font>\s\s\s/gs){
                my $section = $1;
                my $title = "";
                my $citedby = 0;
                # get title
                $title = getTitle($section);
                $title =~ s/<.*?>//g;
                $title =~ s/…/\.\.\./g;
                # get citedby #
                $citedby = getCitedBy($section);
                if ($citedby){
                    print "\"$title\"\nCited by: $citedby\n\n";
                }
            }
            $res = $mech->follow_link( text_regex => qr/Next/i);
        }
    }
}
$Book->Close;
#############################################################################
sub getTitle($){
    my ($section) = @_;
    my $title;
    if ($section =~ /<span class="w">.*?<a href.*?>(.*?)<\/a><\/span>/s){ # papers with a link
        $title = $1;
    }elsif ($section =~ / (.*?)<font size=-1>/s){ # papers w/o a link
        $title = $1;
    }else{
        $title = ''; # neither pattern matched, so $1 is stale or undefined here
    }
    return $title;
}
#----------------------------------------------------------------------------
sub getCitedBy($){
    my ($section) = @_;
    my $citedby;
    if ($section =~ />Cited by (\d+)</s){
        $citedby = $1;
    }
    return $citedby;
}
#----------------------------------------------------------------------------
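As for the error message itself, here is a minimal sketch of how this kind of failure arises in plain Perl, with no Mechanize involved. The sub find_next_link is a hypothetical stand-in for any lookup that can come back empty; it is not part of WWW::Mechanize, and I am not claiming this is the exact spot where your script fails.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a lookup that can fail, e.g. searching a page
# for a "Next" link when there is none. Not a WWW::Mechanize function.
sub find_next_link {
    my ($page) = @_;
    return;    # nothing found: undef in scalar context
}

my $link = find_next_link('last page of results');

# Calling a method on undef dies; eval captures the message in $@.
my $url = eval { $link->url };
my $err = $@;
print "Caught: $err" if $err;

# Checking for definedness first avoids the error entirely.
if (defined $link) {
    print "Next page: ", $link->url, "\n";
}
else {
    print "No next link; stopping.\n";
}
```

The message caught in $err reads Can't call method "url" on an undefined value, followed by the file and line number. If the output of the script above points to a particular title, checking what was undefined at that point is the next step.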
A couple notes on the code:
- The lines starting with #! are used to tell Unix-like systems how to interpret the file. They are only meaningful if they are on the first line of a file. The -w switch turns on warnings much like the warnings pragma does, except that -w applies globally while use warnings; is lexically scoped.
- Your subroutines use prototypes, i.e. the ($). A prototype is supposed to tell the Perl interpreter what the argument list looks like. Prototypes are generally avoided (see "subroutine prototypes still bad?"). If you are going to use them, the subroutines must be declared before they are called in code, e.g. at the top of the file; otherwise the prototype is not applied to those calls. This just involves a copy-paste for you.
- The foreach indices on $row and $col (rows 2 through 4, column 1 only) may not correspond to the areas of the file you intend to loop over.
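To make the prototype point concrete, here is a small sketch with the sub declared before its first call. The name get_cited_by and the sample strings are made up for illustration; only the regex is borrowed from getCitedBy above.

```perl
use strict;
use warnings;

# Declared before its first call, so the ($) prototype is applied.
sub get_cited_by($) {
    my ($section) = @_;
    return $section =~ />Cited by (\d+)</ ? $1 : undef;
}

# Hypothetical snippets of result markup, for illustration only.
my @sections = (
    '<a href="#">Cited by 12</a>',
    '<a href="#">Related articles</a>',
);

for my $section (@sections) {
    my $count = get_cited_by($section);
    print defined($count) ? "Cited by: $count\n" : "No citation count\n";
}

# The ($) prototype forces scalar context on the argument, so passing the
# array hands the sub the element count (2) rather than the first element.
my $surprise = get_cited_by(@sections);
print "Array argument matched\n" if defined $surprise;
```

If the sub definition is moved below the calls, perl (with warnings on) reports that main::get_cited_by() was called too early to check its prototype, and the last call then receives the whole list instead of the count. That kind of silent difference is a large part of why prototypes are usually left off.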
If the above does not elucidate your issue, I'll need to see the Excel file in order to debug further.