PerlMonks  

Outputting search results to .txt file

by mkwilson (Initiate)
on Apr 05, 2012 at 06:09 UTC ( [id://963593]=perlquestion )

mkwilson has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am completely new to Perl and a little overwhelmed by all the documentation out there. My task: I need to search for specific parameters on a resume search site and save the resume results in a txt file. I can get the search to work correctly, but I do not know how open, print, and close on a filehandle work with my while loop. I would really appreciate any assistance you can offer.

Code is below.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use LWP::UserAgent;

my $agent = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (X11; Linux x86_64 AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21');
my $url = 'http://www.beyond.com/resumes/resume-search.asp';
my $response = $agent->post( $url, {
    'FEducationLevel'     => '2110',
    'FExperienceLevelMin' => '2125',
    'FExperienceLevelMax' => '2127',
    'FSinceDate'          => '90',
    'FKeywords'           => 'administrative assistant',
    'FCountryState'       => 'US~GA',
    'Fmetros'             => '12060'} );
my $html = $response->decoded_content;

while ( $html =~ /"resumeDetailLink" href="([^"]+)"/sg ) {
    my $link_url = 'http://www.beyond.com' . $1;
    open FH,">>results.txt";
    print $link_url . "\n";
    close FH, "results.txt";
}

Replies are listed 'Best First'.
Re: Outputting search results to .txt file
by stevieb (Canon) on Apr 05, 2012 at 06:30 UTC

    Your file isn't changing for each pass of the while loop, so you can open it before the loop and close it after.

    # note the three-arg use of open
    open my $fh, '+>', 'results.txt'
        or die "Can't open results.txt: $!";

    while ( $html =~ /"resumeDetailLink" href="([^"]+)"/sg ){
        my $link_url = 'http://www.beyond.com' . $1;
        print $fh $link_url . "\n";
    }

    close $fh;

    UPDATE: I forgot to point out that the reason it didn't work originally was that you forgot to pass the file handle as a parameter to print(). Had you done that, it would have worked. However, because you would have been opening and closing the file on each iteration of the while loop, it would have been very inefficient.
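To make the two points above concrete (open once outside the loop, and give print the filehandle), here is a minimal sketch that runs without touching the network. The helper name `write_links` and the `$sample` markup are assumptions made for illustration; the real page's markup may differ. It uses '>>' (append) as in the original post rather than '+>'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull result links out of $html and append them to $path.
sub write_links {
    my ($html, $path) = @_;

    # Open once, before the loop. '>>' appends to the file.
    open my $fh, '>>', $path or die "Can't open $path: $!";

    my $count = 0;
    while ( $html =~ /"resumeDetailLink" href="([^"]+)"/sg ) {
        # The filehandle goes first, with no comma after it.
        print {$fh} 'http://www.beyond.com' . $1 . "\n";
        $count++;
    }

    # Close once, after the loop.
    close $fh or die "Can't close $path: $!";
    return $count;
}

# Stand-in markup (an assumption, not the real search results page).
my $sample = '<a class="resumeDetailLink" href="/r/1">A</a>'
           . '<a class="resumeDetailLink" href="/r/2">B</a>';

my $written = write_links($sample, 'results.txt');
print "Wrote $written link(s)\n";
```

Without a filehandle, print writes to STDOUT, which is why the original loop showed the links on screen while results.txt stayed empty.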

      Thank you so much, stevieb, I see now how it works. I tried it with the open and close outside, also, but of course I needed print to have the right parameter.

      One follow-up question: The results file now gives me the urls from the resume results, rather than the contents of the results. How do I get it to save what is on the page of each url?

      And also, if you feel like it, is there a tutorial about the syntax for open and close you can point me to? The one that I read obviously showed using parentheses instead of the separate arguments.

        When I get a chance later tonight, I'll fiddle with your code and see if I can get it to do what I believe you desire it to do.

        In the meantime, here is the document you requested:

        perldoc -f open
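On the parentheses question: they are optional for open and close, so both spellings below are the same call. This short sketch walks through the three most common modes of three-argument open. The file name `open_demo.txt` is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $path = 'open_demo.txt';    # hypothetical file name

# '>' creates the file, or truncates it if it already exists.
# With parentheses:
open(my $out, '>', $path) or die "Can't write $path: $!";
print {$out} "first line\n";
close($out) or die "Can't close $path: $!";

# '>>' appends to the end of the file.
# Without parentheses; use the low-precedence 'or' here, not '||',
# because '||' would bind to $path instead of to the result of open.
open my $app, '>>', $path or die "Can't append to $path: $!";
print {$app} "second line\n";
close $app or die "Can't close $path: $!";

# '<' opens the file for reading.
open my $in, '<', $path or die "Can't read $path: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " line(s) read\n";
unlink $path;                  # tidy up the demo file
```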

        Cheers,

        Steve
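For the follow-up about saving what is on each result page rather than its URL, one possible approach is to fetch each link inside the loop and print the body to the same filehandle. The sketch below is only an illustration under stated assumptions: `save_pages` is a hypothetical helper, and it takes the fetching step as a code reference so the example can run offline with a stub. Whether the site serves resume pages to a plain request is not something the thread establishes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper. $fetch is a code ref that takes a URL and returns
# the page text, or undef on failure.
sub save_pages {
    my ($fetch, $path, @urls) = @_;

    open my $fh, '>', $path or die "Can't open $path: $!";

    my $saved = 0;
    for my $url (@urls) {
        my $page = $fetch->($url);
        unless (defined $page) {
            warn "Could not fetch $url\n";
            next;
        }
        # A separator line makes it clear where each page starts.
        print {$fh} "==== $url ====\n", $page, "\n";
        $saved++;
    }

    close $fh or die "Can't close $path: $!";
    return $saved;
}

# Stub fetcher so the sketch runs without a network connection.
# With the $agent from the original script, a real fetcher could be:
#   sub { my $r = $agent->get($_[0]); $r->is_success ? $r->decoded_content : undef }
my %fake_site = ( 'http://www.beyond.com/r/1' => 'resume one text' );
my $stub = sub { $fake_site{ $_[0] } };

my $n = save_pages( $stub, 'pages_demo.txt',
    'http://www.beyond.com/r/1', 'http://www.beyond.com/r/2' );
print "Saved $n page(s)\n";
```

In the original script, the URLs would come from the regex loop: push each `$link_url` onto an array, then pass that array to the helper.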
