PerlMonks  

Dear Monks,

Now that I am feeling relatively comfortable about sending my user/pass data over the net to the https server that I am logging in to, I am at Stage 3 of my quest to screen-scrape my brokerage account: getting past the battlements that are trying to keep my WWW::Mechanize agent out.

So once again, I come to the Monks for direction.

I will start with the current state of my code ...

use strict;
use LWP::UserAgent;
use WWW::Mechanize;
use HTML::TokeParser;
use HTTP::Cookies;
use HTTP::Request;
use Data::Dumper;

my $user = 'MyUserName';
my $pass = 'MyPassword';
my $dv_data = '';
my $output = '';

# Set up cookie jar
my $cookie = HTTP::Cookies->new(file => 'cookie', autosave => 1,);
my $mech = WWW::Mechanize->new(cookie_jar => $cookie, autocheck => 1,);

my $uri = URI->new( 'https://wwws.izone.com/apps/LogIn' );
$mech->get( $uri );
die $mech->response->status_line unless $mech->success;

$output = $mech->content;
for ($output =~ /name=\"DV_DATA\" type=\"hidden\" VALUE=\"(.*?)\">/smi){
    $dv_data = $1;
}

$mech->form_name( 'li' );
$mech->set_fields(
    USERID   => $user,
    PASSWORD => $pass,
    DV_DATA  => $dv_data
);
$mech->submit();
print $mech->content;

Please assume that I am not fully clear on most concepts.

In the code above, I pulled out a value called DV_DATA from the webpage when I first pull it down. This is a "hidden" input variable that appears to be time-based and assigned when the user enters the log-in page. I then include it as an input. I am not sure this is correct.
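
For illustration, here is a minimal offline sketch of how a hidden field such as DV_DATA travels with a form. It uses HTML::Form, the module WWW::Mechanize relies on for form handling. The HTML below is a hypothetical stand-in for the login page (the real markup, the form action and the DV_DATA value are assumptions), so treat it as a sketch, not as a description of the actual site.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the login page; the real markup may differ.
my $html = <<'HTML';
<form name="li" method="post" action="/apps/LogIn">
  <input name="DV_DATA" type="hidden" VALUE="20240101T000000">
  <input name="USERID" type="text">
  <input name="PASSWORD" type="password">
  <input type="submit" value="Log In">
</form>
HTML

# Parse the page; the second argument is the base URI for relative actions.
my ($form) = HTML::Form->parse($html, 'https://wwws.izone.com/apps/LogIn');

# The hidden input is already part of the parsed form, value included.
my $hidden = $form->find_input('DV_DATA');
print "DV_DATA is a ", $hidden->type, " input with value ", $hidden->value, "\n";

# Fill in only the visible fields.
$form->value(USERID   => 'MyUserName');
$form->value(PASSWORD => 'MyPassword');

# Build the request that a submit would send and inspect its body.
my $request = $form->click;
print $request->method, " ", $request->uri, "\n";
print $request->content, "\n";
```

If the real page behaves like this stand-in, the hidden value is submitted along with the form without being extracted by a regex first, and setting it again by hand should be harmless but redundant.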

With the cookies, I am not sure what is going on, but I've been reading around and what I put in seems to be the general consensus for acceptable code to include a cookie jar in WWW::Mechanize. I know I need a cookie jar, but other than that I do not know what I should be doing with it.
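
As a rough picture of what the cookie jar does, the sketch below runs HTTP::Cookies offline: it stores a cookie from one response and attaches it to the next request, which is the bookkeeping the agent performs on every round trip. The cookie name SESSION, its value and the second URL are made up for illustration; the real site's cookies are unknown here.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# An in-memory jar; no file is written.
my $jar = HTTP::Cookies->new;

# Pretend the login page answered with a Set-Cookie header (hypothetical cookie).
my $login_request = HTTP::Request->new(GET => 'https://wwws.izone.com/apps/LogIn');
my $response = HTTP::Response->new(200, 'OK', [ 'Set-Cookie' => 'SESSION=abc123; path=/' ]);
$response->request($login_request);

# Step 1: the jar records whatever cookies the response set.
$jar->extract_cookies($response);

# Step 2: the jar adds matching cookies to the next request for the same host.
my $next_request = HTTP::Request->new(GET => 'https://wwws.izone.com/apps/u/Home');
$jar->add_cookie_header($next_request);

print "Cookie header sent back: ", $next_request->header('Cookie'), "\n";
```

WWW::Mechanize sets up an in-memory jar by default and performs both steps on its own, so an explicit jar mainly matters when cookies need to persist between runs. In that case note that HTTP::Cookies skips session cookies when saving to a file unless ignore_discard => 1 is passed.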

When I run the code, it outputs these two Not Found statements ...

Not Found
The requested URL /cgi-bin/apps/u/Home was not found on this server.
Apache/2.0.50 (Fedora) Server at www.server.com Port 80

Not Found
The requested URL /cgi-bin/apps/u/EquityTrade was not found on this server.
Apache/2.0.50 (Fedora) Server at www.server.com Port 8

Has anybody seen any of this before? Am I going in the right direction? What are the things that I am forgetting to consider?
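
One way to see how a request ends up at a different host is to print every hop that led to the final response. The sketch below does that offline with a fabricated chain: the 302 redirect is an assumption (nothing above confirms the site redirects), and describe_chain is a hypothetical helper, not part of any module. With a live agent the same helper could be given $mech->response.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: list "status URL" for each hop, oldest first.
sub describe_chain {
    my ($response) = @_;
    return map { sprintf '%s %s', $_->code, $_->request->uri }
           $response->redirects, $response;
}

# Fabricated first hop: the login POST answered with a redirect (assumption).
my $redirect = HTTP::Response->new(
    302, 'Found',
    [ Location => 'http://www.server.com/cgi-bin/apps/u/Home' ],
);
$redirect->request(HTTP::Request->new(POST => 'https://wwws.izone.com/apps/LogIn'));

# Fabricated final hop: the 404 page quoted above.
my $final = HTTP::Response->new(404, 'Not Found');
$final->request(HTTP::Request->new(GET => 'http://www.server.com/cgi-bin/apps/u/Home'));
$final->previous($redirect);

my @chain = describe_chain($final);
print "$_\n" for @chain;
```

Because autocheck => 1 makes the agent die on an error status, a live version of this check would need autocheck switched off or the failing call wrapped in an eval.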

Thanks Again Monks,

Chris Herold

In reply to Authentication with WWW::Mechanize by cdherold
