Dear Monks,
Now that I am feeling relatively comfortable about sending my user/pass data over the net to the HTTPS server that I am logging onto, I am at Stage 3 of my quest to screen-scrape my brokerage account: getting past the embattlements that are trying to keep my WWW::Mechanize agent out.
So once again, I come to the Monks for direction.
I will start with the current state of my code ...
use strict;
use warnings;
use URI;
use LWP::UserAgent;
use WWW::Mechanize;
use HTML::TokeParser;
use HTTP::Cookies;
use HTTP::Request;
use Data::Dumper;
my $user = 'MyUserName';
my $pass = 'MyPassword';
my $dv_data = '';
my $output = '';
# Set up cookie jar
my $cookie = HTTP::Cookies->new(file => 'cookie', autosave => 1);
my $mech = WWW::Mechanize->new(cookie_jar => $cookie, autocheck => 1);
my $uri = URI->new( 'https://wwws.izone.com/apps/LogIn' );
$mech->get( $uri );
die $mech->response->status_line unless $mech->success;
$output = $mech->content;
# Capture the value of the hidden DV_DATA field from the login page
if ($output =~ /name="DV_DATA" type="hidden" VALUE="(.*?)">/smi) {
    $dv_data = $1;
}
$mech->form_name( 'li' );
$mech->set_fields(
USERID => $user,
PASSWORD => $pass,
DV_DATA => $dv_data
);
$mech -> submit();
print $mech->content;
Please assume that I am not fully clear on most concepts.
In the code above, I pull a value called DV_DATA out of the login page when I first fetch it. This is a "hidden" input field whose value appears to be time-based and assigned when the user loads the login page. I then send it back as one of the form fields. I am not sure this is correct.
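To make the DV_DATA question concrete, here is a minimal offline sketch using HTML::Form, the module WWW::Mechanize relies on for form handling. The HTML snippet and the value `abc123` are hypothetical stand-ins, since the real login page markup is not shown here. It suggests that a hidden field is already part of the parsed form and is included in the request without being set by hand.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the login page; only the field names
# come from the code above, the rest is assumed for illustration.
my $html = <<'HTML';
<form name="li" method="post" action="/apps/LogIn">
<input name="USERID" type="text">
<input name="PASSWORD" type="password">
<input name="DV_DATA" type="hidden" VALUE="abc123">
<input type="submit" value="Log In">
</form>
HTML

# Parse the form relative to the page it came from.
my ($form) = HTML::Form->parse($html, 'https://wwws.izone.com/apps/LogIn');

# The hidden value is available from the parser; no regex is needed.
my $dv_data = $form->value('DV_DATA');

# Fill in only the visible fields.
$form->value(USERID   => 'MyUserName');
$form->value(PASSWORD => 'MyPassword');

# Build the request that a submit would send and show its body.
my $request = $form->click;
print "DV_DATA from parser: $dv_data\n";
print $request->content, "\n";
```

If the real page behaves like this stand-in, extracting DV_DATA with a regex and setting it again would be redundant, though probably harmless.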
With the cookies, I am not sure what is going on, but I've been reading around, and what I put in seems to match the general consensus on how to attach a cookie jar to WWW::Mechanize. I know I need a cookie jar, but beyond that I do not know what I should be doing with it.
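As a rough illustration of what the cookie jar does behind the scenes, the sketch below runs without any network access. The cookie name `SESSION` and its value are hypothetical, and the response is built by hand rather than received from the server. The jar records cookies from a response and attaches them to later requests for the same host, which is the job LWP performs automatically once a jar is attached to the agent.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# In-memory jar; no file is written in this sketch.
my $jar = HTTP::Cookies->new;

# Simulate the login page response setting a (hypothetical) session cookie.
my $first = HTTP::Request->new(GET => 'https://wwws.izone.com/apps/LogIn');
my $res   = HTTP::Response->new(200, 'OK',
    [ 'Set-Cookie' => 'SESSION=abc123; path=/' ]);
$res->request($first);

# Store any cookies found in the response.
$jar->extract_cookies($res);

# A later request to the same host gets the cookie added to its headers.
my $next = HTTP::Request->new(GET => 'https://wwws.izone.com/apps/u/Home');
$jar->add_cookie_header($next);

my $sent = $next->header('Cookie');
print "Cookie header on next request: $sent\n";
print $jar->as_string;
```

With WWW::Mechanize none of these calls need to be made by hand; printing `$mech->cookie_jar->as_string` after each request is one way to see whether the server is setting a session cookie at all.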
When I run the code, it outputs these two Not Found statements ...
Not Found
The requested URL /cgi-bin/apps/u/Home was not found on this server.
Apache/2.0.50 (Fedora) Server at www.server.com Port 80
Not Found
The requested URL /cgi-bin/apps/u/EquityTrade was not found on this server.
Apache/2.0.50 (Fedora) Server at www.server.com Port 8
Has anybody seen any of this before? Am I going in the right direction? What are the things that I am forgetting to consider?
Thanks Again Monks,