The modules recommended by suaveant and bobf are a good bet. If you want to use HTML::TreeBuilder, the following is one way to do it.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Parse the sample HTML that follows __DATA__.
my $t = HTML::TreeBuilder->new_from_file(*DATA);

# Take the first table and collect all of its rows.
my ($table) = $t->look_down(_tag => q{table});
my @rows = $table->look_down(_tag => q{tr});

# Map each label cell (class "rlab") to its value cell (class "l").
my %db;
for my $row (@rows){
    my $key = $row->look_down(class => q{rlab})->as_text;
    my $value = $row->look_down(class => q{l})->as_text;
    $db{$key} = $value;
}

for my $key (keys %db){
    printf qq{%s -> %s\n}, $key, $db{$key};
}
__DATA__
<html><head><title>Person Profile</title></head>
<center>
<font size=5><b>Profile</b></font>
<table cellspacing="1" cellpadding="1">
<tr>
<td class="rlab">Short Name:</td>
<td class="l">John</td>
</tr>
<tr>
<td class="rlab">Long Name:</td>
<td class="l">John Abraham</td>
</tr>
<tr>
<td class="rlab">Company:</td>
<td class="l">Idea</a></td>
</tr>
</tr>
<tr>
<td class="rlab">Currency:</td>
<td class="l">EUR</td>
</tr>
</table>
</body></html>
Output:
Company: -> Idea
Long Name: -> John Abraham
Currency: -> EUR
Short Name: -> John
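The lines above come out in a different order from the table rows because Perl hashes do not preserve insertion order. If the row order matters, one option is to collect the pairs in an array instead. This is only a sketch: `profile_pairs` is a made-up helper name, and it uses `new_from_content` on a string rather than reading from `DATA`.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: returns [label, value] pairs in document order.
sub profile_pairs {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @pairs;
    for my $row ($tree->look_down(_tag => q{tr})) {
        # Skip any row that lacks either of the two expected cells.
        my $label = $row->look_down(class => q{rlab}) or next;
        my $value = $row->look_down(class => q{l})    or next;
        push @pairs, [ $label->as_text, $value->as_text ];
    }
    $tree->delete;    # free the parse tree
    return @pairs;
}

my @pairs = profile_pairs(<<'HTML');
<table>
<tr><td class="rlab">Short Name:</td><td class="l">John</td></tr>
<tr><td class="rlab">Currency:</td><td class="l">EUR</td></tr>
</table>
HTML

printf qq{%s -> %s\n}, @$_ for @pairs;
```

An array of pairs also keeps duplicate labels, which a hash would silently overwrite.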
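I've assumed that:
- there is one table,
- each row has two columns, each with a class as in your sample data.
You would probably want to include some error checking to confirm those assumptions, though. As written, a row without both cells makes look_down return nothing, and the ->as_text call then dies.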
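A rough sketch of what that error checking might look like follows. The helper name `parse_profile` and the choice to die on a wrong table count but only warn on a malformed row are my own assumptions, so adjust to taste.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: returns a hash ref of label => value,
# checking both assumptions instead of trusting them.
sub parse_profile {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Assumption 1: exactly one table in the document.
    my @tables = $tree->look_down(_tag => q{table});
    if (@tables != 1) {
        $tree->delete;
        die sprintf "Expected one table, found %d\n", scalar @tables;
    }

    my %db;
    for my $row ($tables[0]->look_down(_tag => q{tr})) {
        # Assumption 2: each row has a td.rlab label and a td.l value.
        my $label = $row->look_down(_tag => q{td}, class => q{rlab});
        my $value = $row->look_down(_tag => q{td}, class => q{l});
        unless ($label && $value) {
            warn "Skipping row without rlab/l cells\n";
            next;
        }
        $db{ $label->as_text } = $value->as_text;
    }

    $tree->delete;    # free the parse tree
    return \%db;
}

my $db = parse_profile(<<'HTML');
<table>
<tr><td class="rlab">Short Name:</td><td class="l">John</td></tr>
<tr><td>no classes here</td></tr>
</table>
HTML

printf qq{%s -> %s\n}, $_, $db->{$_} for sort keys %$db;
```

Calling look_down in scalar context returns the first match or undef, which is why the cells are tested before calling as_text on them.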