I'm trying to extract the values of specific attributes from HTML elements using XPath expressions and HTML::TreeBuilder::XPath. For example, given the anchor <a href="foobar.html">One Link</a>, I would like to extract the value of its "href" attribute, which is "foobar.html". Likewise, given the meta element <meta name="description" content="foobar" />, I would like to get the value of the "content" attribute ("foobar") from the element whose "name" attribute is "description". I think my XPath is right, since it works in other tools, but instead of the value of "content" I get this error:
Can't locate object method "as_text" via package "HTML::TreeBuilder::XPath::Attribute" at ./x1.pl line 15.
What have I missed in the code below, and how should I change it?
#!/usr/bin/perl
use HTML::TreeBuilder::XPath;
use strict;
use warnings;
my $root = HTML::TreeBuilder::XPath->new;
$root->parse_file(\*DATA);
$root->eof();
for my $d ($root->findnodes('//html/head/meta[@name="description"]/@content'))
{
print qq(D=\n);
print $d->as_text;
}
$root->delete;
exit(0);
__DATA__
<html>
<head>
<meta name="description" content="foobar" />
</head>
<body>
<h1>FOO</h1>
<p>Bar</p>
</body>
</html>
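
For comparison, here is a minimal sketch of three ways the attribute value might be read. It assumes the cause of the error is that an XPath ending in /@content yields HTML::TreeBuilder::XPath::Attribute objects rather than HTML::Element objects, which is what the error message suggests. The variable names and the sample markup with the anchor added are only illustrative, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative sample markup: the meta element from the question plus the anchor.
my $html = <<'HTML';
<html>
<head>
<meta name="description" content="foobar" />
</head>
<body>
<a href="foobar.html">One Link</a>
</body>
</html>
HTML

my $root = HTML::TreeBuilder::XPath->new;
$root->parse($html);
$root->eof;

# Option 1: keep the /@content XPath, but call getValue on each attribute
# node instead of as_text (as_text is an HTML::Element method).
my @contents = map { $_->getValue }
    $root->findnodes('//html/head/meta[@name="description"]/@content');

# Option 2: let findvalue return the string value of the match directly.
# Interpolating it forces plain-string context.
my $content = $root->findvalue('//html/head/meta[@name="description"]/@content');
$content = "$content";

# Option 3: select the element itself and read the attribute with attr.
my @hrefs = map { $_->attr('href') } $root->findnodes('//a[@href]');

print "getValue:  $_\n" for @contents;
print "findvalue: $content\n";
print "attr:      $_\n" for @hrefs;

$root->delete;
```

Option 3 stays entirely within the HTML::Element interface, so it may be the easiest to combine with other element methods such as as_text.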