http://qs321.pair.com?node_id=11129210


in reply to How to parse not closed HTML tags that don't have any attributes?

The HTML is indeed inconsistent, and you've only shown one sample, so any example code will be correspondingly brittle. Like marto, I would suggest Mojo::DOM, as it has an IMHO nice interface, and it is still able to parse that HTML.

use warnings;
use strict;
use Mojo::DOM;
use Mojo::Util qw/trim/;
use Data::Dump;

my $dom = Mojo::DOM->new(<<'HTML');
<div class="phone">
<div class="icon"></div>
<p class="title">Telephone</p>
<p>0123-4 56 78 90
<p class="title">Telefax</p>
<p>
</div>
HTML

my %hash = @{ $dom->find('p.title')->map(sub {
    return ( trim($_->text), trim($_->next->text) )
}) };
dd \%hash;

__END__

{ Telefax => "", Telephone => "0123-4 56 78 90" }

Update: Assuming you've got a lot of other <div>s in your HTML, you may want to change the expression in ->find() to '.phone p.title'.
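
Here is a rough sketch of that narrower selector, in case it helps. The extra <div class="address"> block and its contents are made up by me purely for illustration; they are not from your HTML. I've also added a guard for a title that has no following sibling element, since ->next returns undef in that case. Treat this as one possible variation under those assumptions, not a tested solution for your real pages.

```perl
use warnings;
use strict;
use Mojo::DOM;
use Mojo::Util qw/trim/;

# Hypothetical input: the "address" div is invented for this example,
# to show that its p.title is skipped by the narrower selector.
my $dom = Mojo::DOM->new(<<'HTML');
<div class="address">
<p class="title">Street</p>
<p>Example Road 1
</div>
<div class="phone">
<div class="icon"></div>
<p class="title">Telephone</p>
<p>0123-4 56 78 90
<p class="title">Telefax</p>
<p>
</div>
HTML

my %hash;
# Only consider p.title elements that are descendants of .phone
for my $title ( $dom->find('.phone p.title')->each ) {
    # ->next is the following sibling element, or undef if there is none
    my $next = $title->next;
    $hash{ trim($title->text) } = defined $next ? trim($next->text) : '';
}
```

With this input I'd expect %hash to end up with only the Telephone and Telefax keys, and no Street entry. If your real HTML nests things differently, the selector will need adjusting.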