PerlMonks
in reply to How to parse not closed HTML tags that don't have any attributes?
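    #!/usr/bin/perl
    use strict;
    use warnings;

    # Slurp everything after __DATA__ into $_
    local $_ = do { local $/; <DATA> };

    # A title paragraph, followed by an unclosed <p> whose text runs up to the next tag
    while( /<p class="title">(\w+)<\/p>\s*<p>([^<>]*)/g ) {
        my $title = $1;    # save $1 before the s///r below resets the capture variables
        printf "%20s %s", $title, $2 =~ s/\s*\z/\n/r;
    }

    __DATA__
    <div class="phone">
    <div class="icon"></div>
    <p class="title">Telephone</p>
    <p>0123-4 56 78 90
    <p class="title">Telefax</p>
    <p>
    </div>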
Outputs:
               Telephone 0123-4 56 78 90
                 Telefax 
Well, it works for all the provided test cases :)
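If you would rather collect the values than print them, the same regex can be wrapped in a subroutine. This is only a sketch under the same assumptions as above (titles are a single `\w+` word, values contain no tags); the name `extract_pairs` is made up for illustration and is not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a list of
# [title, value] pairs instead of printing them.
sub extract_pairs {
    my ($html) = @_;
    my @pairs;
    while ( $html =~ /<p class="title">(\w+)<\/p>\s*<p>([^<>]*)/g ) {
        my ( $title, $value ) = ( $1, $2 );    # copy captures before the next regex op
        $value =~ s/\A\s+|\s+\z//g;            # trim leading and trailing whitespace
        push @pairs, [ $title, $value ];
    }
    return @pairs;
}

my $html = <<'HTML';
<div class="phone">
<div class="icon"></div>
<p class="title">Telephone</p>
<p>0123-4 56 78 90
<p class="title">Telefax</p>
<p>
</div>
HTML

# Quote the value so an empty one (Telefax here) is still visible
printf "%-10s '%s'\n", @$_ for extract_pairs($html);
```

An unclosed `<p>` with nothing after it, like the Telefax entry, comes back as an empty string rather than being skipped, so the caller can decide what to do with it.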