http://qs321.pair.com?node_id=19094


in reply to Scanning a html document....

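Check out HTML::Parser and/or HTML::TokeParser. HTML::Parser is event-driven (you supply callbacks), while HTML::TokeParser, which ships in the same distribution, lets you pull tokens one at a time, which is often simpler for scanning a document.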
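
For what it's worth, here is a minimal sketch of scanning a document with HTML::TokeParser. I don't know what you are actually looking for in your HTML, so this assumes you want every link and its text; the sample markup in $html and the @links array are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document; substitute your own HTML here.
my $html = <<'HTML';
<html><body>
<a href="http://example.com/one">First link</a>
<p>Some text <a href="/two">Second <b>link</b></a></p>
</body></html>
HTML

# Passing a scalar reference makes the parser read from the string
# instead of treating the argument as a file name.
my $parser = HTML::TokeParser->new(\$html)
    or die "Could not create parser";

my @links;

# get_tag('a') skips ahead to the next <a> start tag and returns
# an array ref: [ tag name, attribute hash, attribute order, raw text ].
while (my $tag = $parser->get_tag('a')) {
    my $href = $tag->[1]{href};
    next unless defined $href;    # skip anchors without an href

    # Collect the text up to the closing </a>, with whitespace trimmed.
    my $text = $parser->get_trimmed_text('/a');
    push @links, [ $href, $text ];
}

printf "%s -> %s\n", @$_ for @links;
```

To scan a file instead, pass the file name (not a reference) to new(). If you need to react to every tag as it goes by rather than hunt for specific ones, the callback interface of HTML::Parser is probably the better fit.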