Gosh, I wish I even knew the difference between those HTML:: modules and how to put them to work! Given the examples in this thread, soon I'll have more than a hammer to do my scraping. :-)
In the meantime: when I open the linked web page in IE, press Ctrl-A and Ctrl-C, and paste the text into Notepad, the output is quite comprehensible to my HTML-untrained eye (compared with the raw HTML), e.g.
impse400 (I3C) / 172.17.8.182
hp color LaserJet 4600
Information
<snip much miscellaneous info>
For highest print quality always use genuine Hewlett-Packard supplies.
+
BLACK CARTRIDGE
HP Part Number: HP C9720A 73%
Estimated Pages Remaining:
11025
(Based on historical black page coverage of 2%)
Low Reached:
NO
Serial Number:
35860
Pages printed with this supply:
4078
TRANSFER KIT
HP Part Number: HP C9724A 87%
Estimated Pages Remaining:
103856
Etc.
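As a rough sketch of the follow-on processing, assuming the pasted text always has the shape shown above (an all-caps supply name, a part-number line ending in a percentage, then label lines ending in a colon with the value on the next line), something like this Python could pull out the fields. The function name parse_supplies and the dictionary keys are hypothetical, and the layout rules are guessed from this one sample only.

```python
import re

# A slice of the pasted text from the printer status page.
SAMPLE = """\
BLACK CARTRIDGE
HP Part Number: HP C9720A 73%
Estimated Pages Remaining:
11025
(Based on historical black page coverage of 2%)
Low Reached:
NO
Serial Number:
35860
Pages printed with this supply:
4078
TRANSFER KIT
HP Part Number: HP C9724A 87%
Estimated Pages Remaining:
103856
"""

PART_RE = re.compile(r"HP Part Number:\s*(HP \S+)\s+(\d+)%$")
HEADING_RE = re.compile(r"[A-Z][A-Z ]+")  # assumption: supply names are all caps


def parse_supplies(text):
    """Return one dict per supply found in the pasted page text."""
    supplies = []
    current = None        # dict for the supply being filled in
    pending_label = None  # label whose value is expected on the next line
    for line in (raw.strip() for raw in text.splitlines()):
        if not line:
            continue
        part = PART_RE.match(line)
        if part and current is not None:
            current["part"] = part.group(1)
            current["percent"] = int(part.group(2))
            pending_label = None
        elif pending_label and current is not None:
            # Checked before the heading test so a value like "NO"
            # is not mistaken for a new supply name.
            current[pending_label] = line
            pending_label = None
        elif line.endswith(":") and current is not None:
            pending_label = line[:-1]
        elif HEADING_RE.fullmatch(line):
            current = {"name": line}
            supplies.append(current)
        # Anything else (notes in parentheses, "+", etc.) is ignored.
    return supplies


if __name__ == "__main__":
    for supply in parse_supplies(SAMPLE):
        print(supply)
```

Values are kept as strings except the percentage; converting page counts to integers would be an easy extra step once the layout is confirmed on more than one printer.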
With my regex sledgehammer it would be straightforward to process this data. Often, when I look at the "pure text" version of a web page, there aren't nearly as many nice hooks for sorting things out, but in this case there are. So my question is: is there a tool that emulates this select/copy/paste of a web page, so I can automate producing such text for follow-on regex processing?
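For what it's worth, the select/copy/paste step can be approximated by stripping the markup and keeping only the visible text. The following is a minimal sketch in Python using only the standard library's html.parser; it is not what a browser actually does when copying, and the class name TextDumper, the function html_to_text, and the choice of which tags start a new line are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextDumper(HTMLParser):
    """Collect visible text, roughly one line per block-level element."""

    # Assumption: these tags are enough to get sensible line breaks.
    BLOCK_TAGS = {"br", "p", "div", "tr", "li", "table", "h1", "h2", "h3", "h4"}
    CELL_TAGS = {"td", "th"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")
        elif tag in self.CELL_TAGS:
            self.chunks.append(" ")  # keep adjacent table cells apart

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the page's visible text with blank lines and extra spaces removed."""
    parser = TextDumper()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    page = (
        "<html><head><style>b{}</style></head><body><table>"
        "<tr><td>BLACK CARTRIDGE</td></tr>"
        "<tr><td>HP Part Number: HP C9720A</td><td>73%</td></tr>"
        "</table></body></html>"
    )
    print(html_to_text(page))
```

The HTML in the usage example is made up to resemble the printer page; the real page's markup may break lines differently, so the resulting text should be compared against an actual paste before writing regexes against it.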