Retrieve select information from HTML
by Smaug (Pilgrim) on Jul 18, 2013 at 19:06 UTC
Smaug has asked for the wisdom of the Perl Monks concerning the following question:
I have a file of serial numbers for machines, and for each one I need to determine the warranty status as well as the model.
The serial numbers are in a file which looks like this:
Essentially I am looking up each machine and retrieving the model information, which is in the meta-data at the start of the HTML, as follows:
The problem I have now is getting the warranty information, which sits somewhere in the middle of the HTML, although in the rendered page it is shown in the block on the right. Some machines have more than one warranty.
For each serial number I need the warranty type (Next Business Day) and the date (e.g. 11/12/2012), which appear in bold, collected in a hash or an array, something like:
Any help would be appreciated. I did look at HTML::Miner and HTML::Tree but neither seemed to accomplish what I needed with my limited knowledge of HTML.
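For illustration, here is a minimal sketch of how HTML::TreeBuilder (part of the HTML::Tree distribution) might be used to collect bold text. Everything specific in it is an assumption: the HTML fragment, the serial ABC1234, the helper name warranties_from_html, and the idea that the bold items are plain <b> elements that alternate between warranty type and date. The real page markup may well differ.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical fragment; the real page markup is not shown in this post.
my $html = <<'HTML';
<div class="warranty">
  <p>Type: <b>Next Business Day</b> Ends: <b>11/12/2012</b></p>
  <p>Type: <b>Next Business Day</b> Ends: <b>11/12/2013</b></p>
</div>
HTML

# Hypothetical helper: returns an array ref of { type, end } hashes,
# assuming each bold warranty type is followed by its bold date.
sub warranties_from_html {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);

    # Text of every <b> element, in document order.
    my @bold = map { $_->as_trimmed_text } $tree->look_down( _tag => 'b' );
    $tree->delete;    # free the parse tree

    my ( @found, $type );
    for my $text (@bold) {
        if ( $text =~ m!^\d{1,2}/\d{1,2}/\d{4}\z! ) {
            # A date closes the warranty whose type was seen just before it.
            push @found, { type => $type, end => $text } if defined $type;
            undef $type;
        }
        else {
            $type = $text;
        }
    }
    return \@found;
}

# One entry per serial number; a machine may have several warranties.
my %warranty = ( ABC1234 => warranties_from_html($html) );

for my $serial ( sort keys %warranty ) {
    print "$serial: $_->{type} until $_->{end}\n" for @{ $warranty{$serial} };
}
```

A hash of arrays keyed by serial number leaves room for machines with more than one warranty. If the page marks bold text with <strong> or a CSS class instead, the look_down criteria would need to change accordingly.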
The longer serial number is a monitor and should be ignored, but I will handle that by not processing items with more than 7 digits in the serial number.
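The length check could be sketched roughly as follows. The sample serials and the helper name machine_serials are made up for illustration, and "more than 7 digits" is read here as "more than 7 characters".

```perl
use strict;
use warnings;

# Hypothetical helper: trims whitespace and keeps only serials of at most
# 7 characters; longer ones are assumed to be monitors and are skipped.
sub machine_serials {
    my @keep;
    for my $serial (@_) {
        next unless defined $serial;
        ( my $clean = $serial ) =~ s/^\s+|\s+$//g;    # strip surrounding whitespace
        next if $clean eq '';                         # skip blank lines
        next if length($clean) > 7;                   # skip monitors
        push @keep, $clean;
    }
    return @keep;
}

# Made-up sample input; the real values would be read from the serial file.
my @serials  = ( 'ABC1234', ' XYZ9876 ', 'CN0123456789ABCDEF' );
my @machines = machine_serials(@serials);
print "$_\n" for @machines;
```

Trimming before measuring the length avoids dropping a valid serial just because the line carried a trailing space or newline.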
Peddle faster monkeys!! I need more power!!