http://qs321.pair.com?node_id=163224

costas has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a small script to perform a 'whois' on a number of domains. However, I have a slight query.

When I run 'use Net::ParseWhois;' on a particular domain I get the following:
Registrar: NETWORK SOLUTIONS, INC.
Domain: xxx
Name: xxx
Tag: xxx
Address: xxx
Country: xx
Name Servers: xx
Contacts:
ADMINISTRATIVE:
Cable and Wireless PLC UK Hostmaster (IH22-ORG) hostmaster@UK.CW.NET
Cable and Wireless PLC
76 Hammersmith Road
Hammersmith, London W14 8UD
UK
+44 (0) 20 7825 6000 Fax- +44 (0) 20 8243 1981
TECHNICAL:
Cable and Wireless PLC UK Hostmaster (IH22-ORG) hostmaster@UK.CW.NET
Cable and Wireless PLC
76 Hammersmith Road
Hammersmith, London W14 8UD
UK
+44 (0) 20 7825 6000 Fax- +44 (0) 20 8243 1981
Record created: n/a
Record updated: n/a
Note that the final lines of this output give the record created and updated dates as 'n/a', and there is no expiry date at all!

When I run 'use Net::Whois::Raw;' with the same domain I get all of the information above plus expiry, created and updated dates (as shown below).
...Record expires on 25-Oct-2002. Record created on 24-Oct-1995. Database last updated on 1-May-2002 04:51:03 EDT...
Can anybody tell me how to get the date information using the Net::ParseWhois module, since it returns the results in a more manageable form? Or are there any other suggestions?

Thank you

Replies are listed 'Best First'.
Re: whois module query
by tadman (Prior) on May 01, 2002 at 09:19 UTC
    It would seem that 'n/a' just shows up whenever the module can't grok what is going on. It could be a defect in the Netsol.pm parsing module, which can occur when data is misaligned. Here's the juicy bit from that module:
    sub regex_created { '^Record created on (.*).$' }
    sub regex_expires { '^Record expires on (.*).$' }
    Personally, I'd put \s* wherever there could be space-like characters, since you never really know what they're going to think of doing next. Stuff tends to float. You've practically got to assume that where there is one space, there might be dozens. This might be better written as:
    sub regex_created { '^\s*Record created on\s+(.*).' }
    sub regex_expires { '^\s*Record expires on\s+(.*).' }
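    To see the difference in practice, here is a minimal sketch that runs both patterns against a few sample lines. The two patterns are the ones quoted above; the sample lines and the extract helper are made up for illustration and are not part of Net::ParseWhois.

```perl
use strict;
use warnings;

# Patterns quoted in this post: the module's original regex and the
# whitespace-tolerant variant suggested above.
my $strict_created  = '^Record created on (.*).$';
my $lenient_created = '^\s*Record created on\s+(.*).';

# Hypothetical sample lines; real Netsol output may differ.
my @lines = (
    'Record created on 24-Oct-1995.',
    '   Record created on 24-Oct-1995.',
    "\tRecord created on   24-Oct-1995.",
);

# Hypothetical helper: return the captured date, or undef if no match.
sub extract {
    my ($pattern, $line) = @_;
    return $line =~ /$pattern/ ? $1 : undef;
}

for my $line (@lines) {
    my $strict  = extract($strict_created,  $line);
    my $lenient = extract($lenient_created, $line);
    printf "[%s] strict=%s lenient=%s\n",
        $line,
        defined $strict  ? $strict  : 'no match',
        defined $lenient ? $lenient : 'no match';
}
```

    With these samples the strict pattern only matches the line that starts in column one, while the tolerant pattern picks up the date from all three.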
    There's no trimming going on either, apparently:
    my $text = $self->_send_to_sock( $sock );
    # ...
    $self->parse_text($text);
    No evidence of s/^\s+//, which I would've figured had to be in there somewhere since Netsol records are typically indented with spaces.
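    As a rough sketch of what that trimming step could look like, the snippet below strips leading whitespace from every line of a response before matching the date lines. The indented sample text is an assumption modelled on the dates quoted in the question, and the code stands alone: it does not call into Net::ParseWhois, and where such a step would belong inside the module is not shown here.

```perl
use strict;
use warnings;

# Hypothetical indented response text, modelled on the dates in the question.
my $text = join "\n",
    '   Record expires on 25-Oct-2002.',
    '   Record created on 24-Oct-1995.',
    '   Database last updated on 1-May-2002 04:51:03 EDT.',
    '';

# Strip leading spaces and tabs from every line. /m lets ^ match at each
# line start; [ \t] is used instead of \s so blank lines are not swallowed.
(my $trimmed = $text) =~ s/^[ \t]+//mg;

# Collect the created/expires dates from the trimmed lines.
my %dates;
for my $line (split /\n/, $trimmed) {
    if ($line =~ /^Record (created|expires) on (.*)\.$/) {
        $dates{$1} = $2;
    }
}

print "$_: $dates{$_}\n" for sort keys %dates;
```

    Once the text is trimmed, the stricter anchored patterns match without needing a leading \s*.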

    I had a program that used to parse these things into little pieces, tearing apart even the address into city, state, zip, with a high degree of accuracy. I wonder if it's still workable.
Re: whois module query
by hatter (Pilgrim) on May 01, 2002 at 12:07 UTC
    The modules rely on each registrar's whois template method being kept up to date, and NSI slightly changed their format recently. If you're using this module in production, you might want to join the mailing list. You'll occasionally see people mentioning updates, and you can submit any changes you make to help everyone keep it working. (I'm working on methods for some of the registrars not currently covered, as are several other people.)

    the hatter