PerlMonks  

Poor man's Foongrep for Dutch telephone numbers

by teabag (Pilgrim)
on May 28, 2003 at 09:44 UTC ( [id://261264]=CUFP )

I frequently write down a telephone number and then forget to write down the name of the owner. It must have something to do with my brilliant sense of organisation <*coughs*>.

Anyway, there used to be a program named Foongrep that let you search on Dutch telephone numbers, but sadly it was discontinued (after a legal battle) a long time ago. I came upon a site that still offers this service, so I wrote this small command-line program. I learned much from Juerd's vandale.pl, which checks words in the Dutch dictionary.

    #!/usr/bin/perl -w
    #gettel.pl - poor man's foongrep for dutch telephone numbers
    use strict;
    use LWP::UserAgent;

    my $nummer;

    if ( $ARGV[0] ) {
        $nummer = $ARGV[0];
    }
    else {
        Jammer_hoor();
    }

    if ( $nummer !~ m/\b\d{3}-\d{7}\b/ ) {
        Jammer_hoor();
    }

    #get page
    my $url  = "http://zoekopnummer.ath.cx/index.php?nummer=$nummer";
    my $info = LWP::UserAgent->new->request( HTTP::Request->new( GET => $url ) )->content;

    #no more newlines!
    $info =~ s/\n//g;

    #substitute number
    $nummer =~ s/-//gi;

    if ( $info !~ m/Naam:/g ) {
        print "\nNo info found on that number\n";
        exit;
    }

    #filter out all the adds and links
    $info =~ s/$nummer//gi;
    $info =~ s/<.*?>//gi;
    $info =~ s/\.\.\..*?\.\.\.//gi;
    $info =~ s/\{.*?\}//gi;
    $info =~ s/\(.*?\)//gi;
    $info =~ s/table.*?://gi;
    $info =~ s/1\ item.*?sp\;//gi;
    $info =~ s/2\ item.*?sp\;//gi;
    $info =~ s/\s+/ /gi;

    #format for easy cut 'n paste
    $info =~ s/Telnr: /\n/g;
    $info =~ s/Naam: /\n/g;
    $info =~ s/Adres: /\n/g;
    $info =~ s/Plaats: /\n/g;
    $info =~ s/Postcode: /\n/g;
    $info =~ s/Fax: /\nfax: /g;

    print "$info";

    sub Jammer_hoor {
        print "Usage: Specify a dutch telephone number like\ngettel.pl 020-1234567\n";
        exit;
    }

Replies are listed 'Best First'.
Re: Poor man's Foongrep for Dutch telephone numbers
by Juerd (Abbot) on Jun 01, 2003 at 01:01 UTC

    I learned much from Juerd's vandale.pl, which checks words in the Dutch dictionary.

    Too bad they (vandale) changed their HTML. Although it's less broken now, this means my original script no longer works. I have an updated version available, and will update the PM node soon. Update (200306081720+0200) - done.

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re: Poor man's Foongrep for Dutch telephone numbers
by Juerd (Abbot) on Jun 01, 2003 at 14:24 UTC

    use LWP::UserAgent;

    I should have used LWP::Simple and so should you :)

    This is copy/paste-programming, I think. I used LWP::UA because some earlier version did some extra work, but I guess you just copied my code. My bad code.

    sub Jammer_hoor {

    Subs should be declared before they're used, in my opinion. Especially when the name of the sub is useless. Why the capital J, by the way?

    if ( $nummer !~ m/\b\d{3}-\d{7}\b/ ) {

    So !!!!!!!!!!!!!!hallo wereld!!!!!!@#$@#$%^#$%^#$%^#"$%#$%#"$%"#$%#$%"#$%123-4567890!!!!!!!!!! is a valid Dutch phone number? Besides that, it's possible to have a three-digit area number and a 6-digit subscriber number. With 06 numbers it's 2-8, even.
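
    To make the point concrete, here is a minimal sketch (not part of the original script) that compares the \b-delimited pattern with one anchored by \A and \z. The sub names loose_match and strict_match are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the script: \b only requires a word boundary,
# so the number may sit anywhere inside a longer string.
sub loose_match { return $_[0] =~ /\b\d{3}-\d{7}\b/ ? 1 : 0 }

# Anchored variant: the whole argument must be exactly ddd-ddddddd.
sub strict_match { return $_[0] =~ /\A\d{3}-\d{7}\z/ ? 1 : 0 }

for my $input ( '020-1234567', '!!!hallo wereld!!! 123-4567890!!!' ) {
    printf "%-36s loose=%d strict=%d\n",
        $input, loose_match($input), strict_match($input);
}
```

    The garbage string passes the loose check because the digits are surrounded by non-word characters, which is all \b asks for. The anchored pattern rejects it.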

    if ( $info !~ m/Naam:/g ) {

    I think explicit m with // as delimiters is misleading. As misleading as any of q'' qq"" qx`` m//. Explicit code isn't always clearer.

    $info =~ s/\s+/ /gi;

    Why /i? Afraid there will be uppercased whitespace? For all your regexes: only use modifiers that are useful for that regex.

        #filter out all the adds and links
        $info =~ s/$nummer//gi;
        $info =~ s/<.*?>//gi;
        $info =~ s/\.\.\..*?\.\.\.//gi;
        $info =~ s/\{.*?\}//gi;
        $info =~ s/\(.*?\)//gi;
        $info =~ s/table.*?://gi;
        $info =~ s/1\ item.*?sp\;//gi;
        $info =~ s/2\ item.*?sp\;//gi;
        $info =~ s/\s+/ /gi;

        #format for easy cut 'n paste
        $info =~ s/Telnr: /\n/g;
        $info =~ s/Naam: /\n/g;
        $info =~ s/Adres: /\n/g;
        $info =~ s/Plaats: /\n/g;
        $info =~ s/Postcode: /\n/g;
        $info =~ s/Fax: /\nfax: /g;

    You can topicalize using for:

        for ($info) {
            #filter out all the adds and links
            s/$nummer//gi;
            s/<.*?>//g;
            ...
        }

    Saves a lot of typing.
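
    For anyone who hasn't seen the idiom: a small self-contained sketch of why this works. for aliases $_ to $info, so each bare s/// changes $info itself. The sample string is invented for the example; the real script works on the fetched page.

```perl
use strict;
use warnings;

# Hypothetical sample input, not real output from the lookup site.
my $info = "<b>Naam:</b>   J. Janssen   <i>Plaats:</i>  Amsterdam";

# for aliases $_ to $info, so each s/// edits $info in place.
for ($info) {
    s/<.*?>//g;         # strip tags
    s/\s+/ /g;          # collapse runs of whitespace
    s/Naam: /\n/;       # put each field on its own line
    s/ Plaats: /\n/;
}

print $info, "\n";
```

    No copy of the string is made and nothing needs assigning back after the loop.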

    print "$info";

    Get rid of those quotes. You're stringifying a string, copying the value in memory more often than necessary.

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      >I should have used LWP::Simple and so should you :)

      >This is copy/paste-programming, I think. I used LWP::UA
      >because some earlier version did some extra work, but I guess you
      >just copied my code. My bad code.

      Well, the script I wrote before this one used lynx to dump the file. I copied your use of modules to make it more platform-independent.

          sub runit {
              # first filter out the relevant telinfo:
              open TEL, "lynx -dump zoekopnummer.ath.cx/index.php?nummer=$nummer|";
              open (LOG, ">$log");
              while(<TEL>){
                  print LOG $_;
              }

      was part of my first version. I just changed the fetching of the page and used the, "in your own words", bad method you used.

      Just calling it copy/paste is not entirely fair I think, but hey, I gotta admit, you're a far better coder than me.

      >So !!!!!!!!!!!!!!hallo wereld!!!!!!@#$@#$%^#$%^#$%^#"$%#$%#"$%"#$%#$%"#$%123-4567890!!!!!!!!!! is a valid Dutch phone number?
      >Besides that, it's possible to have a three-digit area number and a 6-digit subscriber number. With 06 numbers it's 2-8, even.

      Nope, it isn't. And according to testing on my Windows machine and my Linux machine this tiny utility doesn't think it is either.

      As for 06 numbers, they are not listed on this site, so it's pretty useless to check for them. The same goes for service numbers, which I guess is what you're aiming at with the three-digit area number and a 6-digit subscriber number.

      You're totally right about the sloppy matches and about the unnecessary "ignore case" substitution. So here's a rewritten version that fixes (most of) those mistakes and remarks.

      Oh, and using LWP::Simple ;)

          #!/usr/bin/perl -w
          #gettel.pl - poor man's foongrep for dutch telephone numbers
          use strict;
          use LWP::Simple;

          my $nummer;

          sub jammer_hoor {
              print "Usage: Specify a dutch telephone number like\ngettel.pl 020-1234567\n";
              exit;
          }

          if ( $ARGV[0] ) {
              $nummer = $ARGV[0];
          }
          else {
              jammer_hoor();
          }

          if ( $nummer !~ m/\b\d{3}-\d{7}\b/ ) {
              jammer_hoor();
          }

          #get page
          my $url  = "http://zoekopnummer.ath.cx/index.php?nummer=$nummer";
          my $info = get($url);

          #no more newlines!
          $info =~ s/\n//g;

          #substitute number
          $nummer =~ s/-//g;

          if ( $info !~ m/Naam:/g ) {
              print "\nNo info found on that number\n";
              exit;
          }

          for ($info) {
              #filter out all the adds and links
              s/$nummer//g;
              s/<.*?>//g;
              s/\.\.\..*?\.\.\.//g;
              s/\{.*?\}//g;
              s/\(.*?\)//g;
              s/table.*?://gi;
              s/1\ item.*?sp\;//gi;
              s/2\ item.*?sp\;//gi;
              s/\s+/ /g;
              s/Telnr: /\n/g;
              s/Naam: /\n/g;
              s/Adres: /\n/g;
              s/Plaats: /\n/g;
              s/Postcode: /\n/g;
              s/Fax: /\nfax: /g;
          }

          print $info;

      Teabag
      Sure there's more than one way, but one just needs one anyway - Teabag

        Nope, it isn't. And according to testing on my Windows machine and my Linux machine this tiny utility doesn't think it is either.

            2;0 juerd@ouranos:~$ perl gettel.pl '!!!!!!!!!!!!!!hallo wereld!!!!!!@#$@#$%^#$%^#$%^#"$%#$%#"$%"#$%#$%"#$%123-4567890!!!!!!!!!!'
            No info found on that number

        Get rid of the \bs and get real anchors instead :) Also, the 4-digit area code issue is about area codes. Has nothing to do with the 090[069] service/sex/game numbers.

            112            People with sirens :)
            00 800 x+      International toll free
            00 x+          International anything
            067[0-5] x+    Data services
            06760 x+       Data: internet dial-up
            0800 x+        Toll free
            0900 x+        Information
            0906 x+        Erotic
            0909 x+        Entertainment
            16xx .*        Carrier select
            0xx xxxxxxx    Normal (084 and 087 are for PDA-ish services)
            0xxx xxxxxx    Normal

        Detailed information: http://www.ez.nl/beleid/home_ond/dgtp/beleidwetgeving/nummers_namen/documenten/numberingplan_netherlands.pdf
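
        Going only by the two "Normal" rows in the list above, an anchored check could look roughly like the sketch below. It assumes the area code and subscriber number are separated by a single hyphen, as gettel.pl expects, and it makes no attempt to exclude special ranges such as 0800 or 0900 that happen to fit the 4+6 shape. is_normal_number is a made-up name.

```perl
use strict;
use warnings;

# Accept 0xx-xxxxxxx (3+7 digits) or 0xxx-xxxxxx (4+6 digits), nothing else.
# \A and \z anchor the pattern to the whole argument.
sub is_normal_number {
    my ($nummer) = @_;
    return $nummer =~ /\A0(?:\d{2}-\d{7}|\d{3}-\d{6})\z/ ? 1 : 0;
}

for my $nummer ( '020-1234567', '0512-123456', '123-4567890', '020-123456' ) {
    printf "%-12s %s\n", $nummer, is_normal_number($nummer) ? 'ok' : 'rejected';
}
```

        The first two sample numbers pass. The third is rejected for lacking the leading 0, and the fourth for being one digit short.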

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
