PerlMonks  

Character Counts

by Anonymous Monk
on Apr 07, 2003 at 15:16 UTC ( [id://248634]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Let me begin by saying I have never used Perl before. I have been tasked to create a script that will parse through records(this much I have done) and validate a telephone number field. I first want to count all the characters in that field, clean the field of all spaces and non numerics, and flag the ones which are not 10 or 4 digits. Then print a list with the old numbers, new numbers and flagged numbers. ANY insight will be GREATLY appreciated!!!!

Replies are listed 'Best First'.
Re: Character Counts
by BrowserUk (Patriarch) on Apr 07, 2003 at 15:37 UTC

    This might serve as a starting point, but what you are doing is fraught with problems. Even assuming that you are only interested in US phone numbers (4 or 10 digits), different groups of people use several conventions when writing phone numbers, especially when the intended or perceived audience is international (i.e. on the web), and some of those will break when cleaned up using your rather simplistic mechanism.

    As an example, note the number in the sample code below that begins +44(0).... in which the +44 is the country code for the UK, but the bracketed zero is there to indicate that the remaining digits of the number must be prefixed by zero if dialled from within the UK. Simply stripping the non-digits from the number will result in an invalid number no matter where it is dialled. I am not sure if anyone in the US uses this convention, but it is fairly common in Europe.

    #! perl -slw
    use strict;

    my @fields = (
        '1234',
        '01234567890',
        'abcd1234xy',
        '012 345 6789',
        '+44(0)1234 567890',
    );

    for (@fields) {
        (my $new = $_) =~ s[\D+][]g;
        print "Old:'$_' New:'$new'??"
            if length $new != 4 and length $new != 10;
    }
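
To make the +44(0) problem concrete, here is a minimal sketch that strips the non-digits and, separately, marks numbers that look international so a person can check them. The helper names needs_review and strip_non_digits are hypothetical, and the two patterns are only assumptions about how such numbers tend to be written, not a complete rule.

```perl
use strict;
use warnings;

# needs_review is a hypothetical helper; both patterns are assumptions
# about common international notations, not an exhaustive check.
sub needs_review {
    my ($raw) = @_;
    return 1 if $raw =~ /^\s*\+/;        # leading "+" suggests a country code
    return 1 if $raw =~ /\(\s*0\s*\)/;   # bracketed zero, as in +44(0)...
    return 0;
}

# Same cleanup as in the sample above: delete every run of non-digits.
sub strip_non_digits {
    my ($raw) = @_;
    (my $digits = $raw) =~ s/\D+//g;
    return $digits;
}

for my $raw ('012 345 6789', '+44(0)1234 567890') {
    my $digits = strip_non_digits($raw);
    printf "%-20s -> %-14s (%d digits)%s\n",
        "'$raw'", $digits, length($digits),
        needs_review($raw) ? ' REVIEW BY HAND' : '';
}
```

The UK example collapses to 4401234567890, which is 13 digits and keeps the bracketed zero that should have been dropped for international dialling, so it cannot be repaired by length checks alone.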

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.

      US numbers are very regular if you ignore extension numbers. The simple expression below handles all the normal US cases I could think of while writing this.

      /
          ^                          # Start
          (?: \D? \d{3} \D{0,2} )?   # Optional area code
          \d{3} \D?                  # Prefix with optional punctuation
          \d{4}                      # Last four digits
      /x
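
A quick way to see what that expression accepts is to run it over a few samples. This is only a sketch: the name looks_like_us_phone and the sample strings are made up for illustration, and the pattern is written with an explicit {0,2} because Perl versions before 5.34 do not treat {,2} as a quantifier.

```perl
use strict;
use warnings;

# The pattern from the reply above, stored in a hypothetical variable.
my $us_phone = qr/
    ^                          # Start
    (?: \D? \d{3} \D{0,2} )?   # Optional area code
    \d{3} \D?                  # Prefix with optional punctuation
    \d{4}                      # Last four digits
/x;

# Hypothetical wrapper: returns 1 when the text starts like a US number.
sub looks_like_us_phone {
    my ($text) = @_;
    return $text =~ $us_phone ? 1 : 0;
}

for my $sample ('(555) 123-4567', '555-123-4567', '5551234567',
                '123-4567', '1234', '+44(0)1234 567890') {
    printf "%-20s %s\n", $sample,
        looks_like_us_phone($sample) ? 'match' : 'no match';
}
```

The first four samples match; the bare 4-digit value and the +44 number do not. Because the pattern has no end anchor, anything trailing the last four digits (an extension, or extra digits) is silently accepted, so the 4-digit case from the original question still needs its own check.
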
Re: Character Counts
by krujos (Curate) on Apr 07, 2003 at 15:33 UTC
    You should be a little more specific about what your phone numbers look like. To clear all non-numeric data you could use s/\D//g, which would leave you with no spaces or dashes, so the result is not too readable. To count the characters you could use length.
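
Put together, those two suggestions might look like the sketch below. The helper name clean_digits and the sample value are hypothetical.

```perl
use strict;
use warnings;

# clean_digits is a hypothetical helper name.
sub clean_digits {
    my ($raw) = @_;
    (my $digits = $raw) =~ s/\D//g;   # copy first, then delete every non-digit
    return $digits;
}

my $raw    = '(012) 345-6789';
my $digits = clean_digits($raw);

print "raw length:   ", length($raw),    "\n";   # 14
print "clean length: ", length($digits), "\n";   # 10
print "cleaned:      $digits\n";                 # 0123456789
```

Copying into a new variable before substituting keeps the original value around, which the question needs for the "old number" column.
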
Re: Character Counts
by Juerd (Abbot) on Apr 07, 2003 at 15:30 UTC

    I have been tasked to create a script that will parse through records(this much I have done) and validate a telephone number field. I first want to count all the characters in that field, clean the field of all spaces and non numerics, and flag the ones which are not 10 or 4 digits. Then print a list with the old numbers, new numbers and flagged numbers.

    What have you tried so far, and what went wrong? Or do you want us to write the script? If so, please mention your credit card details ;-)

    Juerd
    - http://juerd.nl/
    - spamcollector_perlmonks@juerd.nl (do not use).
    

Re: Character Counts
by Jenda (Abbot) on Apr 07, 2003 at 15:46 UTC

    I believe you want to clean the number first and THEN check the length:

    # First let's remove all non-digits
    $phone =~ tr/0-9//dc;

    # next check the length
    if (length($phone) != 4 and length($phone) != 10) {
        # do something with an incorrect phone number
    } else {
        # do something with a correct phone number
    }
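
As a side note, tr can also count digits without changing the string, which is handy when the original value has to be kept for the report. The sketch below wraps both steps in a hypothetical helper, check_phone, that is not part of the reply above.

```perl
use strict;
use warnings;

# check_phone is a hypothetical helper. It returns the cleaned digits
# and a 1/0 flag saying whether the digit count is 4 or 10.
sub check_phone {
    my ($phone) = @_;
    my $digit_count = ($phone =~ tr/0-9//);   # no /d: counts digits, changes nothing
    (my $clean = $phone) =~ tr/0-9//cd;       # /c /d: delete everything except digits
    my $ok = ($digit_count == 4 || $digit_count == 10) ? 1 : 0;
    return ($clean, $ok);
}

for my $phone ('555 1234', 'x1234', '(555) 123-4567') {
    my ($clean, $ok) = check_phone($phone);
    print "$phone -> $clean ", ($ok ? 'ok' : 'FLAG'), "\n";
}
```

Here '555 1234' is flagged (7 digits), while 'x1234' and '(555) 123-4567' pass with 4 and 10 digits.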

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature

Re: Character Counts
by Anonymous Monk on Apr 07, 2003 at 17:56 UTC
    Thank you all for your input. Here is what I have so far:

    #--------Clean Phone Number
    $newphone = $phone
    $newphone = =~ s/[^0-9]//g;

    #-----Length of cleaned Number
    $phonelength = length($newphone)

    #-----Update Counter and Print Output
    if (length($newphone) != 4 and length($newphone) != 10) {
        # add asterisks
        $emph = "****"
    } else {
        $emph = ""
        # print number
        Print PHONELIST, sprintf "%16d %100s", $phone[1], $abs_filename[2], + "\n";
        Print SUMMARY, sprintf "%16d %12d %4s %100s", %ctr++; $phone, $newphone[1], $emph[2], $phonelength[3], $abs_filename[4] "\n";
    }

    Now my problem is adding/updating a counter for the output.

    Added code tags and indents - dvergin 2003-04-07

      Now my problem is...

      • most of your statements are missing the final semicolon
      • "print" is not supposed to be capitalized
      • there shouldn't be a comma after the filehandle in the print statement
      • you have the wrong sigil on the variable named "ctr" ("%ctr" needs to be "$ctr")
      • you seem to be confused about the usage of sprintf, and how to provide values for placeholders in the format string (those numbers in square brackets next to the variable names don't belong there)
      • setting "$emph" to "****" doesn't do you any good, since you never print any record with that value.

      (whew -- it's all a bit of a mess) Better hit a book or two a little harder... To get you started, here's a cleaned up version:

      $newphone = $phone;
      $newphone =~ s/\D+//g;    # NOT " = =~ " !!
      $phonelength = length( $newphone );
      $emph = ( $phonelength == 4 or $phonelength == 10 ) ? "" : "****";
      # %s rather than %d: phone values are strings and may hold punctuation or leading zeros
      printf( PHONELIST "%16s %100s\n", $phone, $abs_filename );
      printf( SUMMARY "%3d %16s %12s %4s %100s\n", ++$ctr, $phone, $newphone, $emph, $abs_filename );

      And, please look at the "Site How To" and learn about the use of <code> tags when posting actual code -- it really helps.
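
For anyone who wants to try the whole flow without the surrounding record parser, here is a self-contained sketch. Everything in it that is not in the thread is an assumption: the helper name format_row, the sample records, the column widths, and the use of STDOUT in place of the SUMMARY filehandle.

```perl
use strict;
use warnings;

# format_row is a hypothetical helper. It builds one summary line:
# counter, old number, new number, digit count, flag, filename.
sub format_row {
    my ($ctr, $phone, $abs_filename) = @_;
    (my $newphone = $phone) =~ s/\D+//g;      # keep $phone, clean the copy
    my $len  = length $newphone;
    my $emph = ($len == 4 or $len == 10) ? '' : '****';
    # One placeholder per value, in order; %s for anything that is a string.
    return sprintf "%3d %-20s %-12s %2d %-4s %s\n",
        $ctr, $phone, $newphone, $len, $emph, $abs_filename;
}

# Made-up sample records: [ phone field, file it came from ].
my @records = (
    [ '(555) 123-4567', 'a.txt' ],
    [ 'x1234',          'b.txt' ],
    [ '555 1234',       'c.txt' ],
);

my $ctr = 0;
for my $rec (@records) {
    my ($phone, $file) = @$rec;
    print STDOUT format_row(++$ctr, $phone, $file);   # no comma after the filehandle
}
```

The counter is simply a scalar incremented once per printed row; only the third record is flagged, since it cleans up to 7 digits.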

Node Type: perlquestion [id://248634]
Approved by broquaint
Front-paged by diotalevi