The humble phone number. Global, local, extensions, alternates, and sometimes pure garbage: Without data entry restraints there is no telling what you might find in a typical phone number data field. Until now.
The topic of phone number crunching has arisen at the Monastery before, multiple times, with answers and insightful speculations, but thus far all seem to have underestimated the complexity of the unrestrained Beast.
I come bearing loads of international phone number DATA found running rampant in the wilds of the data entry savannas, plus my particular solution to the problem of making sense of it all. From a representative field of nearly 100,000 numbers I distilled a subset of über-patterns and their appearance frequencies.
What follows is my eventual solution for parsing these noisy numbers, a rather brute-force solution that evolved to fit the data at hand for a data set perhaps better suited for analysis by a neural net; my reflections on the nature of data entry, ambiguity, and alternate approaches; and most importantly, a scrambled but meaningful representative data set on which you are free to chew at your leisure -- I'm sure better approaches exist. I invite thoughts, code commentary, and better solutions.
warning: node size ~63k
I am here to chew bubble gum and parse some data...and I'm all out of bubble gum.
-- Me Nada
Contents
Below you will find my commentary, a basic script, and three modules used for my crunching. The example numbers are included below the __DATA__ portion of the script.
Phone numbers, like email addresses, are inherently impossible to prove "correct" merely by parsing. The only way to discover if a phone number is valid is to dial it and see what rings. More generally speaking, of course, there are general formats we expect to see, regardless of whether the number is indeed a valid number. Beyond the International Country Codes, however, numbers are subject to the vagaries and capacities of the host country's network. There are no guarantees, without knowing the rules for every country, what bits of a number apply to an area or province code, municipality, etc. The only parts of a number we can reasonably expect to identify in a globally generic way are:
- International Dial Direct codes (what locals use to dial out of their country)
- International Country Codes (what the world dials to reach a particular country after the IDD)
- The local phone number, including area/province codes and possibly long distance codes
- Extensions (to be dialed after a connection is made)
On top of this we should also expect indications of alternates for numbers, suffixes, and extensions in unconstrained data entry fields.
All is not doom and gloom for the country networks, however. If, for example, you happen to know that a large proportion of your numbers are supposed to be in a ten-digit format then you can use that information to infer information and rules of thumb, especially for parsing alternate suffixes.
1-900-ILOVEYOU: I have made no attempt to parse vanity numbers. First of all, none are represented in this data set. Second, I toyed with the idea and eventually decided that there was too much ambiguity involved with extracting extensions usually indicated by some combination of letters from 'extension', periods, hashes, and whitespace. I can see how distinguishing the difference might be done, but did not implement it since this data has loads of extensions but no vanity numbers.
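To make the extension indicators mentioned above concrete, here is a minimal sketch of peeling a trailing extension off an entry. The helper name `split_extension` and the exact set of indicators it accepts are assumptions for illustration; the routines in the listings below take a different, multi-step route. It also shows why vanity numbers are awkward: any trailing letters risk being read as an extension marker.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the modules in this post: peel a
# trailing extension off an entry. Recognizes "ext", "ext.",
# "extension", a bare "x", or "#", in any letter case.
sub split_extension {
    my $entry = shift;
    if ($entry =~ /^(.*?\d)\s*(?:ext(?:ension)?\.?|x|\#)\s*(\d+)\s*$/i) {
        return ($1, $2);       # (number part, extension digits)
    }
    return ($entry, undef);    # no recognizable extension
}

for my $entry ('234-567-8901 x 2304', '2134560178EXT923', '+1 234 567 8923') {
    my ($num, $ext) = split_extension($entry);
    printf "%-22s => %s%s\n", $entry, $num, defined $ext ? " (ext $ext)" : '';
}
```

Entries with several extensions, or with spaces inside the extension digits, fall through this sketch untouched.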
Finally, there is one unavoidable fact: There are plenty of garbage entries that are either incomplete or incomprehensible even to a human. In a well-controlled universe this garbage would have been caught at the data-entry stage -- even a rudimentary attempt at enforcing validity would clean up much of the garbage. Such is not the case here, however, though those tasked with parsing the result may fervently wish it otherwise. Though no longer in vogue, the GIGO principle ultimately still stands for these cases.
As mentioned above, the original data comes from 100,000 or so international and U.S. domestic numbers entered into unconstrained entry fields. From these numbers I derived meta patterns with which to play:
Pattern Count | Generality
         1503 | single digits (\d), single alphas ([a-z])
         1269 | single digits, single alphas, whitespace (\d, [a-zA-Z], and \s+)
          328 | digit clusters, single alphas, whitespace (\d+, [a-zA-Z], and \s+)
          312 | digit clusters, alpha clusters, whitespace (\d+, [a-zA-Z]+, and \s+)
If extraction were the only goal, then the most general pattern collection at the bottom of that list would be sufficient for this particular data set. However, in this case we just have a raw data field that is supposed to have a phone number in it. Our job is to parse that number -- a more complicated task that presents its own set of challenges. Therefore in the __DATA__ section in the phone.pl script below I have included 1269 example entries that represent patterns derived from letting single alphanumerics (not clusters) float and collapsing whitespace.
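As an illustration of what "letting single alphanumerics float and collapsing whitespace" can mean in practice, the sketch below reduces each entry to a meta pattern and counts how often each pattern occurs. The helper `meta_pattern`, the placeholder letters `d` and `a`, and the second sample entry are assumptions of mine; this is one plausible way to derive such patterns, not necessarily how the counts above were produced.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce an entry to a meta pattern in which every
# digit becomes 'd', every letter becomes 'a', and whitespace runs
# collapse to a single space. Punctuation is kept as-is.
sub meta_pattern {
    my $pat = shift;
    $pat =~ s/^\s+|\s+$//g;     # trim the ends
    $pat =~ s/\s+/ /g;          # collapse whitespace runs
    $pat =~ s/[a-zA-Z]/a/g;     # letters first, so the 'd' placeholders
                                # written next are not rewritten
    $pat =~ s/\d/d/g;           # then digits
    return $pat;
}

my @entries = ('+1 234 567 8923', '+1  234 567 8902', '234-567-8901 x 2304');
my %count;
$count{ meta_pattern($_) }++ for @entries;

for my $pat (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    printf "%3d  %s\n", $count{$pat}, $pat;
}
```

Replacing `\d` with `\d+` (and `[a-zA-Z]` with `[a-zA-Z]+`) in the substitutions would give the coarser "cluster" patterns from the table.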
These are not real phone numbers, except perhaps some by random chance. International country codes, where identifiable, were replaced with a random country code of identical length. 1's and 0's were largely left alone (due mostly to the presence of IDD codes) and the rest of the digits were sequentially overwritten. Text strings have been replaced with nonsense unless they are somehow generically germane (eg "Extension", "PAGER", "email only", " - xxxx", etc). The result is a set of fake but convincing numbers with valid country codes (when present) that each correspond to one of the patterns.
Each number is preceded by a percentage measuring its match frequency in the original data set. Though I did not use this information for parsing purposes, it is instructive to see that the majority of numbers fall into patterns that are reasonable to extract and that indeed, we need not fear for the collective skills of our world's typists. The percentages might also be useful to those seeking their own solutions -- either to form heuristics or to realize when "enough is enough" and be content with their 95%. Note that with a data set this size, percentages of less than 1% are common and can represent a significant number of entries.
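For anyone deciding when "enough is enough", a running total over the sorted percentages answers the question directly. The sketch below uses a hypothetical helper, `patterns_needed`, and the first five percentages from the __DATA__ section; the 40% target is arbitrary.

```perl
use strict;
use warnings;

# Hypothetical helper: given pattern frequencies (in percent, sorted from
# most to least common), report how many patterns must be handled to
# reach a target coverage. Returns undef if the target is never reached.
sub patterns_needed {
    my ($pcts, $target) = @_;
    my $covered = 0;
    my $count   = 0;
    for my $pct (@$pcts) {
        $covered += $pct;
        $count++;
        return $count if $covered >= $target;
    }
    return;
}

# The five most frequent patterns from the __DATA__ section.
my @top = (18.5765, 14.3681, 9.0357, 7.6921, 5.6347);
my $n = patterns_needed(\@top, 40);
print "patterns needed for 40% coverage: $n\n";
```

Feeding it the full percentage column with a target of 95 shows how long the tail of rare patterns is.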
There are three tasks involved with data such as this: extraction, normalization, and parsing. Though logically distinct, in reality they are hopelessly entangled. Much of the noise that gets dropped during normalization, for instance, can briefly serve as clues to the meaning of various parts of a phone number. My end result, therefore, is a series of steps, sequentially cohesive, that are executed in a specific order with each step passing the remnants of its operation to the next. The steps involved are a direct reflection of the nature of this particular data set.
Loosely stated, my approach boils down to the following steps:
- Split entry into multiple numbers, if present.
- Extract phone extensions.
- Remove IDD prefixes (possibly using them to infer upcoming country codes)
- Interpolate alternate suffixes into separate list of complete numbers.
- Extract country codes where present and map to appropriate numbers.
- Remaining data is likely to be the core number for a locale.
Some clarifications are in order. In step #3 I mention removing IDD prefixes -- the sequence of numbers used to dial out of a particular country. These are of little use to someone wanting to dial a number if they do not happen to live in a country with that same IDD. Sometimes in the data there is no '+' to indicate an international number -- sometimes there is merely an IDD, usually some combination of 0 and 1 -- so these codes can be handy to infer the imminent arrival of a country code. I store the IDDs where found, but other than that inference they are of no particular use for my purposes.
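A stripped-down sketch of the IDD idea follows. The helper `strip_idd` is hypothetical and much cruder than `extract_idd` in Listing 1: it treats any leading run of zeros followed by zeros and ones as an IDD, and it makes no attempt to tell an IDD ending in 1 apart from the country code of the USA.

```perl
use strict;
use warnings;

# Hypothetical, simplified helper: treat a leading run of zeros (plus
# any following 0s and 1s) as an IDD, remove it, and leave a '+' in its
# place. Returns (number, idd), or (original entry, undef) if no IDD.
sub strip_idd {
    my $raw = shift;
    my $num = $raw;
    $num =~ s/^[^\d+]+//;     # drop leading noise such as quotes
    $num =~ s/^\+\D*/+/;      # tidy whatever trails a leading '+'
    if ($num =~ s/^\+?(0+[01]*)\D*/+/) {
        my $idd = $1;
        return ($num, $idd) if $num =~ /\d/;   # something must remain
    }
    return ($raw, undef);
}

for my $entry ('011 44 555 666 7777', '0044 555 666 7777', '+1 234 567 8923') {
    my ($num, $idd) = strip_idd($entry);
    printf "%-22s => %-18s idd=%s\n", $entry, $num, defined $idd ? $idd : '-';
}
```

An entry such as '001 234 567 8923' shows the weakness: this sketch swallows the '1' as part of the IDD, which is exactly the case the real routine takes special care over.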
In step #4 I mention interpolating numbers. Sometimes a number might be listed as something like '555-555-6666, 7777, or 8888'. There are three numbers there, all beginning with '555-555'. There are cases such as '555 555 6666-7' that present ambiguity: is the '7' an extension or an alternate ending to another suffix? In my solution that particular example is interpreted as an alternate suffix '6667'. In the original data it was more obvious because these tended to appear as numerically sequential numbers, i.e., adjacent suffixes. The scrambling of the data has destroyed some of its intuitive "look" and might cause you to wonder about my decisions in these ambiguous areas. These decisions are not bulletproof -- at the time they just seemed more likely to be correct.
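The '555-555-6666, 7777, or 8888' case can be sketched in a few lines. The helper `expand_alternates` below is hypothetical and handles only that simple shape: separators are commas, slashes, or the word "or", and each all-digit alternate replaces the same number of trailing digits in the first number. The ambiguous lone-dash case ('555 555 6666-7') is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper: expand alternate suffixes into complete numbers.
sub expand_alternates {
    my $entry = shift;
    # Treat commas, slashes, and the word "or" as separators.
    my ($base, @alts) = split /(?:\s*(?:,|\/|\bor\b)\s*)+/, $entry;
    return ($entry) unless @alts;

    my @numbers = ($base);
    for my $alt (@alts) {
        my $len = length $alt;
        # Swap the last $len digits of the base for the alternate,
        # provided the alternate is all digits and a prefix remains.
        if ($alt =~ /^\d+$/ and $base =~ /^(.+)\d{$len}$/) {
            push @numbers, "$1$alt";
        }
        else {
            push @numbers, $alt;    # treat it as a complete number
        }
    }
    return @numbers;
}

print "$_\n" for expand_alternates('555-555-6666, 7777, or 8888');
```

For the sample entry this yields 555-555-6666, 555-555-7777, and 555-555-8888.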
Step #5 is perhaps the most interesting. I broke down and eventually came to rely on a list of actual, valid Country Codes. Mechanically detecting country codes can go only so far. Think "+44 555 666 7777" vs "+445556667777". With no knowledge of country codes, other than perhaps their typical maximum length of three digits and their Huffman-style encoding (no valid code is a prefix of another), there is no bulletproof way of pulling the country code out of the second example. In addition, the mechanical approach cannot deal with invalid country codes following a '+'. So in the CountryCodes package I provide some routines for pulling valid country codes out of a string of digits; in addition, there is a small routine for grabbing an updated list of codes off of the Net. PhoneParse includes two routines, pull_cc_smart and pull_cc_guess (not used), that illustrate the difference.
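Because no valid country code is a prefix of another, a lookup against the list can simply try the first one, two, and three digits in turn. The sketch below assumes a hypothetical helper, `split_cc`, and a tiny hand-picked subset of the codes from Listing 2; the real module walks a nested hash instead.

```perl
use strict;
use warnings;

# A small subset of valid country codes, for illustration only.
my %valid = map { $_ => 1 } qw( 1 20 212 213 33 353 44 49 7 );

# Hypothetical helper: return (local number, country code), or
# (all digits, undef) if no known code leads the digit string.
sub split_cc {
    my $digits = shift;
    $digits =~ s/\D+//g;                 # keep digits only
    for my $len (1 .. 3) {
        my $cc = substr($digits, 0, $len);
        # Accept the code only if some local number remains after it.
        return (substr($digits, $len), $cc)
            if $valid{$cc} && length($digits) > $len;
    }
    return ($digits, undef);
}

for my $entry ('+44 555 666 7777', '+445556667777', '+999 123') {
    my ($num, $cc) = split_cc($entry);
    printf "%-18s => cc=%-3s num=%s\n", $entry, defined $cc ? $cc : '-', $num;
}
```

Both spellings of the +44 number come out the same, which is the point: the spacing in the entry no longer matters once the list of valid codes does the work.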
As I mentioned, I expect that the dataset is the most valuable contribution here. My code is not optimized, tricky, or beautiful -- it is merely a straightforward evolution of a solution from data with pollution. (Am I a poet or what?)
The test script phone.pl is a simple harness around the data. For each line it will print the raw entry and extracted phone numbers, separated by a colon. In cases where multiple numbers were extracted, each additional number appears on a line of its own below the first number found.
All code and data licensed under the same terms as Perl itself.
Enjoy,
Matt
Listing 1. PhoneParse.pm
package PhoneParse;
# Attempts to parse international phone numbers found "in the
# wild" where operators entered the numbers with no attempt at
# prior format enforcement. Only handles International Dial
# Direct codes, Country Codes, extensions, and the numbers
# themselves. No attempt is made to identify "area codes" if
# present in the number.
use strict;
use vars qw( @EXPORT $DEBUG );
use base qw(Exporter);
@EXPORT = qw( parse_phone );
use Carp;
# Home grown
use PhoneNumber;
use CountryCodes qw( is_country_code pull_country_code );
sub parse_phone {
# Attempt to normalize feral phone numbers. Lots of sequential
# calls here, where the remnants of the last call are passed
# along to the next routine.
my $entry = shift;
return () if $entry =~ /^\s*$/;
# Normalize dividing cues that look like attempts to indicate
# alternate phone listings
$entry = normalize_alts($entry);
# Using those cues, crack into multiple phone numbers
my @raw_numbers = split_nums($entry);
my @numbers; # storage bin.
my @first; # cache for 1st country code and number
foreach my $raw_number (@raw_numbers) {
# Normalize oddball extension indicators
$raw_number = normalize_exts($raw_number);
# It helps to strip parens, as opposed to other noise, at
# this point. Sequential cohesion, anyone? Bleah.
$raw_number = zap_parens($raw_number);
# Grab the extensions.
my @exts;
($raw_number, @exts) = extract_exts($raw_number);
# Trim the fat from the ends
$raw_number = zap_border_noise($raw_number);
# Yank any International Dial Direct codes, that is, codes
# used to dial *out* of various countries. These are of no
# interest, although they can serve as an indicator of an
# upcoming country code.
my $idd;
($raw_number, $idd) = extract_idd($raw_number);
# Extensions have been clipped and IDD's plucked. Now look
# for stealth dash alternates -- that is to say, a single
# dash (normally ubiquitous) that is actually indicating
# alternate numbers. This will tank if the lone dash was
# meant to indicate an extension. Oh well.
$raw_number = normalize_dashslash($raw_number);
# Now we can interpolate over alternate suffixes, if any, and
# crack each number.
my @raws = interpolate($raw_number);
foreach my $raw (@raws) {
my($num, $cc) = extract_country_code($raw, $first[1]);
# Cache first found country code and number
@first = ($num, $cc) if !@first && $cc;
# Propagate first country code if appropriate. If the root
# numbers don't have the same digit count then we do not
# propagate the cc.
my $native_cc = 1;
if (!$cc && $first[1]) {
if ($num =~ tr/0-9// == $first[0] =~ tr/0-9//) {
$cc = $first[1];
$native_cc = 0;
};
}
$native_cc = 0 if @raws > 1 && $raw ne $raws[0];
push(@numbers, PhoneNumber->new(
num => $num,
cc => $cc,
ext => \@exts,
idd => $idd,
));
$numbers[-1]->_native_cc($native_cc);
}
}
# All done
@numbers;
}
sub normalize_dashslash {
# Normally dashes cannot tell us much, but if they are towards
# the end and the *only* use of a dash on a relatively long
# number, it's reasonable to infer that the dash is
# indicating alternative suffixes for a number. In this case
# just replace the dash with something more obvious: a slash.
my $raw_number = shift;
my $dash_count = $raw_number =~ tr/\-//;
return $raw_number unless $dash_count == 1;
my $total_dcount = $raw_number =~ tr/0-9//;
my($left, $right) = split(/\s*\-\s*/, $raw_number);
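# Peel the trailing digit run off the lefthand side; whatever
# precedes it is the prefix. $left is reused, not redeclared.
my $pre;
($pre, $left) = $left =~ /(.*?)(\d+)$/;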
my $pre_dcount = $pre =~ tr/0-9//;
return $raw_number unless $pre_dcount;
my $left_dcount = $left =~ tr/0-9//;
my $right_dcount = $right =~ tr/0-9//;
my $r_pct = $right_dcount/($pre_dcount + $left_dcount);
# If there are lots of digits, proceed. Also proceed on smaller
# digit streams if the righthand chunk is "small enough" relative
# to everything left of the dash, guessed and gollied here at
# around 32%. This avoids simple numbers such as +dd dddd-dddd
# and +ddd ddd ddd-ddd
if ($total_dcount > 12 || $r_pct <= 0.32) {
# We have an "interesting" number
if ($right_dcount == $left_dcount || $right_dcount == 1) {
# Balanced lengths around a lone dash; probably an
# alternate ending. Otherwise an rlength of 1, which is an
# alt or an ext but we'll presume alt.
return join('/', "$pre$left", $right);
}
}
# No suspicious dashes
$raw_number;
}
sub zap_border_noise {
# Zap leading and trailing non-numerics (but not +)
my $raw_number = shift;
$raw_number =~ s/^[^\+\d]+//;
$raw_number =~ s/\D+$//;
$raw_number;
}
sub zap_parens {
# Zap parentheses
my $raw_number = shift;
$raw_number =~ s/[()]//g;
$raw_number;
}
sub extract_idd {
# Clip International Direct Dial Codes if they were entered
# instead of country codes. Specifically we go after
# combinations of leading zeros followed by ones: 00, 011,
# 0011, 010, etc. This does not cover all IDDs, but gets many
# of them. Oftentimes the country code will remain next in
# line. Note that we take special care *not* to clip a mere
# leading '1' or '001', the CC for the USA.
my $raw_number = shift;
my $original_number = $raw_number;
# Remove start noise, such as quotes and whitespace
$raw_number =~ s/^[^\d\+]+//;
# Isolate a '+' if present with dashes, additional pluses, or
# any other non-numeric cruft following it.
$raw_number =~ s/^\+\D*/\+/;
my $idd;
if ($raw_number =~ s/^\+(0+1{2,})/\+/) {
# Look for a + followed by zeros and at least two 1's, and
# replace with a '+'. By far the most common occurrence of
# this is '+ 011'.
$idd = $1;
}
elsif ($raw_number =~ s/^\+(0+)1/\+/) {
# Check for a + followed by zeros and a single 1. Here we
# also check the remaining digit count in order to guess
# whether we're dealing with the Country Code of the USA (1)
# or an IDD from within somewhere else.
my $digit_count = $raw_number =~ tr/0-9//;
if ($digit_count == 10) {
# CC for U.S.A.
$idd = $1;
$raw_number =~ s/^\+/\+1/;
}
else {
# IDD from within somewhere else?
$idd = "${1}1";
}
}
elsif ($raw_number =~ s/^\+(0+[01]*)/\+/) {
# Wrap-up for mandatory '+' sightings: Replace a plus
# followed by zero followed by any combination of 1's and 0's
# with just a '+'.
$idd = $1;
}
elsif ($raw_number =~ s/^(0+[01]*)/\+/) {
# Infer '+' for remaining 01 combinations with no '+', in
# particular '00'.
$idd = $1;
}
else {
# No idd found
return $original_number;
}
# Booty
$raw_number =~ /\d/ ? ($raw_number, $idd) : $original_number;
}
sub normalize_exts {
# attempt to normalize odd ext indicators to a single 'x'
my $raw_number = shift;
$raw_number =~ s/[\#\*]+\s*(\d+.*)$/x$1/;
$raw_number;
}
sub extract_exts {
# Extract extensions. Multiple extensions are assumed to be
# indicated with slashes of some sort (hence the earlier
# normalizing attempt)
my $raw_number = shift;
return $raw_number unless $raw_number =~ /\D/;
my @exts;
if ($raw_number =~ s/[xX]+\D*(\d+.*)$//) {
my $ext = $1;
$ext =~ s/[^\d,\/\\\|]+//g;
@exts = split(/\D+/, $ext);
}
# clean up non-numeric extension debris
$raw_number =~ s/\D+$//;
($raw_number, @exts);
}
sub normalize_alts {
# attempt to normalize delimiters for alternate numbers or
# extensions
my $entry = shift;
return $entry unless $entry =~ /\D/;
$entry = lc($entry);
$entry =~ s/\s+or\s+/\//g;
$entry =~ s/\s*[,;\|]\s*/\//g;
$entry;
}
sub extract_country_code {
# Yank or infer country codes if possible
my($raw, $cc_known) = @_;
my($num, $cc);
if ($raw !~ /^\+/) {
# No '+', see if the first number group looks like a country
# code. If so make sure there are enough digits in the number
# to make sense with a country code.
if (!$cc_known && ($raw =~ /^(\d+)[\s\-]+\d+/
&& is_country_code($1))
|| $raw =~ tr/0-9// > 10) {
($num, $cc) = pull_cc_smart($raw);
}
else {
# No country code to pull. Just strip non nums.
$num = $raw;
$num =~ s/\D+//g;
}
}
else {
# There was a leading '+' so we'll have a go at pulling a
# country code, even if there is no valid one present.
($num, $cc) = pull_cc_smart($raw);
}
# Booty
($num, $cc);
}
sub pull_cc_smart {
# Yank country codes by scanning for valid country codes
my $raw_number = shift;
$raw_number =~ s/\D+//g;
my($num, $ccode) = pull_country_code($raw_number);
($num, $ccode);
}
sub pull_cc_guess {
# Attempt to mechanically yank country codes without any
# information on what represents a valid cc.
my $raw_number = shift;
my $pat = qr/^\s*\++[\s\+\-]*(\d+)/;
my($ccode) = $raw_number =~ /$pat/;
$raw_number =~ s/$pat// if defined $ccode;
$raw_number =~ s/\D+//g;
($raw_number, $ccode);
}
sub split_nums {
# Attempt to detect and split multiple numbers.
my $raw_number = shift;
my @numbers;
if ($raw_number =~ tr/0-9// > 18) {
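if ($raw_number =~ /\+[^\+]+[^\+\s]\s*\+/) {
# Attempt to split on '+' in cases where there
# are multiple country codes. The lookbehind keeps the
# last character of the preceding number out of the
# delimiter, so no digit is lost in the split.
@numbers = split(/(?<=[^\+\s])\s*\+/, $raw_number);
map($numbers[$_] = "+$numbers[$_]", 1..$#numbers);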
}
else {
# Otherwise go for slashes and length ratios. We
# guess/golly chunk lengths of 9 digits or larger.
my @chunks;
($numbers[0], @chunks) = split(/\s*[\\\/,]+\s*/, $raw_number);
foreach my $chunk (@chunks) {
my($ext_guard) = $chunk =~ /^([^x]+)/i;
my $chunk_digits = $ext_guard =~ tr/0-9//;
if ($chunk_digits >= 9 && $numbers[-1] =~ tr/0-9// >= 9) {
push(@numbers, $chunk);
}
else {
$numbers[-1] .= "/$chunk";
}
}
}
# Check for numbers such as +1 555 555 5555 1 444 444 4444
# This is some hard-coded US-centric whack for sure.
if (@numbers <= 1) {
my @chunks = split(/[\s\+\-]+1[\s\+\-]+/, $raw_number);
shift @chunks if $chunks[0] =~ /^\s*$/;
if (@chunks >= 2) {
@numbers = ();
foreach (@chunks) {
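# A short leading chunk is pushed as well: there is nothing yet
# to append it to, and $numbers[-1] on an empty array is fatal.
if (tr/0-9// >= 8 || !@numbers) {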
push(@numbers, "+1 $_");
}
else {
$numbers[$#numbers] .= " +1 $_";
}
}
}
}
}
# Booty
@numbers ? @numbers : $raw_number;
}
sub interpolate {
# Many times multiple suffixes, rather than whole numbers, are
# indicated by slashes, etc. We take these suffixes and join
# them to their common prefix.
my $raw_num = shift;
# Split on our chosen delimiters ( '/' or '\')
my($base, @frags) = split(/\s*[\\\/]+\s*/, $raw_num);
return $raw_num unless @frags;
# Pull the digits from the first alternate in order to capture
# the digit count
my($nchunk) = $frags[0] =~ /(\d+)/;
return $raw_num unless defined $nchunk;
# Check to make sure our prefix isn't a shorter stub. If it is,
# interpolation makes no sense.
my $base_dcount = $base =~ tr/0-9//;
my $chunk_dcount = $nchunk =~ tr/0-9//;
return $raw_num if $base_dcount <= $chunk_dcount;
# Using that length, pull the root number from the string
# containing that root plus the *first* alternative.
my($prefix) = $base =~ /(.*)\d{$chunk_dcount}$/;
# Entirely separate numbers if no prefix.
return $raw_num unless defined $prefix;
# Apply the prefix to the remaining alternatives. We drop
# duplicates. In the real world this probably meant that the
# alternative presented was an extension rather than an
# alternative. Oh well.
my @interpolated;
my %seen;
foreach ($base, map("$prefix$_", @frags)) {
next if $seen{$_};
push(@interpolated, $_);
++$seen{$_};
}
# Booty
@interpolated;
}
1;
Listing 2. CountryCodes.pm
package CountryCodes;
use strict;
use Exporter;
use base qw(Exporter);
use vars qw( @EXPORT @EXPORT_OK );
@EXPORT = qw( is_country_code pull_country_code );
@EXPORT_OK = qw( initialize_from_net );
use Carp;
### Initialization
my @Icodes = qw(
1 20 212 213 216 218 220 221
222 223 224 225 226 227 228 229
230 231 232 233 234 235 236 237
238 239 240 241 242 243 244 245
246 247 248 249 250 251 252 253
254 255 256 257 258 260 261 262
263 264 265 266 267 268 269 27
290 291 297 298 299 30 31 32
33 34 350 351 352 353 354 355
356 357 358 359 36 370 371 372
373 374 375 376 377 378 380 381
385 386 387 389 39 40 41 420
421 423 43 44 45 46 47 48
49 500 501 502 503 504 505 506
507 508 509 51 52 53 54 55
56 57 58 590 591 592 593 594
595 596 597 598 599 60 61 62
63 64 65 66 670 672 673 674
675 676 677 678 679 680 681 682
683 684 685 686 687 688 689 690
691 692 7 808 81 82 84 850
852 853 855 856 86 870 871 872
873 874 878 880 881 886 90 91
92 93 94 95 960 961 962 963
964 965 966 967 968 970 971 972
973 974 975 976 977 98 992 993
994 995 996 998
);
my $Fresh_Codes_url = 'http://kropla.com/dialcode.htm';
my(%Icodes, %Icodes_by_length, %Icodes_huff);
initialize();
sub initialize {
# Set up data structures -- handy when we want to update with
# fresh codes off the Net.
if (@_) {
@Icodes = @_;
}
%Icodes = %Icodes_by_length = %Icodes_huff = ();
grep(++$Icodes{$_}, @Icodes);
foreach my $code (@Icodes) {
my $l = length $code;
$Icodes_by_length{$l} ||= [];
push(@{$Icodes_by_length{$l}}, $code);
}
foreach my $l (keys %Icodes_by_length) {
@{$Icodes_by_length{$l}} = sort @{$Icodes_by_length{$l}};
}
foreach my $code (@Icodes) {
my @digits = split(//, $code);
my $str = join('}{', @digits);
eval "++\$Icodes_huff{$str}";
}
}
### Accessors
sub is_country_code {
my $code = shift;
return unless $code;
$Icodes{$code};
}
sub country_codes_of_length {
my $l = shift;
return unless $Icodes_by_length{$l};
@{$Icodes_by_length{$l}};
}
sub random_country_code_of_length {
my $l = shift;
return unless $Icodes_by_length{$l};
$Icodes_by_length{$l}[rand(scalar @{$Icodes_by_length{$l}})];
}
sub pull_country_code {
# Given a string of digits, pull a matching country code
# from the beginning and return the resulting code and
# remaining digits.
my $number = shift;
return unless $number;
croak "Non numeric data\n" unless $number =~ /^\d+$/;
my @digits = reverse split(//,$number);
my @pulled;
my $ptr = \%Icodes_huff;
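while (@digits) {
my $digit = pop @digits;
unless ($ptr->{$digit}) {
# Not part of any valid code: put the digit back so it
# stays with the local number instead of being dropped.
push(@digits, $digit);
last;
}
push(@pulled, $digit);
$ptr = $ptr->{$digit};
last unless ref $ptr;
}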
my $cc = join('', @pulled);
my $left = join('', reverse @digits);
return $number unless $left =~ /\d/;
return($left, $cc);
}
### Get new country codes from the Net
sub initialize_from_net {
# earlier versions of TE will not work for this site
require LWP::Simple;
eval "use HTML::TableExtract 1.08";
die "Oops: $@\n" if $@;
my $html = LWP::Simple::get($Fresh_Codes_url);
die "Could not fetch $Fresh_Codes_url\n" unless defined $html;
my $te = HTML::TableExtract->new
( headers => ['Country\s+Code', 'Country\s+Name'],
br_translate => 1,
);
$te->parse($html);
my(@ccodes, %seen);
foreach my $row ($te->rows) {
my($cruft, $country) = @$row;
my($code) = $cruft =~ /^\s*(\d+)/;
next unless defined $code;
next if length $code > 3;
next if $seen{$code};
push(@ccodes, $code);
++$seen{$code};
}
initialize(sort @ccodes);
}
1;
Listing 3. PhoneNumber.pm
package PhoneNumber;
# Simple class to store various bits of a phone number and roll
# them out as a string when needed.
use strict;
use Carp;
my @Valid_Parms = qw( num idd cc ext );
my $Ppat = join('|', @Valid_Parms);
sub new {
my $class = shift;
my %parms = @_;
foreach (keys %parms) {
# Group the alternation so that ^ and $ anchor every parameter name
croak "Invalid parameter '$_' passed.\n" unless /^(?:$Ppat)$/o;
}
my $self =\%parms;
$self->{ext} ||= [];
$self->{_native_cc} = 1;
bless $self, $class;
}
sub number {
my $self = shift;
if (@_) {
$self->{num} = shift;
delete $self->{_chunked};
}
$self->{num};
}
sub idd {
my $self = shift;
@_ ? $self->{idd} = shift : $self->{idd};
}
sub _native_cc {
# hack for scrambling original dataset
my $self = shift;
@_ ? $self->{_native_cc} = shift : $self->{_native_cc};
}
sub country_code {
my $self = shift;
@_ ? $self->{cc} = shift : $self->{cc};
}
sub extensions {
my $self = shift;
if (@_) {
@{$self->{ext}} = @_;
}
@{$self->{ext}};
}
sub chunked_number {
# Regurgitate a phone number with a 4 digit grouping last,
# preceded by 3 digit groups prior to that.
my $self = shift;
my $num = $self->number;
return unless defined $num;
if (!$self->{_chunked}) {
# Optimize for chunking
$num = reverse $num;
my @tphn;
if ($num =~ s/^(\d{1,4})//) {
push(@tphn, $1);
}
push(@tphn, $num =~ /(\d{1,3})/g);
# Undo reversals
grep($_ = reverse, @tphn);
@tphn = reverse @tphn;
# Cache
$self->{_chunked} = \@tphn;
}
@{$self->{_chunked}};
}
sub as_string {
# Attempt some nice formatting
my $self = shift;
my $str;
my $icode = $self->country_code;
$str = "+$icode " if $icode;
$str .= join(' ', $self->chunked_number);
my @exts = $self->extensions;
$str .= (' x ' . join('/', @exts)) if @exts;
$str;
}
1;
Listing 4. phone.pl
#!/usr/bin/perl
use strict;
use FindBin;
use lib $FindBin::Bin;
use PhoneParse;
my $Show_Line_Count = 1;
my $col1w = 35;
while (<DATA>) {
chomp;
s/^\s*\S+\s+//; # clip pct
next if /^\s*$/;
my $entry = $_;
our $attempt;
++$attempt;
printf("%4d. ", $attempt) if $Show_Line_Count;
printf("%${col1w}s : ", $entry);
# Main voodoo
my @numbers = parse_phone($entry);
foreach (0 .. $#numbers) {
if ($_ > 0) {
printf("%6s", ' ') if $Show_Line_Count;
printf("%${col1w}s ", ' ');
}
print $numbers[$_]->as_string, "\n";
}
}
__DATA__
18.5765% +1 234 567 8923
14.3681% 234-567-8923
9.0357% +21 30 450 6789
7.6921% +21 10 3456 7189
5.6347% +20 1 3456 7809
4.1844% +00 121 340 5670
3.7609% +23 1456 71 8000
2.7520% (203) 456-7000
2.1150% +203 4567892
2.0478% +01 2345 6789
1.5463% +21 134 56 7892
1.4852% +231 45 6789
1.2932% +23 4 560 0070
1.2176% +23 11 4516078
0.9285% +23 4 567 8000 ext 9000
0.8841% +23 1 45 67 1819
0.8493% 0231
0.6802% +010 231 4516
0.6658% +20 1 3456 07890
0.6550% +21 3 45 16 71 89
0.6202% +234 5678 9234
0.6094% 0123456107
0.5938% 234-567-8901 x 2304
0.5302% +21 0134 567189
0.5122% +02 3 4516789
0.4775% +21 134 567 892
0.4547% +20 034 516780
0.4379% +234 1 5601700 ext 0189
0.4151% +1 234 567 8902 ext 3001
0.4091% +23 41 5611 000
0.3035% +20 1 03456718
0.2903% +02 34156789
0.2759% +21 30 405 111
0.2735% +12314567892
0.2699% +020 3 456 7089
0.2627% +213 45 67 8901
0.2567% +234 1-5601700 80192
0.2507% +21 0345 678 902
0.2207% +234 506 78 9020
0.2027% +20 31 456171
0.2015% +21 10 345 61789
0.2015% +210103456011
0.1943% +21 10 34111567
0.1931% +23 04 50016 708
0.1871% 203 456-0780
0.1871% +1 234-567-8192
0.1871% +23 41 560 78 90
0.1727% 1-234-567-1189
0.1703% +231 41 567 8923
0.1548% +23 41 5601711 ext 1089
0.1428% +21 345 67 100
0.1404% +23 040 5678 9010
0.1344% +23 4 56789 2340
0.1320% +23 145678900
0.1308% +21 341 5 60781
0.1296% +21 345 67 89011
0.1176% 203.456.7018
0.1140% +21 30 40 5678
0.1128% +23 45 6789 2034 ext. 1156
0.1128% +234 567 08 09 ext 2345
0.1092% +213 1 450067 ext 1008
0.1080% +21 030 4056781
0.1008% +21 134 56 78 90
0.0996% +23 4 516 78 92
0.0972% +1 234 561-7892
0.0924% + 21 10 341 5067
0.0912% +234 51 6701111
0.0900% +21 0300 405 6780
0.0876% +234 51 6708111 ext 9111
0.0852% +234 51670018 ext 9210
0.0852% +230 4 56078 923
0.0840% 020 304 5678
0.0816% +23 40 5678-9200
0.0804% +21 0345678902
0.0780% +23 1 40115 ext 600
0.0780% +23 1 45678 ext. 9023
0.0768% +234 5 67 8920
0.0756% +230 4 50116789
0.0756% +1 213 456 1789 X2300
0.0732% +234 5 67 111
0.0696% +203 450 6111 ext 1007
0.0684% +21 010 3456 789
0.0660% +2 01 3041506
0.0648% +231 4567892 ext. 340
0.0648% (213)456-1178
0.0648% +213 1 450067
0.0636% +01 23 45 67 89
0.0624% +21 3415 60117
0.0600% +2011 340056
0.0600% +0123451678
0.0600% +23 1 4567800 ext. 901
0.0588% +21 345 607809 ext. 203
0.0588% + 1 234 501 6078
0.0588% + 21 010 341 5016
0.0576% +1-231-456-7189
0.0564% +23 10 45067801 ext.9023
0.0564% +23 11 4150 6000 ext 7819
0.0564% +23 04 51101 ext. 6789
0.0552% +234 51670018
0.0540% +231 41 5116 789
0.0540% +1 234 567 8902 ext. 3456
0.0540% +2034 501 111
0.0516% +23 141 506 7000 ext. 8092
0.0504% +234 5111611 ext 7189
0.0504% +234 1 506 7 819
0.0504% +234 1 5617180
0.0492% 01213
0.0492% + 23 04 5110 6007
0.0480% +234 567892 ext 3456
0.0480% +23 01 45 67 18 92
0.0468% 0121 304 5607
0.0468% (203) 456 0781
0.0444% +231 4 5678 9123
0.0444% 23-456-0700
0.0444% +23 45106 718
0.0444% +23 4567 89230 ext 111
0.0432% 100
0.0420% +23 1 4567 89 02
0.0420% 23-145-678-9200
0.0420% +23 1 4567800 ext. 9234
0.0420% +23 045110 6718
0.0408% +230 456078
0.0396% 23-4-567-8911
0.0372% +1 234 5678923
0.0360% +020-3045167
0.0360% +23 456 780 00 ext 10
0.0360% +23 0 141 567 8923
0.0348% +23 4567 89 000
0.0348% +23 04 506789 20
0.0348% 2304 5000
0.0336% +234 1 5617180 ext. 9010
0.0336% +23 0456 7892 31
0.0336% +21-10-3456789
0.0336% +02 341 5678
0.0336% +0121 304 5067
0.0312% +23 01456 718000
0.0312% +23 45607892 ext. 3401
0.0300% +231 45678 9123
0.0300% +20 341 56789
0.0300% 201-345-6700 ext. 892
0.0288% +23 11451 6100
0.0288% +213 45 678923
0.0288% +1 23 4567 8092
0.0288% + 23 1 45 06 07 81
0.0288% +234-56078902
0.0276% +21 341 560 70089
0.0264% +23 1 4567800 ext. 11923
0.0264% +23 40 56789-110
0.0264% +2345 6007892 ext. 100
0.0252% 01203410567
0.0252% +1 (203) 456-7892
0.0252% + 203 456 0789
0.0252% +231 415116718
0.0252% 020 3456 0170
0.0240% +21-10-341-5678
0.0240% +21 10 34 56 708
0.0240% +1 231 4 56 1789
0.0240% +23 40 5678923 ext. 1401
0.0228% +203-4516-7819
0.0228% +23 04 51101 ext. 601
0.0228% +234 5 678 92 31
0.0228% +2301
0.0216% 123-1456
0.0216% +23 10 45067801 ext. 9023
0.0216% +203-456 0708
0.0216% + 23 1 4567 8923
0.0204% +23 04 5110 ext. 6789
0.0204% +231 4567892 ext. 3456
0.0204% 01234 516789
0.0204% +2 031 456789
0.0192% +23 (0)40 5617 8902
0.0192% +20 1 3456 710
0.0192% 203- 456-7892
0.0192% +20 1 345617
0.0192% +234 567 80923
0.0192% +2 03 451 6708
0.0192% +23 0145 67 89 20
0.0192% +23 4 05 6789 2345
0.0180% + 21 10 3045678
0.0180% +23 411 567892 13
0.0180% +23 4 5678920 ext 3401
0.0180% +23 451 67890 11
0.0168% +21 3 45 167 892
0.0168% +23 10 45067801-9023
0.0168% +23-04-5110-6708
0.0168% +23 415678920 ext. 3415
0.0168% +1 234 567 8923 x4500/IR 61
0.0168% 203-456-7000 x 101
0.0168% 213-456-0178 x911
0.0168% +21 10 304
0.0156% +23 01456 71 8921
0.0156% +230 4567
0.0156% +23 40 5678 -9234
0.0156% +02 3405670
0.0156% 231/456-7890
0.0156% 234-567-8921 x3415
0.0156% +203-4516 7892
0.0156% +23 141 560 78 92
0.0156% +2110341 5678
0.0156% +21-10-341 5601
0.0156% +23 141 506 7000 ext 8092
0.0156% 010 2345167
0.0144% 23456789
0.0144% +234 56789
0.0144% +234 1 5067 801
0.0144% +23 4567 89-2345
0.0144% 203-456-7189 x 10
0.0144% +23 45 671-8923
0.0144% +2 01 03041560
0.0132% +231-456-7892
0.0132% +23 04 5110.6789
0.0132% +23 04 51101 6789
0.0132% +234 506 781
0.0132% +23 45067 80921
0.0132% +011 23 40 5678 1092
0.0132% +21 31 4506 00710
0.0132% +23 4567 809 0
0.0132% +1 000 000 000
0.0132% +234 567 8923 405
0.0132% +20 31 45678
0.0132% 201-345-6700 x18
0.0120% +23 456 708900 ext. 2301
0.0120% +23 14 1 567 8923
0.0120% +20-34-567891
0.0120% +23 (0)405 678 9100
0.0120% + 21 0 341 5607
0.0120% +2311 4516700
0.0120% +23 1 4567 ext. 8923
0.0120% 234-567-
0.0120% 2134560178EXT923
0.0108% 234-567 8092
0.0108% 1 213 451 6780
0.0108% +1 231456 7811
0.0108% + 231 4567 1892
0.0108% + 23 040 5678 1923
0.0108% +20 3 4516-7892
0.0108% +21 0 10 341 5011
0.0108% +23 4567 8000 ext 9020
0.0108% 231.456.0789 x10
0.0108% +001 231 456 7892
0.0108% +23 4560 100
0.0108% 234-567-8923 Ext. 10
0.0108% + 23 1456 718923
0.0108% 2034567891ext11
0.0108% +23 1456 789230 ext. 4506
0.0096% +23 11 40567891 ext 2340
0.0096% +23 45 678 9230 10
0.0096% +01-231-456-7189
0.0096% +0021345678923
0.0096% +2304 5110 6789
0.0096% + 01 23 40 56 78
0.0096% +21 3 110 451 67
0.0096% +231 415 116 789
0.0096% 2034567
0.0096% +203 4 561171 ext 181
0.0096% +23.40.5678-9230
0.0096% + 23 1456 789 234
0.0096% +234 5 600 7890 ext 120
0.0096% +203 - 4516 7892
0.0096% +21 10 3456 0
0.0096% + 21 3 45 16 7892
0.0096% +20 1 34 56 000
0.0096% +231 450101 ext. 600
0.0096% +23 4 567 8000 ext 000
0.0084% +23-4567 8923
0.0084% +23 04 51101 ext. 06789
0.0084% +21-0345-678902
0.0084% + 21 34 567 89 23
0.0084% +21 10 345 6700 ext. 8900
0.0084% +23 456 70 0 8900
0.0084% +231456 789201
0.0084% + 20 1 3456789
0.0084% 01234 56 7809
0.0084% +231 456 78 92
0.0084% +23 4 50016 718
0.0084% +231 41 511 67 18
0.0084% 234-567-8902 ext 103
0.0084% +1 234 501
0.0084% +23 1 4567
0.0084% +23 045 6780 9123 ext. 4005
0.0072% 213/4567018
0.0072% +23 4567 89 21
0.0072% +21 134-567892
0.0072% +23 45 670 0000 ext. 00080
0.0072% +234 1-5601700 -89234
0.0072% +203 4516-7892
0.0072% +21 0300 4056780
0.0072% +20 34 56 7 892
0.0072% +023405607
0.0072% +23 40 560 710 89
0.0072% + 23 1 4560 708
0.0072% +2(3456)708902
0.0072% +23 04 5110 - 6708
0.0072% +23 1 4567800 ext. 9 234
0.0072% +234 51 11 61 781
0.0072% 234-50-67-89-213
0.0072% 023-4567892
0.0072% +21 345 607809 ext 234
0.0072% +21 30 405 0
0.0072% +23 145 67 89 0
0.0072% +23 40 5678 9200 3456
0.0072% + 21 3 45167892
0.0072% +23 1 45 678923
0.0072% +23 456 780 00 ext 0
0.0072% +213 41 56 07 89
0.0072% +21 0 345 678 912
0.0072% +1 234 5671
0.0072% +23 4 567 0891 ext.1231
0.0072% +23 0145 678911 ext 234
0.0072% +21 345 607809 0000
0.0072% 234-567-8921 x 3
0.0072% 01 23 45 61 17
0.0072% +21 134 567 0
0.0060% +23 01456 789 110
0.0060% +23 40 560710-11
0.0060% +1 234 567 8923 x4567/8923
0.0060% +23 1 0400 516 789
0.0060% +23 4567-8912
0.0060% +23 141 567 ext 1811
0.0060% 21-30-411-1506
0.0060% 2134516789x213
0.0060% +00 23 141 506 7892
0.0060% +02 341 561 78
0.0060% +21 31 4506 78 91
0.0060% +20034 501106
0.0060% 230 0451
0.0060% 00-234-5-678-9234
0.0060% +23 04567892314
0.0060% +234 567892 ext 34
0.0060% +23 1 4567800 ext. 0-920
0.0060% +1 2314567189
0.0060% 01234 567 892
0.0060% +23-04-51106780
0.0060% +21-345-607891
0.0060% +2130 405 67 89
0.0060% 0121 3456708
0.0060% +234 506708 ext. 9203
0.0060% +234 1 506 789
0.0060% 1234567081ex109
0.0060% +23 141 560 -1789
0.0060% +203 - 451 6789
0.0060% +21 345-6 78923
0.0060% +2 013456710
0.0060% 231456-7892
0.0060% +00 21 30 411 5167
0.0060% + 23 4567 8921
0.0060% +234 1 506 07181
0.0060% +21 30 4100
0.0060% +2 034 5678920 ext. 3456
0.0060% +1 234
0.0060% +23 141 567
0.0048% (230)456-7800 #9023
0.0048% +23 141 567 0 8923
0.0048% +1 231.456.7892
0.0048% +231 41 567
0.0048% X-2345
0.0048% +02345 161 789
0.0048% +23 1456 78 92304
0.0048% +23 1 451161 ext 789
0.0048% +23 04 5110 6789-2341
0.0048% +23 045 6780 9123 ext.4005
0.0048% +234 1561 70892
0.0048% +231 4151167891
0.0048% +0234 56789231
0.0048% +234 1 5601700 ext 89020
0.0048% +23 4 567-8923
0.0048% +21 030-4056708
0.0048% + 23 4 567 00 18
0.0048% -1231
0.0048% + 213 451-6708
0.0048% 23-1456-789213
0.0048% 0234 561789
0.0048% +23 10 450678019102
0.0048% + 234 1 5067801
0.0048% +00231415067819
0.0048% + 21 134 567108
0.0048% +23141 567 8901
0.0048% +21 (0)10 304 5678
0.0048% +21 30-4056178
0.0048% +23 4567 8192/3145
0.0048% +23 41 56789213 ext.104
0.0048% +1 203 456 7819/1023
0.0048% +21.10.3004156
0.0048% +2 0341 56780
0.0048% +231 451 678923
0.0048% +21 345 67 0
0.0048% +234 1 567 0892 0345
0.0048% 012034567
0.0048% 002301045061
0.0048% +21-3-45167089
0.0048% + 23 04 516789
0.0048% 23 01 4567 8923
0.0048% +011 23 141 567 8902
0.0048% +23 01 45 67 8912
0.0048% +21 0341-560789
0.0048% +234 567089 ext. 23
0.0048% + 2 03 4516 7892
0.0048% 0
0.0048% + 201 3456789
0.0048% 213-456-7089 Ext 10
0.0036% + 1 231-456-7892
0.0036% +21 10 345678902
0.0036% +231 41 567 89 02-00
0.0036% +23 04 51101 ext. .6781
0.0036% +1 234 567 8923x4501
0.0036% +23 4 567 0891 ext 1213,4516
0.0036% 231--456-1789
0.0036% +23 (0) 40 5678 9123
0.0036% +23 141-567-8902
0.0036% +213 41 560078 ext 1902
0.0036% 231-456-7891.
0.0036% +23 1 45610 ext 7892
0.0036% 213-451-11111
0.0036% +23 04 51101 ext 6107
0.0036% +23 4561 78100 19
0.0036% +23045607
0.0036% +23 41 516 7000 ext 8092
0.0036% 231-456-789002
0.0036% +234 1 5601700 89123
0.0036% +23-141-506 7892
0.0036% + 23 14567800
0.0036% +23 456 789200 34560
0.0036% +23 1 456 10 7892
0.0036% + 23 0141 567 8900
0.0036% +23 4567 xxxx
0.0036% 203-45-6789
0.0036% 23 141 567 8923
0.0036% +234 1 567 8902 ext 3451
0.0036% +23 1 4506000 ext.7892
0.0036% + 23 04 56171890
0.0036% +234 56 78923 ext. 111
0.0036% +213456 78912
0.0036% +21 10 345....
0.0036% 203-456-7189 ext.12
0.0036% +1 231 456 -7892
0.0036% +21 10 300.4156
0.0036% +23 456 78912 3415
0.0036% 23 1 45 67 89 23
0.0036% +23 40 567008-0
0.0036% +23045110 6789
0.0036% +23 4 567 0891ext 1121
0.0036% +00 1234 56
0.0036% +2-03-4516-7819
0.0036% + 21 30 41 15 678
0.0036% +23 1 01 40 51 67 00
0.0036% +02-30456171
0.0036% +21 3 456 70 819
0.0036% 2.34E+15
0.0036% +1 (234) 567 8923
0.0036% +023 04 5110 6781
0.0036% 234-506-789
0.0036% +23 40 5678-0
0.0036% (203)-456-7892
0.0036% +21 10 34 56781
0.0036% +203 456-1718
0.0036% +234 567 8923 ext 456
0.0036% +234 50 67819
0.0036% +234 51 678923 ext 456
0.0036% +234567 8000
0.0036% +23 456 7892000 ext.3405
0.0036% +23 4567 8192-3456
0.0036% +23 45 678- 92030
0.0036% +234 51 671 891
0.0036% +230 4 5116 07108
0.0036% +23 4 516 7801 ext. 9213
0.0036% +23 141 506 7000 ext.8019
0.0036% +23 0 040 5678 9231
0.0036% +23 1 45.67.80.92
0.0036% + 23 1456 71 8902
0.0036% + 21 345 67 8923
0.0036% +23-40-5678 9234
0.0036% +23 1 45 67
0.0036% +23 45 6789123 ext. 100
0.0036% +2130411
0.0036% +23 040 56789234
0.0036% +2 0310 456078
0.0036% 203-456-XXXX
0.0036% +23 1 4560 7892^03^
0.0036% +2 01
0.0024% +23 405678-0923
0.0024% 0021 10 304 5678
0.0024% +23 41 567 0800 ext. 901
0.0024% +231 41 511 16010
0.0024% +23 014560 100
0.0024% +21 3 4500 6001 ext. 7892
0.0024% + 2345
0.0024% +23 141 567 8902 - 3045
0.0024% +21 030 405 6780/9234
0.0024% +2 034 567 89 23
0.0024% +1 234-501 6789
0.0024% +23 141 560 ....
0.0024% 23 1 4567 8101
0.0024% +23 4 5678921 0345
0.0024% +234 5 67892 ext 3456
0.0024% + 1 213 456-1007
0.0024% +23 1456 781921 1341
0.0024% + 23 405678 9234
0.0024% +23 40 516 789-23
0.0024% +23 4 5678 ....
0.0024% +20-3-456-7189
0.0024% 21 3456789234
0.0024% +234 11 5617890 ext 23456
0.0024% +203 456 1700 8923
0.0024% +21.10.345 6780
0.0024% +23.1456.789023
0.0024% 231-456
0.0024% EXT 1213
0.0024% 0023 141 567 8923
0.0024% +23 40 567 89 120
0.0024% +2034 50 1167
0.0024% + 2130456 7811
0.0024% +23 045110 ext. 6789
0.0024% +23-1-450617
0.0024% 234-5-670-0008
0.0024% +23 1 4567800 ext. -9234
0.0024% +231-415116708
0.0024% +23 4 5671 89.23
0.0024% +21311045 607
0.0024% +23-1456-781-923
0.0024% +2 011 34 56 780 9203
0.0024% +21 10 - 345 6071
0.0024% (213) 451- 6078
0.0024% +23 40 1567 892340
0.0024% 234 56 711892
0.0024% + 231 41 5116789
0.0024% +23 4 56 7800
0.0024% 0230 4501 106
0.0024% +23 4 516 780
0.0024% 203-4516-7809
0.0024% +23 40 5678....
0.0024% +20 34 -567892
0.0024% +2 34516 78923
0.0024% +21.10.341.5678
0.0024% 23-1-456-789002
0.0024% +20 34 - 567 892
0.0024% 234-506-789-230
0.0024% +21 - 30 - 405 6789
0.0024% 234-567-8921 x 3456 or 7809
0.0024% +21 30 405 -6789
0.0024% +23 40-5678-9234
0.0024% 2345-6708
0.0024% +2-03-4516789
0.0024% +23 4 56 78 92345
0.0024% + 2341 506 7819
0.0024% +23 45607 892 10
0.0024% 2345678912x11
0.0024% +23 4 567-0891 ext 2134
0.0024% +23 40 56789-0
0.0024% +23 411 561-1789
0.0024% +234 1 5601700 ext. 89200
0.0024% 203) 456-7892
0.0024% +23 40 56 107 890
0.0024% +21 0304156 789
0.0024% + 21 103045678
0.0024% +23-405-6789234
0.0024% +234 511161-789
0.0024% +234 1 567 8902 ext 34
0.0024% +2034-567809
0.0024% +234 51 678191 ext. 2311
0.0024% 234 5 601789
0.0024% +0200-3045670
0.0024% +23-40-56708-921
0.0024% ++ 234 1 5167 809
0.0024% +231 45 6789123-4 ext. 5016
0.0024% 231.456-7892
0.0024% +0021 30 405 6789
0.0024% +23.451.6700118
0.0024% +234 5678 923 456
0.0024% 203451
0.0024% +21 30 450 6789-2034
0.0024% +21 30 4115678-9234567
0.0024% 234 1 5067892
0.0024% +234-1-5067 891
0.0024% +23 4567....
0.0024% +23-10-45067801 ext. 9200
0.0024% + 234 1 506 7891
0.0024% 231-456- 7892
0.0024% +23 10 45067801 9234
0.0024% + 21 3 45 167 892
0.0024% +23 (0) 1456 781923
0.0024% +23 04 5110.1 ext. 16789
0.0024% +23 04-51106789
0.0024% 234-5-678902
0.0024% + 213-456-7892
0.0024% +1 213 456 78923
0.0024% +21 - 30 - 411 56 17
0.0024% +234-1-506-7819
0.0024% +23 04 5110 ext. .6789
0.0024% +23 4 56 78 9
0.0024% 010230456017892
0.0024% + 23 45678923456
0.0024% +20 3-4516 7811
0.0024% +23 141 506 ext.7089
0.0024% 230-456--1789
0.0024% +2 0310
0.0024% +23 045 67809123 ext.4015
0.0024% +23 451 671-1
0.0024% 01234-567892
0.0024% 02-30456117
0.0024% 0234567-8920
0.0024% +23-405-1607 892
0.0024% +21 345 607809 ext. 02345
0.0024% +23 405 1607 0
0.0024% +21-304561789
0.0024% +23 4561 100-0
0.0024% +230405 678 9234
0.0024% +23 1 45678191 ext. 2131
0.0024% 203-456-7811 x -9234
0.0024% +2 01 3041
0.0024% +23 45 6178 92
0.0024% +23 145 67 08 923
0.0024% +23 141 560 7892-3450
0.0024% (203) 456-
0.0024% +21 030-411 5678
0.0024% +23 1456 789230...
0.0024% +21-30-405.6789
0.0024% +23 141 506 ext. 7189
0.0024% +23 4 516 0
0.0024% +21 3456
0.0024% +23 (0)40 5617
0.0024% 234-567-8921 ext 3456
0.0024% +23 4 567 0891 ext
0.0024% +203 456 0 1708
0.0024% +23
0.0024% 234-567-8902 ext103
0.0024% +1
0.0024% (203) 456-7189X 12
0.0024% 213-451-1161 ext 7
0.0024% +234 506708 ext
0.0024% + 23 011 4561708
0.0024% +21 10
0.0024% 2345678011Ext9002
0.0024% 2134567892ex13
0.0012% +23 40 561070 - 819
0.0012% +23 4567-89230
0.0012% +23 141 506 7000 x 8923
0.0012% +1 203-456-1789 X234
0.0012% +23 141 567 ext 0
0.0012% +23 4156 789 - 234
0.0012% +2311 451-6178
0.0012% +231 4567892 ext. -
0.0012% +23 456 789200 345
0.0012% +23 0 0405 678 9234
0.0012% +23 1 405 678923
0.0012% + 23 1 451678
0.0012% 213- 456- 7800
0.0012% +23 04 5110(1) ext. 6718
0.0012% +203 - 45167892
0.0012% +23 1456 789...
0.0012% +231 41 5116...
0.0012% 01 234 560780
0.0012% +20 1 34 56178
0.0012% 0234-567811
0.0012% +23 1 4567 ext. 892
0.0012% +21 30 456 1780/9231
0.0012% 2034567819/2034567809
0.0012% +23 451 ????????
0.0012% 234-567- 8920 x13
0.0012% +23 04 5110 6789-+23 04 50678 9234
0.0012% 2-345
0.0012% +23 4 567 0892 ext345
0.0012% +234 51 678000-9
0.0012% 0200 34156178
0.0012% +1 -213-456-0780
0.0012% +21 0 30 456
0.0012% +231456-718923
0.0012% +1 203-456-7809.
0.0012% +0021 0304 567 8923
0.0012% 213 - 456 - 7819
0.0012% +23 456
0.0012% +23 4567 89-0
0.0012% +23 4 56 78 92
0.0012% 213 456-7108-
0.0012% +23 4 567 8921/+31 45 678 912
0.0012% (234) 567-8912X113
0.0012% +23405678-9230
0.0012% +2 011 34 506 780 9234
0.0012% +1 230 456 78
0.0012% 01.23.45.16.71.
0.0012% +234 5 600 7892 ext. 134
0.0012% (234) 567-8921 x 3456
0.0012% +23 141 567....
0.0012% +1 231 4 561789
0.0012% + 23 40 567 08 910
0.0012% +23 40- 5006 7819
0.0012% + 23 1 +4567800
0.0012% +231 456
0.0012% +23 41- 567892
0.0012% _203 451 6789
0.0012% +231415067000 ext.8923
0.0012% +21 134 56....
0.0012% +234 51 6701111 ext. 1891
0.0012% +23 40 5678 -
0.0012% +21-345-678 192
0.0012% +1 234 501-
0.0012% +0023-40-56789230
0.0012% +203 4 561171 ext 1
0.0012% +203- 4516 7892
0.0012% +1 231 456 0000 ext 111
0.0012% +21 (0)345 678 923
0.0012% +23 0145 678911 ext. 234
0.0012% +011 23 0 405 670 8923
0.0012% +234 5 610789-2
0.0012% +234 567 8923456
0.0012% +23 01456 - 78- 9102
0.0012% +21 134 56
0.0012% +234-1-5067892
0.0012% 0023 40 5678 9231
0.0012% 231 456
0.0012% +203 4156701- 18
0.0012% +23 40 5678-902
0.0012% 23- 145-678-1923
0.0012% +23 (0) 141 567 8092
0.0012% +23 141 567 8923 - 4567 - 8923
0.0012% +21 345 678-923
0.0012% +2 031 0
0.0012% +2-034-5678920
0.0012% 0023145678923
0.0012% +23 45 607-89230
0.0012% 2345678092*103
0.0012% 23 0 400 516 789
0.0012% +23 40 567892 0
0.0012% +20 1 010-3041567
0.0012% +234 5 678 - 9234, 5678 or 9234 ext 56
0.0012% 23-450-6789-213
0.0012% +234 5 610789/2
0.0012% 011-231-45 678-1923
0.0012% +20 1 304151617
0.0012% +23 1 456 01 70 11
0.0012% +00 23 1 45678923
0.0012% +1-231-456 7891
0.0012% +231 415 678 92
0.0012% +23 0 1405 678923
0.0012% +23 4567 89 2345-6789
0.0012% +23- 40 5678 9234
0.0012% + 23 01456 71 8923
0.0012% +1 23 40 5678 9213
0.0012% +01 234 501 xxxx
0.0012% +21-345-67892
0.0012% +1 213 451 1678^09^
0.0012% +2301456 71 8923
0.0012% + 234 5670181 ext.
0.0012% +21 345 - 6 78902
0.0012% (234)567-8912 ext.113
0.0012% + 23 01456 789 234
0.0012% +234 05678923405
0.0012% +21 30456
0.0012% +234 1-5601700 -
0.0012% +21 101-3456-789
0.0012% +1 203 456 7892 Gondor City
0.0012% +21 11 3456178 ext. 9 2 3
0.0012% +20 1 3456 780-1
0.0012% +234 51 670 1111ext. 1892
0.0012% +23 4 56 78 9234 5670
0.0012% 0234 50678 9203
0.0012% +23 01 450678, 092314561
0.0012% +00231 451 678 092
0.0012% +20-34 567892
0.0012% 21 030405 6710
0.0012% +23 40 560 710-0
0.0012% +231 45-67-1892
0.0012% +231 4 567892 ext. 34567
0.0012% +23 4 567.00.80
0.0012% + 234 15 670891
0.0012% + 23 4 511 161 708
0.0012% 0023-45678923
0.0012% +23141 5067892
0.0012% 2134567118;9123415678
0.0012% 23 40 5678 0
0.0012% 010 230456017892
0.0012% +23 40 56789-0 Ext. 2 311
0.0012% + 23 04 51 678 9234
0.0012% +21-30.4567081
0.0012% 234-567-0892 (Pager)
0.0012% +23 45678923 4567
0.0012% +23.04.5110.6789
0.0012% +21 - 3 - 45 16 71 80
0.0012% + 23456078
0.0012% +01 234-5678
0.0012% ++234 1 5167 809
0.0012% +21 - 3- 11045678
0.0012% Sorgum 010 200 3(101)
0.0012% +23 1456 789230 ext 4105
0.0012% 21 10 341 5670
0.0012% + 234 506 78 9234
0.0012% 230/456/7819
0.0012% +201 3 40506 ext. 0708
0.0012% Folley: 231-4516
0.0012% 00231 4 5670892
0.0012% +23(0)1415678923
0.0012% +23 1 45 60 - 78 90
0.0012% +234 56-7892
0.0012% 1123,1145
0.0012% 12134516789Ext213
0.0012% 200-345-6780x9023
0.0012% +1 (203)456 7809
0.0012% 2345 ou 6789
0.0012% +23 45601 789 0
0.0012% 0021 0 30411 5678
0.0012% +21 11 345 6178-9-2
0.0012% +2 0345 0 67892
0.0012% +1 234-567 1891 x2341
0.0012% + 21304115678
0.0012% +23 40- 567 08 923
0.0012% +23 1 45 67 800 ext. 923
0.0012% +234 5607....
0.0012% +23 4056789 0
0.0012% 234.567.8902x103
0.0012% +21 - 30 - 456-7189
0.0012% +23-405678-9234
0.0012% + 203 4516-7810
0.0012% +23-4567-8923
0.0012% +23 0 141 5678923
0.0012% +231 4516 718
0.0012% +21(0)30 456 1789
0.0012% +21(0)3045 67890
0.0012% +201 34 51 670
0.0012% +23 40 56789 0
0.0012% 234 506 789 213
0.0012% 010-23045601
0.0012% +002310 45067801-9213
0.0012% +23-405-678 ext. 9012
0.0012% 234/567-8912ext34
0.0012% +23 041 56789234-0
0.0012% +23 4561 7 892
0.0012% +23 0456718-0
0.0012% +23-140115
0.0012% 00 234 511161708
0.0012% +23 4156 789 02
0.0012% +23 4 05 67892300
0.0012% +21345-678192
0.0012% +234 567 892 340 5
0.0012% +23 0405678 9234
0.0012% +230 45 678-9231
0.0012% +234 1 5678190-23 Ext. 4005
0.0012% +23 41 567891/2
0.0012% + 234567800
0.0012% +0023 0141-567-8912
0.0012% +20.3. 4516780
0.0012% 2345670890 x 12
0.0012% +234 (0)1 5678902
0.0012% +23 4567 890-2345
0.0012% +1213-456-7189
0.0012% +23 40 56107 08920
0.0012% +23 1 456 10 ext 7892
0.0012% +230 4 560 78 923
0.0012% + 23 1 45 67 800
0.0012% +1 213-4567181
0.0012% +00213 041 56789
0.0012% +234 5678923 405
0.0012% 230 456 7800 ext.923
0.0012% +21 30 45
0.0012% 213-451-6789 / 231-456-7892
0.0012% +2311 451 - 6171
0.0012% 234 561 78923
0.0012% +23 40 5678923 ext.1415
0.0012% +23 04 50678.912
0.0012% 234 567 8923 ext. 405
0.0012% +234-51-6789234
0.0012% +2-034-567-89-21
0.0012% +21 345 67 8923-1
0.0012% +23 1 415 67809 2345
0.0012% +23 4 567 89 21 ext. 345
0.0012% + 23 0145678923
0.0012% 02030456789HAL
0.0012% 234-567-8111 x.111
0.0012% +23 1456 78 9 234
0.0012% +23 04 506789.1
0.0012% 234.516.7080 x923
0.0012% +2 03 4516-7892
0.0012% +234 51 670 1111 ext. 1890
0.0012% +23 04 506789..
0.0012% + 23 1 451678923
0.0012% +20 1 3456...
0.0012% +21 345 678 910 2
0.0012% +234 1 56781902 3405
0.0012% +21 10 300 4156 7118
0.0012% 203.450678
0.0012% +23 4567xxxx
0.0012% +2 01 3456 178
0.0012% +2 314 567 1819 ext. 23
0.0012% +2 034 56 7189
0.0012% 01 234 510 670
0.0012% +234 1 56 78092
0.0012% +23 40560017-0
0.0012% +23 1 4560700 ext. 89
0.0012% + 2 03 451 6789
0.0012% +21-345 678 923
0.0012% +23-1-40115
0.0012% +23 451 67892-0
0.0012% 234.567.8921 x3456
0.0012% +1 23 45 670 890
0.0012% +23.01.45.67.89.21
0.0012% 1-203-456-01781
0.0012% +23 4 5607892-31
0.0012% +23 0451101-6781
0.0012% +23 141 567 8192 - 3405 - 678 9123
0.0012% +234 50 67 89
0.0012% + 23 415678920 ext. 3456
0.0012% +23.4567.89020
0.0012% +23 450 - 678 09
0.0012% +231045067801--9231
0.0012% +203 456 78923/40
0.0012% +21-3-4500-6017
0.0012% +23 1 4567 ext 8923
0.0012% 23 45 678 000
0.0012% 234-567-8902-103
0.0012% +23 4567.8923
0.0012% +23 045 6780 9123ext. 4105
0.0012% +20 10 3456700 - 1
0.0012% 234-567-8923x405
0.0012% +21 345 6-78923
0.0012% +203 41506 789
0.0012% +23 0141-560-7892
0.0012% + 23 0 4056789234
0.0012% +23 11 4567 0892 ext 103
0.0012% 011-23-1456-789001
0.0012% (234) 51 67 89 12
0.0012% +1 213 456 7000 ext 8902 (Baz)
0.0012% 00230145678923
0.0012% 230-456-7891, x-23.
0.0012% +23.405 678 9234
0.0012% +1 234 567 0891 ext-213
0.0012% +230-415-678921
0.0012% +23 4156 789 2-1
0.0012% +23 04506 7892-13
0.0012% (0)1234 567892
0.0012% +23 04 5110 . 6718
0.0012% +2034516 ext 7892
0.0012% +23 4 567 18920
0.0012% 234 05 00
0.0012% +234 0506 78 9213
0.0012% +20 3-4516071
0.0012% +23 40 56708 9-123
0.0012% +23 4567 89 -2345
0.0012% +203 4156701 - 08
0.0012% 23 0141 560 1789
0.0012% +231 4567892 ext -
0.0012% +1 234 567 8912 1-304-567-8092
0.0012% +1 23 45 670 8901
0.0012% +23 (0)-45-67-08-92
0.0012% +201 3 40506 ext 789
0.0012% 234-
0.0012% +2034516-7892
0.0012% +23 04 5110.1 ext.6789
0.0012% +23 (0)4561 178 1921
0.0012% +1 231 -456 7891
0.0012% +21-30-4156 789
0.0012% 012 345678
0.0012% +234 56 789200-13
0.0012% +234 1 5678190-2 Ext. 3405
0.0012% 0200 345 678
0.0012% +2 03-451 6789
0.0012% +21 30 405 67892 ext 3456
0.0012% +0023 41567 89231
0.0012% +23 40 5 6 07 10 80
0.0012% 23 01456 789 200
0.0012% +1 203 456 7819-2345
0.0012% + 23
0.0012% +2340-5678-9234
0.0012% + 011 23 1456 7892 31
0.0012% (23) 456-789-2345
0.0012% +21 3 4567 89 0
0.0012% +2340-5678 9234
0.0012% 2345-
0.0012% 0203045-6789, -2345
0.0012% + 23 4 56 78
0.0012% +23 04 506789.1.
0.0012% +23 04 506789.23
0.0012% +230405678 9234
0.0012% +203 456 1700 -892
0.0012% +23 04 5110.1
0.0012% +1 203 456 7
0.0012% +2 314 50 67 089
0.0012% (234) 516-000
0.0012% + 234 567 0891 ext. 2345
0.0012% +23-451-60078-19
0.0012% +23 01456 78 92 34
0.0012% 2034567890/EXT112
0.0012% +23 045-67809234
0.0012% +2-03-45167891
0.0012% +23 41 5678...
0.0012% +23 1 4516....
0.0012% +234 56 789.
0.0012% +23 45678921 ext 103
0.0012% +23 456 78 92 03 / 41 5670 819
0.0012% 0021 (0) 30 456 1781
0.0012% +21.30.405.67.80
0.0012% +23 0 4056789023
0.0012% +21 34 5678 ext 9234
0.0012% +1 213 456- 7892
0.0012% 00234 5 678912
0.0012% +21 (0)30 405 6780/9234
0.0012% +23 4567 8091\\23
0.0012% +2301456 718920
0.0012% 234-567-0890ext12
0.0012% +23 01045067801 ext.9021
0.0012% 203-456-7811x 9230
0.0012% +23-40-516789-12
0.0012% 210-345-6170q
0.0012% +23 45.67.80.00
0.0012% +23 1 451161 ext708
0.0012% +0 1 231-456-7892
0.0012% 231 4567-8923
0.0012% + 234 567 0891 ext 1213
0.0012% + 23 01405 67890
0.0012% +234567 89 2345
0.0012% 2130456 7892
0.0012% +20 34 - 567892
0.0012% +1 213 456 ????
0.0012% 213-405-1678 pgr
0.0012% 12134567108ext92
0.0012% +23 04 5110.1 ext. 6789
0.0012% +1 EMAIL ONLY
0.0012% +23 4516-780
0.0012% 213-450-6708E x 19
0.0012% +23.04.5110. 6789
0.0012% +23 04567 189 2345
0.0012% [213]4567811
0.0012% +2-034 567 8920
0.0012% +2310-45067801-9231
0.0012% +1 23 4 567 8092
0.0012% 213.-451-6789
0.0012% +23-10-4506-7892 ext. 3045
0.0012% +23 41 5678901-2
0.0012% + 21-30-4567800
0.0012% (234) 567-8192 EXT 3
0.0012% +234 567 8900 ext. 234
0.0012% +23 145.67.89.02
0.0012% +23 4567 8920\\31\\45
0.0012% +23 4 5670891 ext.
0.0012% 231- 456-
0.0012% 213-451-6000x7
0.0012% 21-345-67-8923
0.0012% 23 04 51 678 9234
0.0012% +23 4 5678921 ext 345
0.0012% + 2340 5678 9234
0.0012% +20 1 3045100 ext 67
0.0012% +2 03-4567 8923
0.0012% "+1 213 456 7892"
0.0012% + 00 23 1456 718 092
0.0012% +21 30 - 456 78 92
0.0012% +234 567891 ext. 21-3
0.0012% 23
0.0012% 203-456-7800x92304
0.0012% 23 1456 781923
0.0012% 231 41 5116781
0.0012% +1 213 456 XXXX
0.0012% 231 456-0789 ext. 10
0.0012% 01-23-45-60-78
0.0012% +23 - 405 - 678 9123
0.0012% 213-451-1161 ext. 7
0.0012% +23 4567- 819230
0.0012% +23 141 560 7089 <bk>
0.0012% ++ 20 3456
0.0012% 2-34 567892
0.0012% 23-040-5678-9023
0.0012% +011-23-45-670-0892
0.0012% +234 5 67809023 ext.451
0.0012% +23 45 671 89-2
0.0012% +23 4561 789 23 4
0.0012% 231- 456-0789 12
0.0012% 0021 10 34 56 718
0.0012% (213) 451-6789 ext. 230
0.0012% +23 4156 789-231
0.0012% '+21 30 405 6789
0.0012% + 2 034 567181
0.0012% +20-3-4516 7892
0.0012% +234 5 - 6780 9023
0.0012% +23 4151 6178-10
0.0012% ++23 141 567 8019
0.0012% +21xx xxx xxxx
0.0012% +234 156 780-092
0.0012% +213456-78912
0.0012% +1 23456789023
0.0012% +23 04 500161 ext. 789
0.0012% +203- 456 7189
0.0012% +23141-567 8923
0.0012% ++23-0451106789
0.0012% 213-451-0001-
0.0012% +21 30 411 ....
0.0012% +23 405 160718 0
0.0012% +23 - 04 - 50678 910
0.0012% 1-200-340-5678 ext-912
0.0012% 0021-10-300 4567
0.0012% +23 45 67.81.92
0.0012% +0023 04 5110 6708
0.0012% +23-0-456789231
0.0012% +234 1 5601700-18
0.0012% +23.45.67.89.23
0.0012% +21-345-6 78110
0.0012% +23 45 67 089
0.0012% +20 314 561 7000 8192
0.0012% 2314 5601 7809
0.0012% +20-34- 567800
0.0012% +23 451 678 9234, 5678
0.0012% +23 4156 78912 0
0.0012% (203) 456-7891 EXT.12
0.0012% (213 ) 456-7819
0.0012% 2034567892Hobson/3456789203/PogI
0.0012% +23.1.45.67.89.23
0.0012% +23 4567 89 -234
0.0012% +234 5
0.0012% +23 4-567 8100
0.0012% +23 0141- 567-8923
0.0012% +23 1456 78....
0.0012% +201 3 40506
0.0012% +23 456789231 4567
0.0012% +23 - 4105 -617 - 189
0.0012% + 21010 341 5016
0.0012% +0021134 56 10 78
0.0012% +1 231- 456 -7189
0.0012% 203-456-7809 - Bobco
0.0012% +20 34 56 7892 1
0.0012% +23 45 6789200 301
0.0012% +23 1456 789 23456
0.0012% Ext. 2310
0.0012% +23 1 45678
0.0012% +21 3 4567 1 1
0.0012% +23.40.56789234
0.0012% +2-1304
0.0012% +23 40 567892 3456
0.0012% +2340-56789234
0.0012% +23-040-560789-21
0.0012% +23 41567189 21345000
0.0012% +011 23 141 567-8901
0.0012% +23 040 5678-9234
0.0012% +21 345 607809 ext 02345
0.0012% 234 51670018
0.0012% +2 010 3456 789
0.0012% +23 456789 231ext. 4567
0.0012% +21 30 411 5678-9
0.0012% +2 03-4516-7892
0.0012% 21 10 341
0.0012% + 20 345 678 92 00
0.0012% (pager) 203-401-5678
0.0012% +23 4567 89 0
0.0012% (0121) 3456781
0.0012% +23-45678-92310
0.0012% +203 450 6111 ext 78
0.0012% + 23 4 567 0891 ext 1234
0.0012% +23 4 56 11 ext. 7080
0.0012% 23 040 5678 9234
0.0012% 213/ 456-7892
0.0012% +00 21 30 411- 5671
0.0012% +23 40 5678..9234..
0.0012% +23 4567 8923-40567
0.0012% +1 231- 456- 7892
0.0012% + 23 1 451161 Ext 789
0.0012% +203- 4516789
0.0012% +234 5 06 78 9234
0.0012% 231 456- 7819
0.0012% +23 141 560 7800ext 9230
0.0012% +1 23 45 1167 89
0.0012% +20 3 4516789 2
0.0012% +23 45 607 89 2
0.0012% 0231 0101 45
0.0012% (213456-7892
0.0012% 0231 45670
0.0012% +-23 -405-6789234
0.0012% +230 4 56 078 923
0.0012% + 23 01456 789002
0.0012% +23 - 4567 - 81 923
0.0012% 234-567-8921 x
0.0012% 1-200-345-6780x 9023
0.0012% 020 31 14500
0.0012% ext:2-3104
0.0012% +23.04.5110.1
0.0012% 011-23-41-567891
0.0012% +23 045 67809123 ext. 4110
0.0012% +23-0141-567-8921
0.0012% +21 0 30 405 67 89
0.0012% + 23 (0) 405 678 1912
0.0012% + 21 10 341
0.0012% +23 41 5670 0 892
0.0012% + 203 - 451-6789
0.0012% 23 45 6700000
0.0012% +02-341 5678
0.0012% +23-4-567 8921
0.0012% 2 3456 708921
0.0012% +234567892 13
0.0012% +23-45-61-78-92
0.0012% +1 (203) xxx xxxx
0.0012% 23 01456 781923
0.0012% +1 234 567 8902-110
0.0012% +2110 34 56 718
0.0012% +23 11 451 - 6789
0.0012% +23 04 51101.6718
0.0012% 21-30-41-11-506
0.0012% +21 345 67 89231 4
0.0012% .213.451.6178
0.0012% +234 1-5601700
0.0012% 00 23 1 45 67 80 19
0.0012% +234 567 89 23456
0.0012% +21 10 134 561078
0.0012% +2310450678019231
0.0012% +20.3.451-6789
0.0012% +21 01-34-56-78-92
0.0012% +23 4156 0
0.0012% +23 456789 234 ext. 5617
0.0012% +23 1 4567ext. 8009
0.0012% +23 (0) 40 - 56 78 91 20
0.0012% +1 213 456 7089 (x203)
0.0012% + 23 0 141 567 8923
0.0012% +23 4 5670891 ext
0.0012% 02.34.56.78.19
0.0012% + 2-3456-78-9102
0.0012% 1-200-340-5678 ext.912
0.0012% +23 40 506 1708 921
0.0012% + 23 40 516789-23
0.0012% +234 51 11 61 ext. 780
0.0012% +23 40 56789- 234
0.0012% +2- 3 4516 7809
0.0012% +2310 45067801-9234
0.0012% +23 1 4567800 923
0.0012% + 23 1 45610 ext. 7891
0.0012% +23 45607892 ext. 03401
0.0012% + 0023145671809
0.0012% 00 234516700809
0.0012% +2345 6007892 ext
0.0012% + 21 30 4156 780
0.0012% +23 -141-567-8091
0.0012% +234 5 67892 3415
0.0012% + 23 1456 718923 ext 4150
0.0012% +23 1456 789211 ....
0.0012% +23 4 5678923 ext -456
0.0012% +23 1456-780-923
0.0012% 213-451-6789 xt. 213
0.0012% +23 40 561 07 - 892
0.0012% +234-51-671891
0.0012% +23 4 567 0891 ext.
0.0012% +23 4 56078 91 23
0.0012% 203-456-7180 x-902
0.0012% +23 45 6789 2034 -1156
0.0012% +23 451 67892345/67890234/56789234 ext.1510
0.0012% +23 041 516101718
0.0012% + 02345 607 892
0.0012% +203 4 561171-89
0.0012% +21-(0)30-4111506
0.0012% +23 40 506 7 -8 923
0.0012% +1 234 567 8090 x 234
0.0012% +1 203 456 7892 341 567 8921
0.0012% 234 56
0.0012% +21134 56 7892
0.0012% +2130 4156 780
0.0012% 234-5167-Hexnet
0.0012% + 231 4 5067 892
0.0012% +23 410 56107 892
0.0012% +21 300 - 4056780
0.0012% +23 040 56 00 78 92
0.0012% 0213 410 0
0.0012% +23-40-5678-912
0.0012% + 21 0 10 341 5016
0.0012% +234 506 78 -9230
0.0012% 203-456-7892????????????????????
0.0012% + 23 4567 89 234
0.0012% +23 405 ?????????
0.0012% 011 21 34 567 8923
0.0012% +0121 3456000 EX 0718
0.0012% +23 45678 ?????
0.0012% +234 5 678 0-91
0.0012% +2 0
0.0012% +23 40 5678 ....
0.0012% +23 415 678 900 2340
0.0012% 213 456 7890 x2304
0.0012% +231 4 56789 110 ext. 112
0.0012% 2310 450678019234
0.0012% +21 341-5-67809
Re: Beast of the Number: Parsing the Feral Phone
by demerphq (Chancellor) on Apr 17, 2002 at 16:21 UTC
Big time ++ dude!
A couple of quick comments before I start trying to run your code against the 10 million German CLIs (call line identifiers) that I have access to, and the 100k or so UK numbers that are on hand as well.
Regarding parsing extensions: in some countries (like Germany) you aren't allowed to have extensions. I believe this is due to the authorities needing to be able to uniquely identify the location of every handset in the country. This of course means that if you can find the list of countries that have such a law, you can simplify the logic of parsing out extensions.
Regarding number formats, I believe that you can take advantage of the +1 code. All of these numbers are in a 3-3-4 pattern (with optional extension), so they should be easy to parse. OTOH Germany uses a floating format: anywhere from 6 digits (maybe fewer!) for a local number up to the full-blown 14 digits (including the +, country code and area code) of my own phone number, and they can get longer.
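To make the 3-3-4 idea concrete, here is a minimal sketch, not the code from the root node. The sub name `parse_nanp` and the handful of extension spellings it accepts (x, ex, ext, ext.) are my own assumptions, and it only copes with the tidier +1 entries in the data set.

```perl
use strict;
use warnings;

# Hypothetical helper: split a +1 style number into area code, exchange,
# line and optional extension. Returns a hashref, or nothing on failure.
sub parse_nanp {
    my ($raw) = @_;

    # Peel off a trailing extension first (x12, ex 12, ext. 12, EXT12 ...).
    my $ext;
    if ($raw =~ s/\s*(?:ext?\.?|x)\s*(\d+)\s*$//i) {
        $ext = $1;
    }

    # Throw away everything that is not a digit.
    (my $digits = $raw) =~ tr/0-9//cd;

    # Drop the leading country code when all 11 digits are present.
    $digits =~ s/^1// if length($digits) == 11;

    # What remains must be exactly 3-3-4.
    return unless $digits =~ /^(\d{3})(\d{3})(\d{4})$/;
    return { area => $1, exchange => $2, line => $3, ext => $ext };
}

my $n = parse_nanp('234-567-8921 x3415');
print "$n->{area} $n->{exchange} $n->{line} ext $n->{ext}\n" if $n;
```

Anything that does not reduce to exactly ten digits is rejected rather than guessed at, which seems the safer default for data this noisy.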
Which brings me to area codes. These are (or should be) easy to parse in the +1 area. But there's no way to do so in a country that uses floating-length area codes (like Germany, with 2-5 digit area codes) short of knowing the full list for that country. Of course that's not really feasible considering that Germany alone has 5226 of them... (I know; I converted the DTAG list into the AOC data used on our switches...) (Actually I've always thought it interesting that Germany has so many, but the entire NA uses fewer than a thousand. I guess that's why extensions are so common in NA, in order to work around the (currently) antiquated telecoms industry that is the result of NA's early lead in the area.)
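Given such a list, splitting off a floating-length area code is just a prefix lookup. A rough sketch follows, assuming the number has already been reduced to national digits (no country code, no leading trunk 0). The sub name `split_area_code` is made up for illustration, and the hash holds only a few sample codes, not the full list.

```perl
use strict;
use warnings;

# Tiny sample only; a real table would be loaded from the full national list.
my %area_code = map { $_ => 1 } qw(30 40 211 221 2041 33051);

# Hypothetical helper: split a national number (digits only, trunk 0 removed)
# into (area code, subscriber number). Returns an empty list if no code fits.
sub split_area_code {
    my ($national) = @_;

    # Area codes run from 2 to 5 digits; try the longest candidate first.
    for my $len (reverse 2 .. 5) {
        next if length($national) <= $len;    # must leave a subscriber part
        my $prefix = substr($national, 0, $len);
        return ($prefix, substr($national, $len)) if $area_code{$prefix};
    }
    return;
}

my ($area, $subscriber) = split_area_code('2111234567');
print "$area / $subscriber\n" if defined $area;
```

Trying the longest prefix first is only a precaution: if no code in the list is a prefix of another, the order of the attempts makes no difference.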
Anyway, these are just quick off-the-cuff comments. A node this big and serious will need a lot more time for thought.
Big ++ once again!
Oh, btw, here's a list of the German area codes in ranged form (i.e. 2051-2054 means 2051, 2052, 2053, 2054).
:-)
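Ranged entries like these can be expanded mechanically before use. A small sketch, with the sub name `expand_zones` invented for the example and the three sample entries taken from the start of the list:

```perl
use strict;
use warnings;

# Hypothetical helper: expand ranged entries ("2051-2054") into individual
# codes and pass single codes through unchanged.
sub expand_zones {
    my @out;
    for my $entry (@_) {
        if ($entry =~ /^(\d+)-(\d+)$/) {
            # Numeric range; safe here because the codes carry no leading zero.
            push @out, $1 .. $2;
        }
        else {
            push @out, $entry;
        }
    }
    return @out;
}

my @codes = expand_zones(qw(201-203 2041 2051-2054));
print "@codes\n";    # 201 202 203 2041 2051 2052 2053 2054
```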
<super>
my @zones=qw( 201-203 2041 2043 2045 2051-2054 2056 2058 2064-2066 208-209 2102-2104 211 2120-2129 2131-2133
2137 214 2150-2154 2156-2159 2161-2166 2171 2173-2175 2181-2183 2191-2193 2195-2196 2202-2208 221 2222-2228
2232-2238 2241-2248 2251-2257 2261-2269 2271-2275 228 2291-2297 2301-2309 231 2323-2325 2327 2330-2339 234 2351-2355
2357-2369 2371-2375 2377-2379 2381-2385 2387-2389 2391-2395 2401-2409 241 2421-2429 2431-2436 2440-2441 2443-2449
2451-2456 2461-2465 2471-2474 2482 2484-2486 2501-2502 2504-2509 251 2520-2529 2532-2536 2538 2541-2543 2545-2548
2551-2558 2561-2568 2571-2575 2581-2588 2590-2599 2601-2608 261 2620-2628 2630-2639 2641-2647 2651-2657 2661-2664
2666-2667 2671-2678 2680-2689 2691-2697 271 2721-2725 2732-2739 2741-2745 2747 2750-2755 2758-2759 2761-2764
2770-2779 2801-2804 281 2821-2828 2831-2839 2841-2845 2850-2853 2855-2859 2861-2867 2871-2874 2902-2905 291 2921-2925
2927-2928 2931-2935 2937-2938 2941-2945 2947-2948 2951-2955 2957-2958 2961-2964 2971-2975 2977 2981-2985 2991-2994
30 3301-3304 33051 33053-33056 3306-3307 33080 33082-33089 33093-33094 331 33200-33209 3321-3322 33230-33235
33237-33239 3327-3329 3331-3332 33331-33338 3334-3335 33361-33369 3337-3338 33393-33398 3341-3342 33432-33439 3344
33451-33452 33454 33456-33458 3346 33470 33472-33479 335 33601-33609 3361-3362 33631-33638 3364 33652-33657 3366
33671-33679 33701-33704 33708 3371-3372 33731-33734 33741-33748 3375 33760 33762-33769 3377-3379 3381-3382 33830-33839
33841 33843-33849 3385-3386 33870 33872-33878 3391 33920-33926 33928-33929 33931-33933 3394-3395 33962-33979 33981-33984
33986 33989 340-341 34202-34208 3421 34221-34224 3423 34241-34244 3425 34261-34263 34291-34299 3431 34321-34322
34324-34325 34327-34328 3433 34341-34348 3435 34361-34364 3437 34381-34386 3441 34422-34426 3443 34441 34443-34446
3445 34461-34467 3447-3448 34491-34498 345 34600-34607 34609 3461-3462 34632-34633 34635-34639 3464 34651-34654
34656 34658-34659 3466 34671-34673 34691-34692 3471 34721-34722 3473 34741-34743 34745-34746 3475-3476 34771-34776
34779 34781-34783 34785 34901 34903-34907 34909 3491 34920-34929 3493-3494 34953-34956 3496 34973 34975-34979
3501 35020-35028 35032-35033 3504 35052-35058 351 35200-35209 3521-3523 35240-35249 3525 35263-35268 3528-3529
3531 35322-35327 35329 3533 35341-35343 3535 35361-35365 3537 35383-35389 3541-3542 35433-35436 35439 3544 35451-35456
3546 35471-35478 355 35600-35609 3561-3564 35691-35698 3571 35722-35728 3573-3574 35751-35756 3576 35771-35775
3578 35792-35793 35795-35797 3581 35820 35822-35823 35825-35829 3583 35841-35844 3585-3586 35872-35877 3588 35891-35895
3591-3592 35930-35939 3594 35951-35955 3596 35971 35973-35975 3601 36020-36029 3603 36041-36043 3605-3606 36071-36072
36074-36077 36081-36085 36087 361 36200-36209 3621-3624 36252-36259 3628-3629 3631-3632 36330-36338 3634-3636
36370-36379 3641 36421-36428 3643-3644 36450-36454 36458-36459 36461-36465 3647 36481-36484 365 36601-36608 3661
36621-36626 36628 3663 36640 36642-36649 36651-36653 36691-36695 36701-36705 3671-3672 36730-36739 36741-36744
3675 36761-36762 36764 36766 3677 36781-36785 3679 3681-3683 36840-36849 3685-3686 36870-36871 36873-36875 36878
3691 36920-36929 3693 36940-36941 36943-36949 3695 36961-36969 371 37200 37202-37204 37206-37209 3721-3727 37291-37298
3731 37320-37329 3733 37341-37344 37346-37349 3735 37360-37369 3737 37381-37384 3741 37421-37423 37430-37439
3744-3745 37462-37465 37467-37468 375 37600-37609 3761-3765 3771-3774 37752 37754-37757 381 38201-38209 3821
38220-38229 38231-38234 38292-38297 38300-38309 3831 38320-38328 38331-38334 3834 38351-38356 3836 38370-38379
3838 38391-38393 3841 38422-38429 3843-3844 38450-38459 38461-38462 38464 38466 3847 38481-38486 38488 385 3860-3861
3863 3865-3869 3871 38720-38729 38731-38733 38735-38738 3874 38750-38759 3876-3877 38780-38785 38787-38789 38791-38794
38796-38797 3881 38821-38828 3883 38841-38845 38847-38848 38850-38856 38858-38859 3886 38871-38876 39000-39009
3901-3902 39030-39039 3904 39050-39059 39061-39062 3907 39080-39089 3909 391 39200-39209 3921 39221-39226 3923
39241-39248 3925 39262-39268 3928 39291-39298 3931 39320-39325 39327-39329 3933 39341-39349 3935 39361-39366
3937 39382-39384 39386-39409 3941 39421-39428 3943-3944 39451-39459 3946-3947 39481-39485 39487-39489 3949 395
39600-39608 3961-3969 3971 39721-39724 39726-39728 3973 39740-39749 39751-39754 3976 39771-39779 3981 39820-39829
39831-39833 3984 39851-39859 39861-39863 3987 39881-39889 3991 39921-39929 39931-39934 3994 39951-39957 39959
3996 39971-39973 39975-39978 3998 39991-39999 40 4101-4109 4120-4129 4131-4144 4146 4148-4149 4151-4156 4158-4159
4161-4169 4171-4189 4191-4195 4202-4209 421 4221-4224 4230-4249 4251-4258 4260-4269 4271-4277 4281-4289 4292-4298
4302-4303 4305 4307-4308 431 4320-4324 4326-4340 4342-4344 4346-4349 4351-4358 4361-4367 4371-4372 4381-4385
4392-4394 4401-4409 441 4421-4423 4425-4426 4431-4435 4441-4447 4451-4456 4458 4461-4469 4471-4475 4477-4489
4491-4499 4501-4506 4508-4509 451 4521-4529 4531-4537 4539 4541-4547 4550-4559 4561-4564 4602-4609 461 4621-4627
4630-4639 4641-4644 4646 4651 4661-4668 4671-4674 4681-4684 4702-4708 471 4721-4725 4731-4737 4740-4749 4751-4758
4761-4779 4791-4796 4802-4806 481 4821-4830 4832-4839 4841-4849 4851-4859 4861-4865 4871-4877 4881-4885 4892-4893
4902-4903 491 4920-4929 4931-4936 4938-4939 4941-4948 4950-4959 4961-4968 4971-4977 5021-5028 5031-5037 5041-5045
5051-5056 5060 5062-5069 5071-5074 5082-5086 5101-5103 5105 5108-5109 511 5121 5123 5126-5132 5135-5139 5141-5149
5151-5159 5161-5168 5171-5177 5181-5187 5190-5199 5201-5209 521 5221-5226 5228 5231-5238 5241-5242 5244-5248
5250-5255 5257-5259 5261-5266 5271-5278 5281-5286 5292-5295 5300-5309 531 5320-5329 5331-5337 5339 5341 5344-5347
5351-5358 5361-5368 5371-5379 5381-5384 5401-5407 5409 541 5421-5429 5431-5439 5441-5448 5451-5459 5461-5462
5464-5468 5471-5476 5481-5485 5491-5495 5502-5509 551 5520-5525 5527-5529 5531-5536 5541-5546 5551-5556 5561-5565
5571-5574 5582-5586 5592-5594 5601-5609 561 5621-5626 5631-5636 5641-5648 5650-5659 5661-5665 5671-5677 5681-5686
5691-5696 5702-5707 571 5721-5726 5731-5734 5741-5746 5751-5755 5761 5763-5769 5771-5777 5802-5808 581 5820-5829
5831-5846 5848-5855 5857-5859 5861-5865 5872-5875 5882-5883 5901-5909 591 5921-5926 5931-5937 5939 5941-5948
5951-5957 5961-5966 5971 5973 5975-5978 6002-6004 6007-6008 6020-6024 6026-6029 6031-6036 6039 6041-6059 6061-6063
6066 6068 6071 6073-6074 6078 6081-6087 6092-6096 6101-6109 611 6120 6122-6124 6126-6136 6138-6139 6142 6144-6147
6150-6152 6154-6155 6157-6159 6161-6167 6171-6175 6181-6188 6190 6192 6195-6196 6198 6201-6207 6209 6211-6218
62190-62199 6220-6224 6226-6229 6231-6239 6241-6247 6249 6251-6258 6261-6269 6271-6272 6274-6276 6281-6287 6291-6298
6301-6308 631 6321-6329 6331-6349 6351-6353 6355-6359 6361-6364 6371-6375 6381-6387 6391-6398 6400-6409 641 6420-6436
6438-6447 6449 6451-6458 6461-6462 6464-6468 6471-6479 6482-6486 6500-6509 651 6522-6527 6531-6536 6541-6545
6550-6559 6561-6569 6571-6575 6578 6580-6589 6591-6597 6599 661 6620-6631 6633-6639 6641-6648 6650-6661 6663-6670
6672-6678 6681-6684 6691-6698 6701 6703-6704 6706-6709 671 6721-6728 6731-6737 6741-6747 6751-6758 6761-6766
6771-6776 6781-6789 6802-6806 6809 681 6821 6824-6827 6831-6838 6841-6844 6848-6849 6851-6858 6861 6864-6869
6871-6876 6881 6887-6888 6893-6894 6897-6898 69 7021-7026 7031-7034 7041-7046 7051-7056 7062-7063 7066 7071-7073
7081-7085 711 7121-7136 7138-7139 7141-7148 7150-7154 7156-7159 7161-7166 7171-7176 7181-7184 7191-7195 7202-7204
721 7220-7229 7231-7237 7240 7242-7269 7271-7277 7300 7302-7309 731 7321-7329 7331-7337 7340 7343-7348 7351-7358
7361-7367 7371 7373-7376 7381-7389 7391-7395 7402-7404 741 7420 7422-7429 7431-7436 7440-7449 7451-7459 7461-7467
7471-7478 7482-7486 7502-7506 751 7520 7522 7524-7525 7527-7529 7531-7534 7541-7546 7551-7558 7561-7579 7581-7587
7602 761 7620-7629 7631-7636 7641-7646 7651-7657 7660-7669 7671-7676 7681-7685 7702-7709 771 7720-7729 7731-7736
7738-7739 7741-7748 7751 7753-7755 7761-7765 7771 7773-7775 7777 7802-7808 781 7821-7826 7831-7839 7841-7844
7851-7854 7903-7907 791 7930-7955 7957-7959 7961-7967 7971-7977 8020-8029 8031-8036 8038-8039 8041-8043 8045-8046
8051-8057 8061-8067 8071-8076 8081-8086 8091-8095 8102 8104-8106 811 8121-8124 8131 8133-8139 8141-8146 8151-8153
8157-8158 8161 8165-8168 8170-8171 8176-8179 8191-8196 8202-8208 821 8221-8226 8230-8234 8236-8239 8241 8243
8245-8254 8257-8259 8261-8263 8265-8269 8271-8274 8276 8281-8285 8291-8296 8302-8304 8306 831 8320-8338 8340-8349
8361-8370 8372-8389 8392-8395 8402-8407 841 8421-8424 8426-8427 8431-8435 8441-8446 8450 8452-8454 8456-8469
8501-8507 8509 851 8531-8538 8541-8558 8561-8565 8571-8574 8581-8586 8591-8593 861 8621-8624 8628-8631 8633-8642
8649-8652 8654 8656-8657 8661-8667 8669-8671 8677-8679 8681-8687 8702-8709 871 8721-8728 8731-8735 8741-8745
8751-8754 8756 8761-8762 8764-8766 8771-8774 8781-8785 8801-8803 8805-8809 881 8821-8825 8841 8845-8847 8851
8856-8858 8860-8862 8867-8869 89 906 9070-9078 9080-9094 9097 9099 9101-9107 911 9120 9122-9123 9126-9129 9131-9135
9141-9149 9151-9158 9161-9167 9170-9199 9201-9209 921 9220-9223 9225 9227-9229 9231-9236 9238 9241-9246 9251-9257
9260-9289 9292-9295 9302-9303 9305-9307 931 9321 9323-9326 9331-9360 9363-9367 9369 9371-9378 9381-9386 9391-9398
9401-9409 941 9420-9424 9426-9429 9431 9433-9436 9438-9439 9441-9448 9451-9454 9461-9469 9471-9474 9480-9482
9484 9491-9493 9495 9497-9499 9502-9505 951 9521-9529 9531-9536 9542-9549 9551-9556 9560-9569 9571-9576 9602-9608
961 9621-9622 9624-9628 9631-9639 9641-9648 9651-9659 9661-9666 9671-9677 9681-9683 9701 9704 9708 971 9720-9729
9732-9738 9741-9742 9744-9749 9761-9766 9771-9779 9802-9805 981 9820 9822-9829 9831-9837 9841-9848 9851-9857
9861 9865 9867-9869 9871-9876 9901 9903-9908 991 9920-9929 9931-9933 9935-9938 9941-9948 9951-9956 9961-9966
9971-9978 );
</super>
Yves / DeMerphq
---
Writing a good benchmark isn't as easy as it might look.
|
Interesting data regarding Germany -- I had no idea.
It was for precisely this sort of reason, however, that I made no attempt to try and figure out area codes for numbers from various countries around the world. The result of this parsing could easily be passed along to a country-specific module for more appropriate parsing and beautification.
There are a couple of things to point out here that I did not mention in the article (due to the 64k limit on nodes). I made no attempt to parse valid IDD prefixes even though lists for each country are available on the net. The reason is that the IDD prefixes are not mutually exclusive with the Country Codes. Nor, unfortunately or unexpectedly, are area codes for a particular locale.
This reality produces ambiguous areas where I could be slurping up an area code or IDD as a country code. What's needed in *that* case is some concept of the natural phone number length for that locality. Rather than get that specific, though, I relied on a threshold length and size percentages as measured against the remainder of the number. It's not perfect, but for my data set it worked surprisingly well.
Given your information about the variability of German numbers, particularly the 14-digit monsters, this technique might fail if the area/province codes happen to match valid country codes elsewhere. This of course only applies to numbers that are presented *without* their Country Code.
Once this code has its hands on what it thinks is the local number, it's just stored as a single number. I chunk it for display purposes, but generically and in a U.S.-centric kind of way: 4 digits on the suffix, preceded by groups of three digits as long as there are digits left.
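The display chunking just described can be sketched roughly as follows. This is a minimal illustration, not the code from PhoneNumber.pm; the name `chunk_digits` is hypothetical, and it assumes the input has already been reduced to the local number.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PhoneNumber.pm): split a local number
# into a 4-digit suffix preceded by groups of three, working right to left.
sub chunk_digits {
    my ($number) = @_;
    (my $digits = $number) =~ s/\D//g;          # keep digits only
    return $digits if length($digits) <= 4;     # nothing to chunk

    my $suffix = substr($digits, -4);           # last four digits
    my $rest   = substr($digits, 0, -4);        # everything before them
    my @groups;
    while (length $rest) {
        # The leading group may end up shorter than three digits.
        my $take = length($rest) >= 3 ? 3 : length($rest);
        unshift @groups, substr($rest, -$take);
        $rest = substr($rest, 0, length($rest) - $take);
    }
    return join ' ', @groups, $suffix;
}

print chunk_digits('12345678'), "\n";   # prints "1 234 5678"
```

Working from the right keeps the suffix fixed at four digits, so any odd-sized group lands at the front of the number.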
Also, keep in mind that this code is intended to operate on raw, unrestricted data fields. Typos, blippoes (???? or xxxx) and all of it are present in the data. There's not a whole lot you can do in these cases to pull out a valid number without knowing in excruciating detail the particulars of the intended country.
GIGO, GIGO, it's off to the dumpster we go!
BTW, I suspect this code might take quite a while to run on 10 million numbers, even if they are well-behaved.
Thanks for the comments and feedback. Any thoughts on whether any of this should be CPAN-bound once cleaned up? (new names and POD, obviously, but beyond that...)
Matt
Re: Beast of the Number: Parsing the Feral Phone
by strat (Canon) on Apr 17, 2002 at 14:26 UTC
|
I've just seen a recommendation for telephone numbers. It is called E.164, from the ITU-T (Telecommunication Standardization Sector of the ITU).
Due to copyright issues, I cannot post it here. But if you want additional input, you could contact the ITU and ask them for a copy.
Best regards,
perl -le "s==*F=e=>y~\*martinF~stronat~=>s~[^\w]~~g=>chop,print"
Re: Beast of the Number: Parsing the Feral Phone
by htoug (Deacon) on Jan 16, 2003 at 08:20 UTC
|
Just another 0.02€:
In Denmark there is no area code. Phone numbers are just 8-digit numbers with a possible extension (there is no standard for that; the format etc. depends on the local switchboard). The number is traditionally formatted as dd dd dd dd, but even that has begun to vary: some people use two groups of 3 digits followed by a 2-digit group, others two groups of 4 digits!
Earlier you could figure out which telephone exchange a number was attached to, but following (EU-instigated?) rule changes you can take your number with you when you move from one part of the country to another. Thus there is no area code, or rather all of Denmark (including cellular phones!) is in the same area.
Things do vary.
|
Interesting factoid about Denmark having no area codes.
This is why I do not bother attempting to interpret the core number once I have dealt with IDD codes, country codes, extensions, and various representations of multiple numbers. Figuring out an area code is beyond the scope of this set of tools -- it does, however, make a huge step in providing a base phone number suitable for interpretation by a module tailored to a particular country or region.
I admit that my *display* functionality is US-centric. It chunks the core number with 4 digits on the suffix, preceded by groups of three (or fewer for the leading group). So in your example, dd dd dd dd would come out looking like d ddd dddd.
That's just cosmetic, however. The internal representation makes no distinction for area codes of any sort. The PhoneNumber.pm module can be provided with new chunked_number() and as_string() methods suitable for any locale. If I ever put it on CPAN I would attempt to structure it so that subclasses could easily provide for internationalization (perhaps based on country code, but with a default format for the local region).
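As a rough sketch of what such a subclass might look like: the `Sketch::PhoneNumber` base class below is a hypothetical stand-in written only for this example (its constructor and its `country` and `number` fields are assumptions, not the real PhoneNumber.pm interface). Only the method names chunked_number() and as_string() come from the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the base class; the real PhoneNumber.pm
# constructor and internal fields are not shown here and may differ.
package Sketch::PhoneNumber;

sub new {
    my ($class, %args) = @_;
    return bless { country => $args{country}, number => $args{number} }, $class;
}

# Default display: 4-digit suffix preceded by groups of three.
sub chunked_number {
    my ($self) = @_;
    my $digits = $self->{number};
    return ($digits) if length($digits) <= 4;
    my @groups = (substr $digits, -4);
    my $rest   = substr $digits, 0, -4;
    while (length $rest) {
        my $take = length($rest) < 3 ? length($rest) : 3;
        unshift @groups, substr($rest, -$take);
        substr($rest, -$take) = '';          # drop the group just taken
    }
    return @groups;
}

sub as_string {
    my ($self) = @_;
    my $local = join ' ', $self->chunked_number;
    return defined $self->{country} ? "+$self->{country} $local" : $local;
}

# Locale-specific subclass: only the chunking rule changes.
package Sketch::PhoneNumber::DK;
our @ISA = ('Sketch::PhoneNumber');

# Traditional Danish grouping from the post above: pairs of digits.
sub chunked_number {
    my ($self) = @_;
    return $self->{number} =~ /(\d{1,2})/g;
}

package main;

my $dk = Sketch::PhoneNumber::DK->new(country => 45, number => '12345678');
print $dk->as_string, "\n";   # prints "+45 12 34 56 78"
```

Because as_string() only calls chunked_number(), a locale subclass needs to override a single method, and the internal representation stays a plain digit string.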
(it's worth repeating that the actual display of these numbers was more of an afterthought -- the main thrust is the normalization and parsing of unverified and unruly international phone number strings)
Matt
Re: Beast of the Number: Parsing the Feral Phone
by Abigail-II (Bishop) on Jan 16, 2003 at 09:35 UTC
|
Interesting stuff. I've been considering adding phone numbers to Regexp::Common, and this work might be helpful.
Abigail
|
I was just looking at Regexp::Common, and noticed that phone numbers are still on the TODO list according to the POD. Do you know if phone numbers are any closer to being added to that module?