harryf has asked for the wisdom of the Perl Monks concerning the following question:
Just an obsessive question - is it possible to do this with a single regex search/replace?
I'm formatting a number (always a whole number) to insert a comma every three digits from the right of the number, so e.g. 1234 becomes 1,234 and 12345678 becomes 12,345,678.
Able to do it with successive matches and concatenation, but wondering if this can be done with a single regex replace?
sub commas {
    my $num = shift;
    my $f = '';
    while ( my ($r, $m) = $num =~ m/([0-9]+)([0-9]{3})$/ ) {
        $f = $r ? ",$m$f" : "$m$f";
        $num = $r;
    }
    return "$num$f";
}
printf("%s\n",commas(12345678)); # 12,345,678
Know there's Number::Format but, like I say - just being obsessive in search of a one-liner (and wanting to avoid a dependency as well).
Re: Formatting a number
by Corion (Patriarch) on Oct 16, 2006 at 12:42 UTC
|
Thanks - missed that (and I guess I should have searched harder as this is a common problem: apologies).
Nice regex though - finally see the point of \G ;)
$num =~ s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
Thanks.
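For anyone else puzzling over how that substitution works, here is a sketch of the same regex spread out with the /x flag and commented. The sub name commify_annotated is only an illustrative wrapper, not something from the thread, and the comments are one reading of the pattern rather than an authoritative explanation.

```perl
use strict;
use warnings;

# Hypothetical wrapper name; the pattern is the one quoted above,
# only reformatted with /x so each piece can carry a comment.
sub commify_annotated {
    my $num = shift;
    $num =~ s{
        (                        # capture whatever should get a comma after it
            ^ [-+]? \d+?         # optional sign, then as few leading digits as possible,
            (?=                  # as long as the rest of the digits
                (?> (?:\d{3})+ ) # form one or more complete groups of three
                (?! \d )         # with no digit left over
            )
          |                      # or else
            \G \d{3}             # the three digits right where the last match ended,
            (?= \d )             # as long as at least one more digit follows
        )
    }{$1,}gx;
    return $num;
}

print commify_annotated($_), "\n" for qw{123 1234 -1234567 12345678};
```

The first alternative can only match once, at the start of the string, and peels off the short leading group. After that, \G pins each further match to the spot where the previous one stopped, so the remaining digits are consumed three at a time.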
Re: Formatting a number
by holli (Abbot) on Oct 16, 2006 at 13:23 UTC
use Number::Format;
my $de = Number::Format->new(-thousands_sep   => '.',
                             -decimal_point   => ',',
                             -int_curr_symbol => 'DEM');
my $formatted = $de->format_number($number);
And while you're at it, have a look at Number::Format::Calc.
Re: Formatting a number
by PreferredUserName (Pilgrim) on Oct 16, 2006 at 12:44 UTC
Not a one-liner, but I've always used something like this:
sub commafy
{
    my $num = reverse shift;
    my @threes = $num =~ /(.{1,3})/g;
    return scalar reverse join ',', @threes;
}
|
You can speed that up using unpack:
sub commafy
{
    return scalar reverse join ',', unpack '(a3)*', reverse shift;
}
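To make the unpack step easier to follow, here is a small sketch that prints the intermediate chunks. The variable names are illustrative only, and it assumes a Perl recent enough to support parenthesised groups in unpack templates (5.8 or later).

```perl
use strict;
use warnings;

# '(a3)*' repeats the group "a3" (three characters) until the string
# runs out; the final chunk may be shorter than three characters.
my @chunks = unpack '(a3)*', scalar reverse '12345678';
print "@chunks\n";       # prints: 876 543 21

# Join the chunks with commas, then reverse the whole string back.
my $formatted = scalar reverse join ',', @chunks;
print "$formatted\n";    # prints: 12,345,678
```

Because the string is reversed first, the short chunk ends up at the most significant end, which is where it belongs once everything is reversed back.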
Re: Formatting a number
by Molt (Chaplain) on Oct 16, 2006 at 12:52 UTC
You're going to have a lot of people doing a lot of different versions for this.. and here's mine!
#!/usr/bin/perl
use strict;
use warnings;
print (numformat(1000000),"\n");
print (numformat(123),"\n");
print (numformat(12),"\n");
sub numformat {
    my ($num) = @_;
    die "'$num' is not a simple integer" unless $num =~ /^\d+$/;
    while ($num =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/) {};
    return $num;
}
Re: Formatting a number
by cephas (Pilgrim) on Oct 16, 2006 at 12:51 UTC
Super search for 'commas' turned up this thread.
Re: Formatting a number
by johngg (Canon) on Oct 16, 2006 at 18:34 UTC
I've run some benchmarks against some of the solutions given and it turns out that the one I gave here using split/reverse/map/reverse performs like a dog. However, I found another way with substr that does much better. Here are the benchmarks
         Rate  Split   Molt    FAQ Cdarke PrefUN Substr
Split  8.00/s     --   -15%   -31%   -39%   -53%   -74%
Molt   9.36/s    17%     --   -19%   -29%   -45%   -70%
FAQ    11.5/s    44%    23%     --   -12%   -32%   -63%
Cdarke 13.2/s    64%    41%    14%     --   -22%   -58%
PrefUN 16.9/s   111%    80%    47%    28%     --   -46%
Substr 31.3/s   291%   234%   171%   138%    85%     --
and here is the benchmark code
I hope this is of interest.
Cheers, JohnGG
Update: New benchmarks incorporating corrected split/reverse/map/reverse routine, thanks jwkrahn, and also jwkrahn's improvement to PreferredUserName's solution.
          Rate   Split    Molt     FAQ  Cdarke  PrefUN Jwkrahn  Substr
Split   7.85/s      --    -16%    -31%    -41%    -54%    -70%    -75%
Molt    9.31/s     19%      --    -19%    -31%    -45%    -64%    -70%
FAQ     11.4/s     46%     23%      --    -15%    -32%    -56%    -63%
Cdarke  13.4/s     71%     44%     17%      --    -21%    -48%    -57%
PrefUN  16.9/s    115%     81%     48%     26%      --    -35%    -45%
Jwkrahn 25.9/s    230%    178%    126%     93%     53%      --    -16%
Substr  30.9/s    293%    231%    170%    130%     83%     19%      --
Code in readmore tags revised.
|
You can shave a few points off of your substr sub by taking the abs and length functions out of the loop:
sub useSubstr
{
    my $number = shift;
    my $length = -( 1 + length $number );
    my $offset = -3;
    while ( $offset > $length )
    {
        substr $number, $offset, 0, q{,};
        $offset -= 4;
    }
    return $number;
}
|
Problem with that is the length is increasing as you insert commas from the right so it needs to be tested dynamically. Unfortunately, your modification breaks the routine.
use strict;
use warnings;
my @nos = qw{
1
12
123
1234
12345
123456
1234567
12345678
123456789
1234567890
12345678901
123456789012
1234567890123
12345678901234};
printf qq{%20s\n}, useSubstr($_) for @nos;
sub useSubstr
{
    my $number = shift;
    my $length = -( 1 + length $number );
    my $offset = -3;
    while ( $offset > $length )
    {
        substr $number, $offset, 0, q{,};
        $offset -= 4;
    }
    return $number;
}
produces
1
12
,123
1,234
12,345
123,456
1,234,567
12,345,678
123,456,789
1234,567,890
12,345,678,901
123,456,789,012
1234,567,890,123
12345,678,901,234
As you can see, it breaks out of the while loop too soon because it has not kept up with the increasing length of the number as the commas go in. I agree that it would be nice to factor abs and length out of the loop but can't quite see how to achieve it.
Cheers, JohnGG
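One possible way to take length out of the loop, offered as a sketch and not as one of the benchmarked routines: work out insertion points from the left using the original length. Each comma goes in to the right of every position still to be visited, so those positions never shift. The name use_substr_from_left is made up for illustration, and the sub assumes a plain string of digits with no sign or decimal point.

```perl
use strict;
use warnings;

# Hypothetical variant: positions are counted from the left, based on the
# original length, so length() is called once, before the loop.
sub use_substr_from_left {
    my $number = shift;
    my $pos = length($number) - 3;    # where the rightmost comma belongs
    while ( $pos > 0 ) {
        substr $number, $pos, 0, q{,};    # insert without replacing anything
        $pos -= 3;                        # next comma is three digits further left
    }
    return $number;
}

printf qq{%20s\n}, use_substr_from_left($_)
    for qw{1 12 123 1234 123456 1234567890};
```

The loop stops as soon as the position reaches zero or goes negative, which is what keeps a comma from landing in front of a three-, six- or nine-digit number.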
Re: Formatting a number
by cdarke (Prior) on Oct 16, 2006 at 12:51 UTC
my $num = reverse shift;
$num =~ s/(\d{3})/$1,/g;
$num = reverse $num;
Correction: The code I supplied places a comma at the front if the number of digits is divisible by 3. Need to append: $num =~ s/^,//; Now it's gone from ugly to really ugly - apologies.
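A possible alternative to the extra cleanup substitution, sketched here under the assumption that the input is a plain unsigned integer: only add a comma when another digit follows the group of three, using a lookahead. The sub name commify_reversed is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: same reverse/substitute/reverse idea, but the
# lookahead (?=\d) skips the final group, so no stray comma appears.
sub commify_reversed {
    my $num = reverse shift;           # scalar context: reverses the string
    $num =~ s/(\d{3})(?=\d)/$1,/g;     # comma only if more digits follow
    return scalar reverse $num;
}

print commify_reversed($_), "\n" for qw{12 123 1234 123456 12345678};
```

With 123456 the reversed string is 654321: the first group is followed by a digit and gets a comma, while the last group is followed by nothing and is left alone.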
Re: Formatting a number
by johngg (Canon) on Oct 16, 2006 at 14:26 UTC
This is not really answering the OP as it is not a one-liner nor does it use a regex. But, in the spirit of TIMTOWTDI
use strict;
use warnings;
printf qq{%20s\n}, comma3($_)
for (12, 1234, 1234567, 1234567890);
sub comma3
{
    my $ct = 0;
    join q{},
        reverse map {++ $ct % 3 ? $_ : ($_, q{,})}
        reverse split m{}, $_[0];
}
produces
12
1,234
1,234,567
1,234,567,890
Cheers, JohnGG
Update: Subroutine was erroneously putting commas at the front of numbers with lengths divisible by three, thanks jwkrahn for pointing this out. Revised subroutine (which is, of course, even slower) below
sub comma3
{
    my $ct = 0;
    my $len = length $_[0];
    return join q{},
        reverse map
        {
            ++ $ct % 3
                ? $_
                : $ct == $len
                    ? $_
                    : ($_, q{,})
        }
        reverse split m{}, $_[0];
}
Benchmarks will be updated in a bit.
|
$ perl -le'
my $number = 123456789;
my $ct = 0;
$number = join q{},
reverse map { ++$ct % 3 ? $_ : ( $_, q{,} ) }
reverse split //, $number;
print $number;
'
,123,456,789
Oops, you have an extra comma at the beginning.
|
So, not only slow but wrong, damn!
Re: Formatting a number
by jwkrahn (Abbot) on Oct 16, 2006 at 15:03 UTC
If your system supports it:
$ perl -le'$x = 123456789; print qx/printf "%\047d" $x/'
123,456,789
Hopefully the next version of Perl will support the ' flag for printf(3) :-)
Re: Formatting a number
by smokemachine (Hermit) on Oct 16, 2006 at 20:10 UTC
perl -e '$num=12345678;$num=~s/(\d{1,3})$//,$num2=(($num)?",":"").$1.$num2 while$num;print$num2'