http://qs321.pair.com?node_id=219306

Tanalis has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I'm writing a reporting script at the moment that needs to handle very large numbers and output them in an easy-to-read way. I've worked out a way to "commify" a number - taking, for example, 1234567890 and returning 1,234,567,890.

The code that does this involves a regexp and a while loop, and while it works fine, it's not particularly clean or efficient:

sub commify {
    my $number = shift;
    while ($number =~ s/(^-*\d*\d)(\d{3})/$1,$2/) { ; }
    return $number;
}

I'm interested to know if there's a "simpler" way to do this - ideally getting rid of the super-hideous while loop. I should point out I've tried adding /g to the regexp, but this doesn't have the same effect as the while loop - the pattern matches the entire string each time, and simply replaces the last 3 digits with a comma followed by the digits.
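To make the difference concrete, here is a minimal sketch that runs the pattern from the sub above once with /g and once in a loop. The variable names are illustrative only and are not part of the original script.

```perl
use strict;
use warnings;

my $pattern = qr/(^-*\d*\d)(\d{3})/;

# Single pass with /g: the greedy \d* swallows everything except the
# last three digits, and the ^ anchor cannot match again afterwards,
# so only one comma is ever inserted.
(my $once = '1234567890') =~ s/$pattern/$1,$2/g;

# Looping re-runs the match from the start of the string each time, so
# the greedy prefix stops one group earlier on every pass.
my $looped = '1234567890';
1 while $looped =~ s/$pattern/$1,$2/;

print "with /g:   $once\n";      # 1234567,890
print "with loop: $looped\n";    # 1,234,567,890
```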

Any ideas/suggestions would be useful.

Thanks ..
-- Foxcub

Re: "Commifying" a number
by BrowserUk (Patriarch) on Dec 12, 2002 at 16:03 UTC

    #! perl -sw
    use strict;

    my $re_commify = qr/^([+-]?\d+?)(\d{3})(?>((?:,|\.|$)\d?))(.*)$/;

    for (-12..7) {
        my $num= eval("-123456789E$_");
        printf '%31.13f commified becomes ', $num;
        1 while $num =~ s/$re_commify/$1,$2$3$4/;
        printf "%s\n", $num;
    }
    __END__
    c:\test>commify
    -0.0001234567890 commified becomes -0.000123456789
    -0.0012345678900 commified becomes -0.00123456789
    -0.0123456789000 commified becomes -0.0123456789
    -0.1234567890000 commified becomes -0.123456789
    -1.2345678900000 commified becomes -1.23456789
    -12.3456789000000 commified becomes -12.3456789
    -123.4567890000000 commified becomes -123.456789
    -1234.5678900000000 commified becomes -1,234.56789
    -12345.6789000000010 commified becomes -12,345.6789
    -123456.7890000000000 commified becomes -123,456.789
    -1234567.8899999999000 commified becomes -1,234,567.89
    -12345678.9000000000000 commified becomes -12,345,678.9
    -123456789.0000000000000 commified becomes -123,456,789
    -1234567890.0000000000000 commified becomes -1,234,567,890
    -12345678900.0000000000000 commified becomes -12,345,678,900
    -123456789000.0000000000000 commified becomes -123,456,789,000
    -1234567890000.0000000000000 commified becomes -1,234,567,890,000
    -12345678900000.0000000000000 commified becomes -12,345,678,900,000
    -123456789000000.0000000000000 commified becomes -123,456,789,000,000
    -1234567890000000.0000000000000 commified becomes -1.23456789e+015
    c:\test>

    Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
    Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
    Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
    Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

      Ah, yes. Indeed. Perhaps this should be obfuscated into a script called three_mile_island.pl ?

      ;)
      Matt

Re: "Commifying" a number
by BlueBlazerRegular (Friar) on Dec 12, 2002 at 21:45 UTC

    From the 'Perl Cookbook' (pages 64-65):

    sub commify {
        my $text = reverse $_[0];
        $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
        return scalar reverse $text;
    }

    Is this more efficient? Heck, I don't know, but since no one had mentioned it...

    Pat
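For readers wondering how the reversed substitution in the Cookbook sub avoids touching the fraction, the following sketch simply exposes the intermediate strings. The variable names and the sample number are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $number   = '-1234567.891';
my $reversed = reverse $number;    # '198.7654321-'

# In the reversed string the integer digits come after the '.'.
# (?=\d) refuses a comma after the final group of digits, and
# (?!\d*\.) skips any group that still has a '.' ahead of it,
# which is exactly the (reversed) fractional part.
(my $grouped = $reversed) =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;

my $result = reverse $grouped;

print "$reversed\n$grouped\n$result\n";
```

Reversing first means the groups of three can be counted from the left of the string, which is the direction the regex engine scans in anyway.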

Re: "Commifying" a number
by hiseldl (Priest) on Dec 12, 2002 at 16:49 UTC

      I'm interested to know if there's a "simpler" way to do this - ideally getting rid of the super-hideous while loop

    How about a recursive solution?

    sub commify {
        my($num) = @_;
        return $num if length($num)<4;
        return commify(substr($num,0,-3)).",".substr($num,-3,3);
    }
    ...no super-hideous while loop there, and no regex compilation either. ;-)

    The only downside is that this only works for non-negative integers: a leading minus sign counts toward the length, so -123 comes back as -,123.

    --
    hiseldl
    What time is it? It's Camel Time!
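One possible way to lift the integers-only restriction of the recursive solution above is to peel off the sign and the fraction first and recurse only on the integer digits. This is a sketch, not part of the original reply; group_digits and commify_signed are hypothetical names.

```perl
use strict;
use warnings;

# Recursive grouping of a plain digit string, as in the reply above.
sub group_digits {
    my ($digits) = @_;
    return $digits if length($digits) < 4;
    return group_digits(substr($digits, 0, -3)) . ',' . substr($digits, -3);
}

# Hypothetical wrapper: split off sign and fraction, group the rest.
sub commify_signed {
    my ($num) = @_;
    my ($sign, $int, $frac) = $num =~ /\A([+-]?)(\d+)((?:\.\d*)?)\z/
        or return $num;    # leave anything unrecognised untouched
    return $sign . group_digits($int) . $frac;
}

print commify_signed($_), "\n" for qw(-123 1234567 -1234567.891 1e10);
```

Input that is not a plain decimal number, such as 1e10, is returned unchanged rather than guessed at.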

Re: "Commifying" a number
by sauoq (Abbot) on Dec 12, 2002 at 15:48 UTC

    Another way to do it... (but only works on newer perls with lookbehind.) I didn't realize it was a FAQ when I needed to do it. The theme is the same though. Repeatedly make substitutions until there are none left to make.

    1 while ( s/(?<!\b)(\d{3})(?:,|$)/,$1/ );

    # Explained
    1 while (                 # Repeat until no more matches are made.
        s/(?<!\b)(\d{3})      # Match any three digits not preceded by a word break.
          (?:,|$)             # That come immediately before a comma or the end.
         /,$1/x               # And insert a comma before them.
    );
    -sauoq
    "My two cents aren't worth a dime.";
    

      Ahem :). This doesn't appear to work correctly on numbers with decimal places?



        sauoq sighs.

        True. It only works on integers. I should have said so.

        How about this...

        s/(\d+)/$_=$1;1while(s|(?<!\b)(\d{3})(?:\b)|,$1|);$_/e

        :-P

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: "Commifying" a number
by jreades (Friar) on Dec 12, 2002 at 13:04 UTC

    The closest that I can come, and I don't know if this fits the bill of being simpler, is this:

    use strict;

    my $number = 1234567890;
    print STDOUT "In: $number\n";

    my $reversed = reverse($number);
    print STDOUT "Reversed: $reversed\n";

    my @array = split/(\d{3})/, $reversed;
    shift @array;
    print STDOUT "Array: '" . join ("' - '", @array) . "'\n";

    my $out = join("", map { $_ eq '' ? ',' : reverse($_) } @array);
    print STDOUT "Out: " . reverse($out) . "\n";

    exit 0;

    This prints out:

    In: 1234567890
    Reversed: 0987654321
    Array: '098' - '' - '765' - '' - '432' - '1'
    Out: 1234,567,890

    Of course, this lends itself to a one-liner, and although I'm not sure if you can quite get there you can certainly reduce the number of lines. Also note that it falls flat on its face if you try to use a decimal or floating-point number.

    Maybe printf/sprintf can do what you want more simply than this?

    Update -- I got the right result by doing the following:

    my $out = substr(reverse(join("", map { $_ eq "" ? '' : $_ . ',' } split/(\d{3})/, reverse($number))), 1);

    Readable? No. Usable? Barely. No while loop? Check.

Re: "Commifying" a number
by PodMaster (Abbot) on Dec 12, 2002 at 15:31 UTC
    I once wrote this little thing, for a file listing script. It doesn't take into account decimals or negative numbers, since file sizes are always positive (should be ;)

    local $\="\n";

    sub HUMANo {
        my $n = shift;
        my $c = 3;
        my $Ln = length($n);
        return $n if $Ln <= 3;
        # Check the current length on every pass: it grows by one with
        # each inserted comma, and numbers of 10+ digits depend on that.
        while($c < length($n)) {
            substr($n, - $c, 0, ','); # insert
            $c += 4;
        }
        return $n;
    }

    print HUMANo($_) for qw[ 1 11 111 1111 11111 111111 1111111 ];
    __END__
    1
    11
    111
    1,111
    11,111
    111,111
    1,111,111
    Doesn't look any simpler to me though.


    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      Empty files have a size of 0 bytes. And 0 isn't a positive number. ;-)

      Abigail

Re: "Commifying" a number
by John M. Dlugosz (Monsignor) on Dec 12, 2002 at 16:45 UTC
    I think a commify function shouldn't be so simple. It needs to look at the current rules for formatting a number, using the specified separator character and digit group sizes. Currency and other numbers may have different rules.
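As a rough illustration of the point above, here is a sketch in which the separator, decimal point and group size are parameters instead of being hard-coded. The function name format_grouped and its option names are hypothetical, and only uniform group sizes are handled. A locale-aware version could presumably take its values from POSIX::localeconv(), which is not attempted here.

```perl
use strict;
use warnings;

# Hypothetical helper: formatting rules are passed in by the caller.
sub format_grouped {
    my ($number, %opt) = @_;
    my $sep   = defined $opt{sep}   ? $opt{sep}   : ',';
    my $point = defined $opt{point} ? $opt{point} : '.';
    my $size  = $opt{size} || 3;

    my ($sign, $int, $frac) = $number =~ /\A([+-]?)(\d+)(?:\.(\d+))?\z/
        or return $number;    # not a plain decimal number, leave it alone

    # Insert the separator wherever a whole number of groups remains.
    $int =~ s/(\d)(?=(?:\d{$size})+\z)/$1$sep/g;

    return $sign . $int . (defined $frac ? $point . $frac : '');
}

print format_grouped('1234567.89'), "\n";                            # 1,234,567.89
print format_grouped('1234567.89', sep => '.', point => ','), "\n";  # 1.234.567,89
print format_grouped('123456789', sep => ' ', size => 4), "\n";      # 1 2345 6789
```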
Re: "Commifying" a number
by MarkM (Curate) on Dec 13, 2002 at 06:49 UTC

    I decided to take up your challenge. Create a single regular expression to 'commify' a number. Here is my entry:

        $number =~ s/(\d+?)(?=(?:\d{3})+(?:\.|\z))/$1,/g;

    Can anybody spot a case where this would not work?

    Since the regular expression is looping in C code, rather than Perl code, I expect this solution to execute faster.
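The speed expectation above can be checked rather than assumed. The sketch below uses the core Benchmark module to compare the looping substitution from the original question with the single /g substitution; the sub names and the sample input are illustrative, and no particular outcome is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = '1234567890123';

# The looping substitution from the original question.
sub loop_commify {
    my $n = shift;
    1 while $n =~ s/(^-*\d*\d)(\d{3})/$1,$2/;
    return $n;
}

# The single /g substitution proposed above (integers only).
sub single_commify {
    my $n = shift;
    $n =~ s/(\d+?)(?=(?:\d{3})+(?:\.|\z))/$1,/g;
    return $n;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    loop   => sub { loop_commify($input) },
    single => sub { single_commify($input) },
});
```

Results will vary with the Perl version and the length of the input, so it is worth running with numbers typical of the actual report.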

      Sorry, but yes. It doesn't handle decimal places. Works well on integers though.

      C:\test>commify
      -0.0001234567890 commified becomes -0.000,123,456,789
      -0.0012345678900 commified becomes -0.00,123,456,789
      -0.0123456789000 commified becomes -0.0,123,456,789
      -0.1234567890000 commified becomes -0.123,456,789
      -1.2345678900000 commified becomes -1.23,456,789
      -12.3456789000000 commified becomes -12.3,456,789
      -123.4567890000000 commified becomes -123.456,789
      -1234.5678900000000 commified becomes -1,234.56,789
      -12345.6789000000010 commified becomes -12,345.6,789
      -123456.7890000000000 commified becomes -123,456.789
      -1234567.8899999999000 commified becomes -1,234,567.89
      -12345678.9000000000000 commified becomes -12,345,678.9
      -123456789.0000000000000 commified becomes -123,456,789
      -1234567890.0000000000000 commified becomes -1,234,567,890
      -12345678900.0000000000000 commified becomes -12,345,678,900
      -123456789000.0000000000000 commified becomes -123,456,789,000
      -1234567890000.0000000000000 commified becomes -1,234,567,890,000
      -12345678900000.0000000000000 commified becomes -12,345,678,900,000
      -123456789000000.0000000000000 commified becomes -123,456,789,000,000
      -1234567890000000.0000000000000 commified becomes -1.23456789e+015
      C:\test>

      Examine what is said, not who speaks.

Re: "Commifying" a number
by MarkM (Curate) on Dec 13, 2002 at 16:29 UTC

    I only realized that it incorrectly commifies decimal numbers after lying in bed for a few minutes, and I was too tired to get up and enhance it to fix the problem... :-)

    Try the following:

        $number =~ s/((?:\A[+-]?|\G)\d+?)(?=(?:\d{3})+(?:\.|\z))/$1,/g;

    Proof that RE's can be fun...

Re: "Commifying" a number
by Aristotle (Chancellor) on Dec 15, 2002 at 00:23 UTC
    Clearly a case for sexeger.
    sub commify {
        local $_ = reverse shift;
        /\./g;
        s/\G(\d{3})(\d)/$1,$2/g;
        scalar reverse $_
    }

    Makeshifts last the longest.

      I agree that this is a case for sexeger.

      I ran your 'commify' sub through BrowserUk's commify tester above, and noticed that it could use a zero-width positive lookahead assertion (?=\d+) instead of the direct match of a digit after the (\d{3})(\d) (that almost works, but leaves some commification with 4 digits between commas). And, since it uses a lookahead, the final $2 can be removed from the substitution. Here's my version of commify sexeger...

      sub commify {
          local $_ = reverse shift;
          /\./g;
          s/\G(\d{3})(?=\d+)/$1,/g;
          return scalar reverse $_;
      }

      Here's the results with the updated sexeger:

      $ perl commify.pl
      -0.0001234567890 commified becomes -0.000123456789
      -0.0012345678900 commified becomes -0.00123456789
      -0.0123456789000 commified becomes -0.0123456789
      -0.1234567890000 commified becomes -0.123456789
      -1.2345678900000 commified becomes -1.23456789
      -12.3456789000000 commified becomes -12.3456789
      -123.4567890000000 commified becomes -123.456789
      -1234.5678900000000 commified becomes -1,234.56789
      -12345.6789000000008 commified becomes -12,345.6789
      -123456.7890000000043 commified becomes -123,456.789
      -1234567.8899999998976 commified becomes -1,234,567.89
      -12345678.9000000003725 commified becomes -12,345,678.9
      -123456789.0000000000000 commified becomes -123,456,789
      -1234567890.0000000000000 commified becomes -1,234,567,890
      -12345678900.0000000000000 commified becomes -12,345,678,900
      -123456789000.0000000000000 commified becomes -123,456,789,000
      -1234567890000.0000000000000 commified becomes -1,234,567,890,000
      -12345678900000.0000000000000 commified becomes -12,345,678,900,000
      -123456789000000.0000000000000 commified becomes -123,456,789,000,000
      -1234567890000000.0000000000000 commified becomes -1.23456789e+15

      Cheers!

      --
      hiseldl
      What time is it? It's Camel Time!

        that almost works

        Nice catch; should have been obvious when I wrote it.

        Btw, your lookahead assertion needn’t be quantified: (?=\d) will work exactly as (?=\d+) does, but won’t waste time matching extra digits.

        Makeshifts last the longest.
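The claim that (?=\d) behaves exactly like (?=\d+) here can be spot-checked with a small side-by-side run. This is only a sketch: the two sub names and the list of test inputs are assumptions, and a handful of samples is not a proof.

```perl
use strict;
use warnings;

# Lookahead as posted in the reply: (?=\d+)
sub commify_plus {
    local $_ = reverse shift;
    /\./g;                        # leave pos() just past the reversed decimal point
    s/\G(\d{3})(?=\d+)/$1,/g;
    return scalar reverse $_;
}

# Suggested simplification: (?=\d)
sub commify_single {
    local $_ = reverse shift;
    /\./g;
    s/\G(\d{3})(?=\d)/$1,/g;
    return scalar reverse $_;
}

for my $n (qw(0 999 1000 -1234567 1234.56789 -0.000123456789)) {
    my ($plus, $single) = (commify_plus($n), commify_single($n));
    printf "%-16s %-18s %s\n", $n, $single, $plus eq $single ? 'same' : 'DIFFERENT';
}
```

Numbers are passed as strings so that floating-point stringification does not affect the comparison.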

Re: "Commifying" a number
by Abigail-II (Bishop) on Dec 12, 2002 at 12:51 UTC
    Read the FAQ.

    Abigail

      Update: Unwhoops, I thought I'd seen RE: regexp for adding commas to a number before .. it works fine for anything over +/- 1000, but fails for 3 digit numbers or less (ie, if it can't commify, it won't return anything).

      If anyone can think of a fix to that regexp, that'd be good too .. *grins*.

      As far as the specific Q&A page goes, I'm not sure that it's relevant anyway - I thought I was asking something fairly specific (suggestions for an improvement to something I already have, not new code). Maybe I'm wrong ..

      -----

      Eep .. whoops, my mistake. I'd searched the FAQ, which turned up nothing, but evidently word order plays a bigger part than I realised .. :)

      Thanks anyway ..
      --Foxcub

        Listen to Abigail, it is indeed FAQ.

        `perldoc -q comma' yields

        Found in C:\Perl\lib\pod\perlfaq5.pod
          How can I output my numbers with commas added?

            This one will do it for you:

                sub commify {
                    local $_ = shift;
                    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
                    return $_;
                }

                $n = 23659019423.2331;
                print "GOT: ", commify($n), "\n";

                GOT: 23,659,019,423.2331

            You can't just:

                s/^([-+]?\d+)(\d{3})/$1,$2/g;

            because you have to put the comma in and then recalculate your position.

            Alternatively, this code commifies all numbers in a line regardless of whether they have decimal portions, are preceded by + or -, or whatever:

                # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca>
                sub commify {
                    my $input = shift;
                    $input = reverse $input;
                    $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
                    return scalar reverse $input;
                }

        How to RTFM is a wonderful guide which will introduce you to various perl resources.

