Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re: Factorial algorithm execution time

by greenback (Initiate)
on Oct 25, 2002 at 17:21 UTC ( [id://208090]=note: print w/replies, xml ) Need Help??


in reply to Factorial algorithm execution time

A more robust algorithm: using the Lanczos approximation, calculate the natural log of Gamma(n+1) and exponentiate the result. You can simply cache n! for n < 30 or so.

Replies are listed 'Best First'.
Re: Re: Factorial algorithm execution time
by Anonymous Monk on Oct 26, 2002 at 18:39 UTC
    Define robust.

    Also isn't exponentiating the log of n! and carrying accuracy to the last integer is as much or more work than calculating n! in the first place? If you just want the first k digits, the approximation is better. Most of us don't.

    That said, the approximations are good for order of magnitude estimates. Understanding the basic approximations is easier than most think. Watch:

    log(n!) = log(1) + log(2) + ... + log(n) = 1/2 log(1) + (log(1) + log(2))/2 + (log(2) + log(3))/2 + ... + (log(n-1) + log(n))/2 + 1/2 log(n)
    But note that (log(i) + log(i+1))/2 is an approximation of the integral of log(x) from i to i+1. If we define the error of that approximation to be e(i), then we get:
    log(n!) = 1/2 0 + Integral from 1 to n of log(x) + 1/2 log(n) + e(1) + e(2) + ... + e(n-1)
    Which looks like a mess. What have we done other than obfuscation and complication?

    The answer is, "Quite a bit." The goal of approximation techniques is to take something, divide it into a few big terms, understand those terms, a big mess, show that the mess isn't important. This is what we have done.

    That integral is a single huge term. But it is one we understand. The integral of log(x) is x*log(x)-x. So that integral is n*log(n)-n-(1*log(1)-1) = n*log(n)-n+1. log(n)/2 is the next biggest thing around. And finally we have those e()'s. We think that they are small. But how to show that? It turns out that the error in a trapezoidal approximation is w**3 * f''(t)/12 where w is the width of the interval and t is a point in it. (This is reasonable - the error in a linear approximation is tied to what the function's bend does to the integral.) The second derivative of log(x) is -1/(x*x), so e(i) is O(1/(i*i)). If you remember your infinite series, that means that the infinite sum e(1)+e(2)+e(3)+... converges to some number, call it E. And the partial sum from n out to infinity is O(1/n) from that number.

    Remember the principle of approximation? Get a few big understood terms, at the cost adding lots of messy small ones? We are about to do that where "lots" is an infinite number! Taking the above we get:

    log(n!) = 1/2 0 + Integral from 1 to n of log(x) + 1/2 log(n) + e(1) + e(2) + ... + e(n-1) = n*log(n) - n + 1 - 1/2 log(n) + (e(1) + e(2) + e(3) + ...) - (e(n) + e(n+1) + e(n+2) + ...) = n*log(n) - n + log(n)/2 + E + 1 - e(n) - e(n+1) - e(n+2) - ...
    Defining exp(E+1) to be K, we get:
    n! = K * sqrt(n) + n**n * (1 + O(1/n)) / e**n
    And from here the interested reader (if there are any) can go in two directions. The first is study the error terms. The second is to figure out K.

    The first is straightforward but messy. Just look at the infinite sum as an approximation to an integral. Write it as that integral and associated error terms. Get another term of the approximation, with smaller and more complex error terms. Wash rinse and repeat to your heart's desire.

    The second requires cleverness. I suggest sticking this approximation into the binomial expansion, and compare with the central limit theorem's approximation to show that K has to be 1/sqrt(2*pi). If you are clever enough about it you can actually derive the central limit theorem for the binomial expansion up to the constant, then show that the only value of K which makes the terms for (1+1)**n come out to 2**n for large n is 1/sqrt(2*pi).

    These are left as exercises because I am tired of typing. :-)

      Why do you think that truncating Stirling's series is preferable to approximating the more accurate Lanczos asymptotic series (which is valid for Re(n) > 0)? It is more 'robust' to calculate the logarithm of the gamma function because you are less prone to overflow for large values of n.
        I didn't say it was preferable. I tried to explain how you can derive one of these approximations.

        As for robustness, all of the solutions presented in this discussion use big integer arithmetic. Until you run out of memory, they are precise to the last digit.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://208090]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others meditating upon the Monastery: (3)
As of 2024-04-25 22:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found