Random number

by Moshambo (Initiate)
on Jul 09, 2000 at 15:43 UTC (#21692, perlquestion)

Moshambo has asked for the wisdom of the Perl Monks concerning the following question:

Someone once told me that you shouldn't use Perl's random number generator in CGI scripts. Is this true, and how else can you generate random numbers in a CGI script?

Replies are listed 'Best First'.
RE: Random number (CGI Security)
by Russ (Deacon) on Jul 10, 2000 at 00:58 UTC
    As some other people have pointed out, it depends on what you intend to do with your random number. More specifically, what is your random number going to protect?

    Let me give you a spectrum of examples. Suppose you are building a shopping cart website, and you need to assign a number to each customer who visits. Here are some possibilities (I will discuss the ramifications below):

    • If you are using a database (or some other target for DBI), you could just let an auto-incrementing ID provide your CustomerID.
    • You could use a self-built "random" number (like a combination of time + IP address (like Netscape did...OOOPS!))
    • You could use some variant of the customer's name or other information for the unique identifier.
    • You could use a pseudo-random number, like Perl's built-in rand()
    • You could use a "real" random number generator, like Math::TrulyRandom or /dev/random on most UNIX boxes.

    The problem with most of these solutions (in descending order of "wrongness") is that they are easy for a cracker/malicious individual to find, guess or generate.

    • An incrementing ID is obviously easy to figure out. Make an account, look at your ID, and try the ID below yours.
    • Self-built "random" numbers are also pretty easy to figure out. Knowing what time it is, and watching where someone is coming from will give the black hat a very small number of possibilities to try.
    • Letting the customer pick the key (either voluntarily, or generated from other information) may be easy to guess, especially when the malcontent can simply look at how you generated his key.
    • Pseudo-random numbers are much harder to figure out, but the very definition of pseudo-random numbers means that, given a few numbers generated from the same source, you can (relatively) easily know ALL of the numbers which will follow.
    • "Real" random number generators (like Math::TrulyRandom, /dev/random, etc.) get their numbers from a non-deterministic source (like interrupt timing discrepancies). This (theoretically) does not cause any predictability in the output.
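    To illustrate the point about pseudo-random numbers, here is a minimal sketch showing that rand() output is fully determined by its seed. The seed value 42 and the helper name first_rolls are made up for this illustration; a real attacker would have to recover the seed or generator state from observed output rather than being handed it.

```perl
use strict;
use warnings;

# Hypothetical helper: seed the generator, then return the first $count
# pseudo-random integers in 0 .. 999_999.
sub first_rolls {
    my ($seed, $count) = @_;
    srand($seed);
    return map { int(rand(1_000_000)) } 1 .. $count;
}

my @run_a = first_rolls(42, 5);
my @run_b = first_rolls(42, 5);

# Same seed, same sequence: whoever learns the seed knows every value.
print "run A: @run_a\n";
print "run B: @run_b\n";
print "identical\n" if "@run_a" eq "@run_b";
```

    The two runs print the same five numbers, which is exactly the property that makes rand() unsuitable for guarding anything.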

    Now, the second part of the equation is: "What data are you trying to protect?"

    If you are just going to display a random picture (like the "Monk icons" at the upper right of your screen), security is not your concern. Therefore, make the most efficient use of your time and use rand(). If a cracker guesses that the next icon to appear will be vroom's, who cares?

    If, however, you will be storing a customer's personal information (especially credit-card numbers) and allowing the user to view that information later, you would be shamefully negligent to use anything less than 128-bit or greater SSL, a truly random number for the CustomerID, and strong, cryptographic-quality passwords (and perhaps even that is not enough).

    Here is an example from one of my projects. We are building an e-commerce site which will allow users to order products, entering credit card information for payment. Users may upload graphics to use in the printed product. We have chosen not to allow the user to retrieve credit card data. They may view and edit their uploaded logos.

    For SessionIDs and CustomerIDs, we use truly random numbers. Because we do not store intensely sensitive data, we do not need to enforce strict, cryptographic-quality passwords. A Customer's work is important, so we use "real" random numbers to protect the Session. Images (uploaded logos) use auto-incrementing IDs, since they will be hidden behind the CustomerID (and/or SessionID). Customers' logos are their property, so we protect them with the random Customer key, but because logos are (presumably) already publicly available, we do not need the highest level of security for them. When ordering, we transmit credit-card information over SSL and protect the card info appropriately (e.g. NEVER send it via e-mail!), and do not allow a user to see that information again (so we do not have to inflict a random password and other sufficiently paranoid measures upon the hapless visitor). Order confirmation, which uses no sensitive data at all, happens with security-free (what other kind is there?) e-mail.
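    As a rough sketch of how a SessionID of this kind might be produced (not necessarily how the project above does it), the following reads 16 bytes from the kernel's random device and hex-encodes them. The helper name new_session_id, the 16-byte default, and the use of /dev/urandom on a UNIX-like system are all assumptions for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return $bytes random bytes from the kernel as hex.
# Assumes a UNIX-like system that provides /dev/urandom.
sub new_session_id {
    my ($bytes) = @_;
    $bytes = 16 unless defined $bytes;
    open(my $fh, '<', '/dev/urandom') or die "Cannot open /dev/urandom: $!";
    binmode($fh);
    my $buf = '';
    # read() may return fewer bytes than requested, so loop until full.
    while (length($buf) < $bytes) {
        my $got = read($fh, $buf, $bytes - length($buf), length($buf));
        die "Read from /dev/urandom failed: $!" unless $got;
    }
    close($fh);
    return unpack('H*', $buf);    # two hex digits per byte
}

print new_session_id(), "\n";
```

    With the default of 16 bytes the result is a 32-character hex string, long enough that guessing a neighbouring ID is not a practical attack.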

    Security is an ever-present concern in e-commerce. The heart of data security is cryptography. The heart of cryptography is random number generation. The weaker the random numbers, the weaker the cryptography; therefore, the weaker the security. Random numbers having anything to do with security must be of the highest possible quality. The advice you were given to avoid rand() in CGI is a direct reflection of this security mindset. If you need a random number to keep people out of places where they do not belong, you need the best random number you can get. rand() is not it.


Re: Random number
by httptech (Chaplain) on Jul 09, 2000 at 17:02 UTC
    It depends on how much randomness you really need. If you needed REALLY random numbers, I suppose you could use Math::TrulyRandom but that's probably overkill for anything short of cryptography. What do you want to do with the random number exactly? You may be better off (in the case of session IDs for example) using a formula like (time) + (a few pseudo-random bytes) or using the Time::HiRes module rather than a random number in order to guarantee uniqueness. Using the time as part of the ID helps narrow the window of opportunity for a duplicate number to occur.
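    A minimal sketch of the (time) + (a few pseudo-random bytes) formula, using Time::HiRes for microsecond resolution. The helper name time_based_id and the exact layout of the ID are made up for the example. Note that this aims at uniqueness, not secrecy: the suffix comes from rand() and is guessable.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper: microsecond timestamp plus four pseudo-random bytes.
sub time_based_id {
    my ($sec, $usec) = gettimeofday();
    my $suffix = join '', map { sprintf '%02x', int(rand(256)) } 1 .. 4;
    return sprintf '%d%06d-%s', $sec, $usec, $suffix;
}

print time_based_id(), "\n";
```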
RE: Random number
by BBQ (Deacon) on Jul 09, 2000 at 19:42 UTC
    Well Moshambo, there are those that argue that you shouldn't use randomness at all! I guess it just depends on what the purpose of the randomness is. For a random quote, or image, or anything else that is a bit more trivial, I see no problem at all in generating random results for display through rand. But, if you are trying to keep track of data, or generating unique strings to insert into a database, I would never use randomness. Regardless of the language used to generate it.

    Just my R$ 0.02 worth.

    # Trust no1!
      But, if you are trying to keep track of data, or generating unique strings to insert into a database, I would never use randomness. Regardless of the language used to generate it.

      Could you please elaborate as to why not and what alternatives there are? Thanks. TTFN.


      p.s. A Tale of Soul and Sword Eternally Re-told!

        As a matter of fact, I can, but I should warn you this can turn into an essay very quickly. :)

        1. Uniqueness of data
          There are two good reasons why one should never rely on randomness to keep track of data: luck and track.

          Let's take luck first: just think of the lottery. If you play games of fortune, you count on good fortune to keep you on the positive, winning, et al. When you rely on luck to generate unique strings or cryptic information, you rely on the same luck, only directed in the opposite direction.

          It would be like playing the slots in Las Vegas hoping that you never get a triple seven, or consecutive bells, or whatever it is that slots reward you with. You are counting on being rewarded with the lack of matches. If your application deals with sensitive data, luck should never be a factor to consider. After all, there's as much good luck as there is bad luck in this world (only Murphy would find an algorithm to prove that it can be worse).
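          To put a number on the luck involved, here is a small sketch using the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)), for n IDs drawn uniformly from N possible values. The helper name and the sample sizes are chosen only for illustration.

```perl
use strict;
use warnings;

# Approximate probability that at least two of $n IDs collide when each is
# drawn uniformly from $space possible values (birthday approximation).
sub collision_probability {
    my ($n, $space) = @_;
    my $x = $n * ($n - 1) / (2 * $space);
    # For tiny $x, 1 - exp(-$x) rounds to 0 in floating point; $x itself
    # is the better approximation there.
    return $x < 1e-8 ? $x : 1 - exp(-$x);
}

for my $bits (16, 32, 64, 128) {
    printf "%3d-bit IDs, 100000 records: p = %.3g\n",
        $bits, collision_probability(100_000, 2**$bits);
}
```

          Whether the resulting odds are acceptable depends on the size of the ID space and on what a collision would cost the application.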

          The second reason, track, is actually more obvious. It's just very hard to keep track of something if you are being random about it! It would be like counting cars, except that instead of numbering them, you (off the top of my head) interview people on the street and ask them what their favorite TV show is. You would come out with results like:
          South Park          VW Bug
          Pinky & The Brain   Porsche 911
          X-Files             72' Land Rover
          The Daily Show      Cowboy Neal

          I've actually heard of people that use this sort of technique to memorize data for long periods of time, but for storing them in a database, it really doesn't seem to be very effective. (And on a side note, that isn't being random either.) If you have hundreds of thousands of records, I bet you'll start getting duplicate favorite shows, and even if you didn't, it would be hard as hell to tell what the car you had counted was in the 1st place.

        2. Track of Data
          I have (in contrast) two methods for keeping track of my data. Neither is the best there is, but both have been useful nonetheless. The first, and I believe most used, method is auto-incrementing an ID field: define it with a unique constraint in the database, and then auto-increment it as you add more info. This is pretty obvious, but just for the sake of it, let's say:
          1   VW Bug
          2   Porsche 911
          3   72' Land Rover
          4   Cowboy Neal

          The second method, which I use most frequently, is a combination of time and process ID. Together they give me unique data and two bits of information that are much more useful than the order in which records were entered into the database. Consider that the string being generated is "$^T-$$". Every time we generate a new record, we have the epoch time at which the script started ($^T) and the current PID ($$) of when that record was created. Even if you have multiple records coming into the database in the same second, they can't be coming from the same process ID, and therefore must be unique. (I have yet to see a machine that can spawn that many processes per second.) And for example's sake, our table would look something like this (under my Win NT box):

          963198426-505   VW Bug
          963198679-503   Porsche 911
          963198688-505   72' Land Rover
          963198703-500   Cowboy Neal
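          A minimal sketch of the time-plus-PID scheme. The per-process counter is an addition that is not in the post: it keeps IDs distinct when one process creates several records. That matters in persistent environments such as mod_perl, where $^T and $$ stay the same across requests. The helper name next_record_id is made up.

```perl
use strict;
use warnings;

# $^T is the time the script started (epoch seconds); $$ is the process ID.
# The counter (an assumption added for this sketch) separates records
# created by the same process.
my $counter = 0;

sub next_record_id {
    return sprintf '%d-%d-%d', $^T, $$, $counter++;
}

print next_record_id(), "\n" for 1 .. 3;
```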

        3. Conclusion
          If I had to sum it up, I'd just say, "Don't let fate take over your application. Fate can be good, but if there is one thing that you can count on, it's that Murphy will make it bad." Or, as my father (a math freak) puts it, "Nothing is truly random, and there is no such thing as a perfect circle".

        # Trust no1!

RE: Random number
by Anonymous Monk on Jul 10, 2000 at 01:20 UTC
    my @Chars = ( "A" .. "Z", "a" .. "z", 0 .. 9 );
    my $RandString = join("", @Chars[ map { rand @Chars } ( 1 .. 30 ) ]);
    Here's how I generate random strings. You can just have 0 .. 9 in @Chars if you want an integer. And change 1 .. 30 to loop however many times you like; this example creates a random string that is 30 chars long.


      my $RandString = join("", @Chars[ map { rand @Chars } ( 1 .. 30 ) ]);

      I'm wondering if I may inflict a question upon you--details about
      that statement. I understand the purposes of "components" within it,
      and I can certainly see the results when I run the script. But just
      how it's generating those results is not clear to me. Could I talk you
      into explaining a bit about how it works?


        It's not too tricky. The rand function imposes scalar context on @Chars, so it sees the number of elements in the array and returns a random (fractional) number from 0 up to, but not including, that count.

        The map function calls rand once for each of the 30 values in 1 .. 30, creating a 30-element list of those random numbers.

        That list is used as the indexes of an array slice; each fractional index is truncated to an integer. (That means that the second argument to join is a random list of 30 characters from the @Chars array, possibly with repeats.) They're joined into a string.
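        The same three steps can be spelled out one statement at a time. This is only an unrolled version of the one-liner above; the sub name rand_string and the intermediate variable names are made up, and int() is added to make the index truncation explicit.

```perl
use strict;
use warnings;

my @Chars = ("A" .. "Z", "a" .. "z", 0 .. 9);

sub rand_string {
    my ($length) = @_;
    # 1. rand evaluates @Chars in scalar context (62 elements here) and
    #    returns a fractional number from 0 up to, but not including, 62.
    my @indexes = map { int(rand @Chars) } 1 .. $length;
    # 2. An array slice picks the characters at those indexes.
    my @picked = @Chars[@indexes];
    # 3. join glues the characters into a single string.
    return join("", @picked);
}

print rand_string(30), "\n";
```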

RE: Random number
by gronkulator (Sexton) on Jul 10, 2000 at 17:36 UTC
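    Code to generate an unsigned random 16 bit integer:
    my $randbits = "";
    # /dev/urandom is assumed to exist (most UNIX-like systems)
    open(my $urandom, "<", "/dev/urandom") or die "Phooey: $!";
    binmode($urandom);
    # read() returns the number of bytes actually read
    read($urandom, $randbits, 2) == 2 or die "Short read: $!";
    close($urandom);
    my $rand = unpack("S", $randbits);    # unsigned 16-bit, native byte order
    printf("random: %d\n", $rand);
    Divide and truncate, e.g. int($rand / 65536 * $n), to produce a random integer in the range 0 .. $n-1.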
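    If an even spread over the range matters, plain division or modulo slightly favors some results whenever the range does not divide 65536 evenly. One common way around that is rejection sampling, sketched below. The helper name urandom_below is made up, and the code assumes /dev/urandom is available.

```perl
use strict;
use warnings;

# Hypothetical helper: uniform integer in 0 .. $n-1 (1 <= $n <= 65536),
# built from 16-bit values read from /dev/urandom.
sub urandom_below {
    my ($n) = @_;
    die "n must be between 1 and 65536" if $n < 1 || $n > 65536;
    # Largest multiple of $n that fits in 16 bits; values at or above it
    # are rejected so that every result is equally likely.
    my $limit = 65536 - (65536 % $n);
    open(my $fh, '<', '/dev/urandom') or die "Cannot open /dev/urandom: $!";
    binmode($fh);
    while (1) {
        my $got = read($fh, my $buf, 2);
        die "Short read from /dev/urandom" unless defined $got && $got == 2;
        my $value = unpack('S', $buf);
        if ($value < $limit) {
            close($fh);
            return $value % $n;
        }
    }
}

print urandom_below(6) + 1, "\n";    # a die roll, 1 .. 6
```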
