http://qs321.pair.com?node_id=291818

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I have a Perl script that uses the time() function to create a (hopefully) unique ID (not the greatest method, I realise). However, when creating IDs automatically in a loop, my program can sometimes run a little too fast and the IDs created are the same. I therefore want to add a little delay of, say, 5 seconds in between each run of the loop. Is this really bad Perl practice? How can I do this? I thought about this method, but just wanted to check with you guys.

foreach (@loop) {
    my $id = time();
    print $id;
    sleep 5;
}

Any ideas if this will work? This is of course a simplified version of my program, just as an example.

Cheers,
Tom

janitored by ybiC: Retitle from "sleeping a script", to improve future searchability

Replies are listed 'Best First'.
Re: Use time() to create unique ID
by benn (Vicar) on Sep 16, 2003 at 14:00 UTC
    There are many ways round this without having to resort to a sleep - the most obvious being Time::HiRes, which (although system-dependent) will give you a much higher resolution than time().
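A minimal sketch of the Time::HiRes suggestion, assuming microsecond resolution is available on your platform. The helper name hires_id is made up for illustration. A loop that runs fast enough can still see the same value twice, so treat this as narrowing the window rather than closing it.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# hires_id() is a hypothetical helper name, not part of Time::HiRes.
sub hires_id {
    # In list context gettimeofday() returns seconds and microseconds
    # since the epoch.
    my ($sec, $usec) = gettimeofday();
    return sprintf "%d.%06d", $sec, $usec;
}

print hires_id(), "\n" for 1 .. 3;
```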

    Another way is to add a counter variable, something like (untested)...

    my %idcount;
    my @ids;
    for (0..5) {
        my $t = time();
        push @ids, $t . "-" . ++$idcount{$t};
    }
    ...resulting in "12345678-1","12345678-2" etc.

    Cheers, Ben.

      Of all the stuff I've seen on the current thread, this idea seems easiest and most sensible. And if there were going to be multiple instances of the same script running at once, all you need to do is add the pid ($$) to the file name as well.
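A rough sketch of that combination: the per-second counter from the parent post with the pid ($$) added. The helper name next_id and the time-pid-counter layout are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my %idcount;   # per-second counter, as in the parent post

# next_id() is a hypothetical name; the "time-pid-counter" layout is one
# arbitrary choice among many.
sub next_id {
    my $t = time();
    return join "-", $t, $$, ++$idcount{$t};
}

print next_id(), "\n" for 1 .. 3;
```

Within one process the counter separates IDs made in the same second; across processes on one host the pid does.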

      Another way to avoid name collisions is to work out a unique name in a loop like this:

      ...
      my $append = "a";
      while ( -e $id[$i] ) {
          $id[$i] =~ s/(?:\.[a-z]+)*$/.$append/;
          $append++;
      }
      `touch $id[$i]`;
      ...
      Use a separate semaphore file to lock up this part, if necessary, to avoid race conditions among concurrent jobs. But this is probably overkill for the OP's task (and it takes more overhead -- without adding any real value -- relative to the simple time.pid.incrementer, which would only risk collision if concurrent processes were writing to one shared directory from different hosts, and happened to have the same pids at the same time -- wow, how unlikely is that?).
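If the race between the -e test and the touch is a worry, one alternative to a semaphore file (a different technique from the loop above, not a rewrite of it) is to let the open itself refuse to reuse a name. A sketch, assuming a local filesystem; O_EXCL may not be dependable on some network filesystems. The helper claim_name and its numeric suffix scheme are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# claim_name() is a hypothetical helper: it tries "$base", "$base.1",
# "$base.2", ... and returns the first name it managed to create,
# together with the open filehandle.
sub claim_name {
    my ($base) = @_;
    my $n = 0;
    while (1) {
        my $name = $n ? "$base.$n" : $base;
        # O_EXCL makes the open fail if the file already exists, so the
        # existence test and the creation happen as a single step.
        if (sysopen(my $fh, $name, O_WRONLY | O_CREAT | O_EXCL)) {
            return ($name, $fh);
        }
        # Any failure other than "file exists" is a real error.
        die "Cannot create $name: $!" unless $!{EEXIST};
        $n++;
    }
}
```

Calling my ($name, $fh) = claim_name("12345678"); twice in the same directory would give back two different names.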

        <voice id="Lt.Cmdr Data"> The odds are approximately 3,279,967,300,002 to 1</voice>

        <voice id="Q"> You humans have such a limited concept of space and time! You didn't allow for alternate universes where what you term "Quantum Mechanics" is the everyday norm.</voice>

        <voice id="Major S. Carter">Wouldn't we also have to consider the effects of parallel universes?</voice>

        <voice ID="Albert Einstein">And to tink zey laughed at my theoriezzzz.</voice>

        <voice ID="Stephen Hawkin">The - whole - thing - can - be - easily - visualised - in - terms - of - the - likelihood - of - two - billiard - balls - dropping - into - the - same - pocket - at - the - same - time - from - the - breakoff.</voice>

        <voice ID="Z. Beedlebrox">Now if only we could make the destination of the Infinite Probability Drive a little more predictable...</voice>


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
        If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: Use time() to create unique ID
by jasonk (Parson) on Sep 16, 2003 at 13:53 UTC

    If you really want to use time as the unique ID, you could just use Time::HiRes 'time';, which will make the time() function include fractional seconds.

    [jkohles@lifeform jkohles]$ perl
    use Time::HiRes 'time';
    print time(),"\n";
    __END__
    1063720638.23481

    We're not surrounded, we're in a target-rich environment!
Re: Use time() to create unique ID
by waswas-fng (Curate) on Sep 16, 2003 at 16:23 UTC
    You can use File::CounterFile or roll your own. A persistent counter can give you unique IDs even if your code is looping so tight that Time::HiRes gives the same value.
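A sketch of the "roll your own" option, using only core modules. This is not File::CounterFile's implementation; the helper name next_counter and the one-number-per-file format are assumptions. It relies on flock, so it is only as good as locking on the filesystem that holds the counter file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek O_RDWR O_CREAT);

# next_counter() is a hypothetical helper: it returns the next integer
# stored in $path, holding an exclusive lock while it reads and rewrites
# the file so concurrent callers never get the same number.
sub next_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)                      or die "Cannot lock $path: $!";

    my $n = <$fh>;
    $n = 0 unless defined $n && $n =~ /^\d+$/;   # new or empty file starts at 0
    $n++;

    seek($fh, 0, SEEK_SET) or die "Cannot rewind $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} $n;
    close($fh)             or die "Cannot close $path: $!";   # also drops the lock
    return $n;
}
```

The counter survives restarts because it lives in the file, which is what makes it usable where time alone repeats.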

    -Waswas
Re: Use time() to create unique ID
by adrianh (Chancellor) on Sep 16, 2003 at 16:50 UTC

      Using time or even Time::HiRes does not guarantee you uniqueness in any manner. How do you know your application will not be run more than once concurrently?

      BTW are you using human time or ctime? Don't forget about daylight saving time, clock adjustments from ntp, server moves to a different timezone, etc., etc., etc.

      Clocks change.

      Are you sure? What about when you borrow your code and reuse it on a busier site and forget about that hidden feature?

      I would have to vote for either the use of rand, or the use of Data::UUID (see link above). I never played with Data::UUID, but it looks like something to add to the list to check out.

      As cowdawg stated, the probability of a collision is low, and you need to VERIFY that the number is unique.

      BTW a name collision is generally always possible no matter what you do.

      Heck, if you are convinced that you MUST use time(), or a variant, append a random number to the "time".
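For what that last suggestion might look like, here is a minimal sketch. The helper name time_rand_id and the six-digit suffix are arbitrary choices, and rand() only lowers the odds of a clash within the same second; it does not remove them.

```perl
use strict;
use warnings;

# time_rand_id() is a hypothetical name; the six-digit random suffix is
# an arbitrary width chosen for illustration.
sub time_rand_id {
    return sprintf "%d-%06d", time(), int(rand(1_000_000));
}

print time_rand_id(), "\n";
```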

      My $1.50,

      - smellysocks

Re: Use time() to create unique ID
by blue_cowdawg (Monsignor) on Sep 16, 2003 at 15:42 UTC

    Instead of using time to generate your unique ids how about a derivative of rand()? Such as:

    use strict;
    use constant id_length => 128;

    foreach (0..3) {
        printf "%s\n", genUniqueID();
    }
    exit(0);

    sub genUniqueID {
        my @alpha = ('A'..'Z', 'a'..'z', '0'..'9');
        my $buffer = "";
        foreach (1 .. id_length) {
            $buffer = $buffer . $alpha[rand($#alpha+1)];
        }
        return $buffer;
    }
    __END__
    A sample run produces:
    --$ perl unique_id.pl
    vNJaCW91KaRqtGuVKffRY1ufjOWN8O09h8C2QL28mdNWoRfuLVBawYxWuDLC6L2q2LYPoyyiit6L7jb9OYP3ZbU4Jdf9A1pQMwOBppsEpVEg5HdCijLGlvPSMDe14ANL
    8W6voRR5r1B2zai2aUEYRfC2tXtJKoI1jU0J9gWP7hXdrMV8oQ6qTbQa3B9U6ebc99eOM8TeNacHwUuvFmakIYCWqIfrwjwE01bhxhGcfHKJOcbapt6fRWqhoalTzutb
    GDqdFLUCWe1pichxfUFdQybLLmzsFjdFC2baq7Ec12ftGp6sckbvKrbeGmdt5wj7HYuQ5BnOJQB5eGERsWiMolfHm4f7xYFf6UVfENLhyEn2CNOp55Wh1sajtq6ZOV3T
    I4V3YVHak0pKAwN0V4rLdvAXFRqz1lSCZ9LnDHdZLbPLDQrzd1dJx5iFCXqH4GrEaMgB05DzMUYSTW00y6neDrGOWVphi1xZ2PMxrDilKTJCxBkB5P8oegJCCeI43FpN
    --$

    Please note that you can modify the length of the unique key by changing the value in the use constant statement to be whatever you want. However the longer the key the more likely it is to be unique.


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
      Rand offers the high probability of scarcity in a finite dataset, but it doesn't guarantee uniqueness. And in an infinite dataset (which is, of course, only theoretical) using rand will result in an infinite number of duplications. Even in a small dataset, though highly improbable, there is no guarantee that rand won't return .984553 followed by .984553 within a few iterations. It's within the realm of possibility, even if unlikely.

      If you don't want to use the unique user id module, perhaps you could use a combination of: Process ID, Time, and an in-loop counter. If you want to rely on rand, don't call the ID "unique". Call it "rare". Just because something is improbable doesn't mean it's unique. And why settle for scarcity in a situation where you require uniqueness, when it is truly not that difficult to develop a solution that provides what is actually needed?

      Dave

      "If I had my life to do over again, I'd be a plumber." -- Albert Einstein

            Rand guarantees scarcity in a finite dataset, but it doesn't guarantee uniqueness.

        Let's not get hung up in the difference between the practical and the theoretical here. :-)

        For the purposes stated by the OP using rand() is "good enough." Also based on my own practical use using this method to generate unique session ids for web transactions I have found that it works very well.

        When using this method in my own applications I have very deliberately set up trapping logic checking to make sure that a generated session id is not already in use and if it ever happens the logic logs the incident. The log is still empty for one application I use it for and that web application was installed in August of 2001. Over two years now and no collisions. I think that works pretty darn good.
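The trapping logic described above might look roughly like this. It is a sketch, not the poster's actual code: %in_use stands in for wherever live session IDs are really stored, @collisions stands in for a log file, and the names random_id and new_session_id are made up.

```perl
use strict;
use warnings;

my %in_use;       # stand-in for the real store of live session IDs
my @collisions;   # stand-in for a real log file

# Same idea as genUniqueID() earlier in the thread, with the length
# passed in as a parameter.
sub random_id {
    my ($length) = @_;
    my @alpha = ('A'..'Z', 'a'..'z', '0'..'9');
    return join '', map { $alpha[rand @alpha] } 1 .. $length;
}

# new_session_id() is hypothetical: it retries until it finds an ID
# that is not in use and records every collision it trips over.
sub new_session_id {
    my ($length) = @_;
    while (1) {
        my $id = random_id($length);
        if ($in_use{$id}) {
            push @collisions, "collision on $id at " . time();
            next;
        }
        $in_use{$id} = 1;
        return $id;
    }
}

print new_session_id(128), "\n";
```

With the check in place, a collision costs one retry and one log line instead of a clobbered session.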

        Truly random and unique ids

        The one time I needed to generate truly random numbers for an application I wrote (I could tell you what it was but then I'd have to shoot you) :) I decided the best way to do it was taking a page from PGP and GnuPG and using system entropy. Stuff like watching the position of the system disk heads, being influenced by system interrupts (mouse, keyboard, etc.) and stuff like that.
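On many Unix-like systems the kernel already collects that kind of entropy and exposes it as /dev/urandom. A sketch of reading from it, assuming the device exists (it will die on platforms without it); the helper name entropy_id is hypothetical and this is not the application mentioned above.

```perl
use strict;
use warnings;

# entropy_id() is a hypothetical helper. It assumes a Unix-like system
# that provides /dev/urandom and dies anywhere else.
sub entropy_id {
    my ($bytes) = @_;
    open(my $fh, '<:raw', '/dev/urandom')
        or die "Cannot open /dev/urandom: $!";
    my $got = read($fh, my $buf, $bytes);
    close($fh);
    die "Short read from /dev/urandom" unless defined $got && $got == $bytes;
    return unpack 'H*', $buf;   # hex string, two characters per byte
}

print entropy_id(16), "\n";
```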

        You can make yourself nuts with the whole subject and folks a lot smarter than me have made their academic mark on the world writing papers on the subject and there is even a whole field of science dedicated to the subject. For practical purposes you have to make a decision as to what constitutes "random enough" and code accordingly. A random ID of 128 characters is probably going to be random enough for 99% of the uses out there.

        But then... we are getting way off topic here...


        Peter L. Berghold -- Unix Professional
        Peter at Berghold dot Net
           Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.

      You may be fooling yourself with your ID length of 128. On many platforms, the limiting factor is the size of the seed, which is often 32 bits. Running continuously, your code will produce duplicate keys after producing about 2^32/128 of them. (And there is no guarantee that you wouldn't produce a duplicate before that.) On subsequent runs, you have essentially the same problem. Random numbers would be bad enough. Pseudo random numbers create additional problems.
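To put a rough number on the seed-size point: assuming only 2^32 distinct seeds exist and each run picks one independently and uniformly (both assumptions), the usual birthday approximation estimates the chance that two of n runs start from the same seed and so repeat the same sequence of IDs. The function name is made up for illustration.

```perl
use strict;
use warnings;

# Birthday approximation: chance that at least two of $n independent,
# uniformly chosen values out of $space possibilities coincide,
# computed as 1 - exp(-n(n-1) / (2 * space)).
sub collision_probability {
    my ($n, $space) = @_;
    return 1 - exp(-$n * ($n - 1) / (2 * $space));
}

# Example: 10,000 runs drawing from 2**32 possible seeds.
printf "%.4f\n", collision_probability(10_000, 2**32);
```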

      -sauoq
      "My two cents aren't worth a dime.";
      

            You may be fooling yourself with your ID length of 128. On many platforms, the limiting factor is the size of the seed, which is often 32 bits.
        If you are attempting to start an argument with me, you failed. I agree with you. This is where some very serious testing needs to be done on any solution where you are attempting to produce unique keys of any sort. Especially where real "randomness" is required. Hence why elsewhere in this thread I make reference to using system entropy as they do in PGP and GnuPG and other cryptographic products.

        In 25 years of programming I have yet to see a truly random random number generator over a sufficiently large data set without the use of some external influence on the numbers being generated.

        This of course gets back to the basic premise that using time() and friends to produce the id may not be ideal even for the simplest of applications.

        However, I stand by my opinion that the degree of randomness you need is part of the design criteria you need to develop in your program specification, and depends on how critical it is to the program you are writing and the data or transactions you are trying to protect. If I am generating unique IDs for sessions dealing with a guest book application (OK... so I'm exaggerating) then I am not going to worry too much about how random the key generation is. On the other hand if I am protecting national security data where lives are on the line then I am going to look to somebody like the NSA for guidance as to what the "latest and greatest" crypto algorithm is.


        Peter L. Berghold -- Unix Professional
        Peter at Berghold dot Net
           Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
Re: Use time() to create unique ID
by sulfericacid (Deacon) on Sep 16, 2003 at 15:34 UTC
    Like the other two have said, you could always use use Time::HiRes 'time'; instead of having to sleep. I recently ran into this same problem with a CGI file uploader (it used localtime() as a unique key and allowed 4 uploads at a time; needless to say, most of the time the files got the same key and overwrote each other). sleep may not be the best choice, but I don't see anything other than time as a setback. Just keep in mind you're sleeping after each pass through the loop, and a sleep 5; will surely take forever if you have a 10,000 item array!

    It's a safe method to ensure unique IDs; I still use this over use Time::HiRes 'time';.

    Good luck.

    "Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

    sulfericacid