MD5-based Unique Session ID Generator

Re: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 14:21 UTC
I would think hostname is a pretty hefty operation for generating a session ID; I'm not sure, but I think it does a DNS lookup. "The ID is based on hostname, time, and some pseudo-random data. I've run a test with this to generate 50,000 IDs as fast as possible and check for collisions -- I didn't get any." I use this for session IDs (which I took from one of the Apache::Session modules): `use Digest::MD5 qw(md5_hex); $session_id = substr(md5_hex(md5_hex(time() . {} . rand() . $$)), 0, 32);` I ran it within the same process over 100,000 times with no collisions. "This is sort of slow, but strong. Reducing the param for rand() will speed things, but make collisions more likely." I am no crypto expert, but from what I know, it's not really any stronger than if you didn't do it this way. Using MD5 and different text each time, it is highly unlikely that you will find a collision; that is just the nature of MD5 and hashing algorithms in general. -stvn
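The collision test described in this post is easy to reproduce. What follows is only a minimal sketch of such a test, not stvn's actual harness: the sub names `new_session_id` and `count_collisions` are made up for illustration, and the run stays inside a single process, as in the post.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build one ID the way the post does: concatenate the time, the string form
# of a fresh anonymous hash, a random number and the process ID, then hash twice.
sub new_session_id {
    return substr(md5_hex(md5_hex(time() . {} . rand() . $$)), 0, 32);
}

# Generate $count IDs and return how many of them had been seen before.
sub count_collisions {
    my ($count) = @_;
    my %seen;
    my $collisions = 0;
    for (1 .. $count) {
        $collisions++ if $seen{ new_session_id() }++;
    }
    return $collisions;
}

printf "%d collisions in %d IDs\n", count_collisions(100_000), 100_000;
```

A clean run only says that the inputs happened to differ on this machine with this `rand()`. It says nothing about many processes or a weaker PRNG.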
Re^2: MD5-based Unique Session ID Generator by pelagic (Priest) on Aug 19, 2004 at 15:09 UTC
Just the middle part of your expression, `(time() . {} . rand() . $$)`, helps make the session IDs unique. pelagic
Re^3: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 15:29 UTC
Very true, but the double `md5_hex()` doesn't hurt (as far as I know). As I said, I am no crypto expert, and my knowledge of these things is limited. But I would think that hashing a reasonably unique string to produce a pretty darn close to unique string, and then hashing it again to get (what I would assume is) an even closer to truly unique string is a good thing when generating session IDs. Please, though, if I am wrong and the double hash provides no benefit, let me know why, as I would be interested in knowing. -stvn
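On the question of whether the second hash helps with uniqueness: MD5 is deterministic, so two identical inputs stay identical however many times they are hashed. A small sketch follows; the input values are hypothetical stand-ins for a time, a random number and a PID that happened to repeat.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Two requests that happen to build exactly the same input string
# (hypothetical values for time, rand() and $$).
my $input_a = "1092925260" . "0.5" . "1234";
my $input_b = "1092925260" . "0.5" . "1234";

my $single_a = md5_hex($input_a);
my $single_b = md5_hex($input_b);

# Hashing a second time cannot separate values that are already equal.
my $double_a = md5_hex($single_a);
my $double_b = md5_hex($single_b);

print "single hash collides: ", ($single_a eq $single_b ? "yes" : "no"), "\n";
print "double hash collides: ", ($double_a eq $double_b ? "yes" : "no"), "\n";
```

Both lines report "yes". The second pass does no harm, but any collision in the input string survives it unchanged.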
Re^4: MD5-based Unique Session ID Generator by ctilmes (Vicar) on Aug 19, 2004 at 20:01 UTC
Re^5: MD5-based Unique Session ID Generator by ctilmes (Vicar) on Aug 20, 2004 at 12:12 UTC
Re^4: MD5-based Unique Session ID Generator by radiantmatrix (Parson) on Aug 19, 2004 at 20:37 UTC
Re^4: MD5-based Unique Session ID Generator by pelagic (Priest) on Aug 19, 2004 at 20:56 UTC
Re^5: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 22:11 UTC
Re^2: MD5-based Unique Session ID Generator by radiantmatrix (Parson) on Aug 19, 2004 at 20:31 UTC
I hadn't thought of doubling the md5_hex operations -- nice tip, thank you. "I am no crypto expert, but from what I know, it's not really any stronger than if you didn't do it this way. Using MD5 and different text each time, it is highly unlikely that you will find a collision; that is just the nature of MD5 and hashing algorithms in general." It's not MD5 use that causes issues -- it's the random data that one is hashing. If the text is always different, great -- but on systems with poor PRNGs (Win2k springs to mind), I have gotten repeated session IDs because the PRNG output wasn't random enough: MD5 the same text twice, and you get the same digest each time. With the same algo above, except s/2345678/2345/, I had 11 collisions in 20,000 generated sessions. Not Good™. Again, though, I will have to try your much faster (and shorter) method and see if I get good results with a poor PRNG -- thanks!
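The effect of a narrow random range can be seen in isolation. This sketch is not the original snippet's algorithm. It assumes, for illustration only, that a random integer below `$range` is the sole varying input, so `distinct_ids` and the fixed prefix are invented names.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Count how many distinct IDs come out when the only varying input is a
# random integer below $range (a stand-in for a weak or narrow PRNG).
sub distinct_ids {
    my ($count, $range) = @_;
    my %seen;
    for (1 .. $count) {
        my $id = md5_hex("fixed-prefix:" . int(rand($range)));
        $seen{$id} = 1;
    }
    return scalar keys %seen;
}

for my $range (2345, 2_345_678) {
    printf "rand(%d): %d distinct IDs out of %d\n",
        $range, distinct_ids(20_000, $range), 20_000;
}
```

With a range of 2345 there can never be more than 2345 distinct IDs, whatever MD5 does afterwards. The real snippet mixed in other data, which is presumably why it saw only a handful of collisions rather than thousands.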
Re^3: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 22:19 UTC
"Again, though, I will have to try your much faster (and shorter) method and see if I get good results with a poor PRNG -- thanks!" Just FYI, see my reply/discussion above with pelagic regarding the use of the added "{}". This bit of it may or may not provide any benefit. -stvn
Re: MD5-based Unique Session ID Generator by dragonchild (Archbishop) on Aug 19, 2004 at 14:00 UTC
How much stronger is this than `md5_hex( time, $$, time )` where $$ is spread over 150 Apache child processes? ------ We are the carpenters and bricklayers of the Information Age. Then there are Damian modules.... sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon.* - flyingmoose I shouldn't have to say this, but any code, unless otherwise stated, is untested
Re^2: MD5-based Unique Session ID Generator by radiantmatrix (Parson) on Aug 19, 2004 at 20:24 UTC
I haven't tested that case, but it really doesn't apply to what this is used for. Please see my update note in the description...
Re: MD5-based Unique Session ID Generator by simonm (Vicar) on Aug 19, 2004 at 16:20 UTC
If you don't mind using 128 bits rather than 32, Data::UUID guarantees that you won't get duplicates, ever. `use Data::UUID; use constant IDGenerator => Data::UUID->new(); sub new_sid { IDGenerator->create() }` Update: Duh, they're both the same size, 128 bits and 32 hex digits.
Re^2: MD5-based Unique Session ID Generator by hardburn (Abbot) on Aug 19, 2004 at 20:50 UTC
MD5 is 128 bits. It's 32 hex digits. I do prefer Data::UUID for this task, myself. Note that while it guarantees uniqueness, it doesn't guarantee unpredictability, which may or may not be a problem for a given application. Most MD5/SHA1/whatever session generators have the same issue. "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.
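Where unpredictability does matter, one common approach is to take the ID straight from the operating system's random source instead of hashing guessable values. Nobody in the thread proposed this, so treat it as a sketch of a different technique. It assumes a Unix-like system that provides /dev/urandom, and `random_session_id` is a made-up name.

```perl
use strict;
use warnings;

# Read 16 bytes (128 bits) from the kernel's random source and hex-encode
# them. Assumes a Unix-like system that provides /dev/urandom.
sub random_session_id {
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $got = read $fh, my $bytes, 16;
    close $fh;
    die "short read from /dev/urandom" unless defined $got && $got == 16;
    return unpack 'H*', $bytes;
}

print random_session_id(), "\n";
```

The result has the same shape as an MD5 hex digest (32 hex digits), so it should fit wherever the hashed IDs were stored.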
Re^2: MD5-based Unique Session ID Generator by radiantmatrix (Parson) on Aug 19, 2004 at 20:46 UTC
"If you don't mind using 128 bits rather than 32, Data::UUID guarantees that you won't get duplicates, ever." Thanks! I won't be able to test this immediately, but if it works (and it seems like it will), it will be most helpful. One point, though: MD5 generates 32 hex digits representing 4 bits each -- that's already 128 bits. Sorry if I was unclear about that.
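The size arithmetic is quick to confirm with Digest::MD5 itself, which exports both the hex form (`md5_hex`) and the raw binary form (`md5`) of a digest. The input string here is arbitrary.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex);

my $hex = md5_hex("example input");   # hex form of the digest
my $raw = md5("example input");       # raw binary form of the same digest

# Each hex digit carries 4 bits; each raw byte carries 8.
printf "hex digits: %d, bits: %d\n", length($hex), length($hex) * 4;
printf "raw bytes:  %d, bits: %d\n", length($raw), length($raw) * 8;
```

Both lines come out at 128 bits: 32 hex digits at 4 bits each, or 16 raw bytes at 8 bits each.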
Re^2: MD5-based Unique Session ID Generator by adrianh (Chancellor) on Aug 21, 2004 at 12:35 UTC
"Data::UUID guarantees that you won't get duplicates, ever." While Data::UUID is a good solution, it doesn't guarantee that "you won't get duplicates, ever" (heck - there are only 128 bits after all :-) As the docs say: "A UUID is 128 bits long, and is guaranteed to be different from all other UUIDs/GUIDs generated until 3400 CE. ... It provides reasonably efficient and reliable framework for generating UUIDs and supports fairly high allocation rates -- 10 million per second per machine -- and therefore is suitable for identifying both extremely short-lived and very persistent objects on a given system as well as across the network." So, it wouldn't be suitable if you were coding something up for The Long Foundation - or needed to allocate UUIDs really, really, really quickly :-) The full gory detail can be found in this IETF draft. All this complexity is, of course, why I like Data::UUID. People who are experts have taken the time to look hard at the algorithm, and I can have some confidence in it working well.
Re^3: MD5-based Unique Session ID Generator by simonm (Vicar) on Aug 21, 2004 at 16:22 UTC
While Data::UUID is a good solution, it doesn't guarantee that "you won't get duplicates, ever" (heck - there are only 128bits after all :-) ... A UUID is 128 bits long, and is guaranteed to be different from all other UUIDs/GUIDs generated until 3400 CE. I didn't say "it will never produce duplicates" -- just that "YOU won't get duplicates" (unless you live for over a thousand years).
Re^4: MD5-based Unique Session ID Generator by adrianh (Chancellor) on Aug 21, 2004 at 20:54 UTC
Re: MD5-based Unique Session ID Generator by guha (Priest) on Aug 19, 2004 at 14:22 UTC
I'm definitely not an expert on crypto and related issues, but the loop looks suspicious to me. Do you realize that you push anything between zero and 2 MB through the MD5 routine? No wonder it sometimes (I guess) takes time to generate a key.
Re^2: MD5-based Unique Session ID Generator by radiantmatrix (Parson) on Aug 19, 2004 at 20:41 UTC
"Do you realize that you push anything between zero and 2 MB through the MD5 routine? No wonder it sometimes (I guess) takes time to generate a key." Considering that, it's actually remarkably speedy. If I'm just generating one key at a time, it is basically instantaneous. 50,000 keys took about 5 minutes (not bad, all things considered). In the application I have (see Update, please), I'm not generating more than 50 keys in a given 1s interval, but they absolutely must not duplicate. Still, I will be exploring some of the tips in this thread regarding faster ways to accomplish the same thing; hopefully I will remember to update my snippet when I get around to testing them!
Re: MD5-based Unique Session ID Generator by pelagic (Priest) on Aug 19, 2004 at 14:12 UTC
Why do you use `time` 2 times in your list? It will be the same both times. pelagic
Re^2: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 14:29 UTC
"Why do you use time 2 times in your list? It will be the same both times." I assume you are referring to dragonchild's code, since the OP doesn't have `time` in there twice. It will not matter if the time is the same; the idea is to generate a (sorta) unique string, and it will do that. Once put through `md5_hex`, it won't much matter after that. MD5 will give you the true uniqueness; all you really need is a bit of entropy to get it started. -stvn
Re^3: MD5-based Unique Session ID Generator by pelagic (Priest) on Aug 19, 2004 at 14:59 UTC
Adding "time" a second time does not make the string any more unique than using "time" just once. It makes the theoretical entropy higher, but that's not the target here, as we are not defending against hackers. We just want to avoid collisions. The uniqueness of the IDs must be achieved before feeding them through MD5. pelagic
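One way to make the input unique before hashing, as this post asks for, is to add a per-process counter next to the PID and the time. This is a sketch under assumptions and not code from the thread: `unique_input` and `new_session_id` are invented names, it covers a single machine only (a hostname would be needed across several), and it ignores the corner case of a PID being reused within the same second.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $counter = 0;   # per-process sequence number, never reused

# PID + time + counter differs on every call within one process, and the
# PID keeps concurrently running processes apart.
sub unique_input {
    return join ':', $$, time(), $counter++;
}

sub new_session_id {
    return md5_hex(unique_input());
}

# Same kind of check as in the thread: generate a batch and count repeats.
my %seen;
my $duplicates = 0;
for (1 .. 50_000) {
    $duplicates++ if $seen{ new_session_id() }++;
}
print "duplicates: $duplicates\n";
```

Because every input string is distinct by construction, `rand()` is no longer needed for uniqueness. The IDs are guessable, though, which only matters if unpredictability is a requirement.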
Re^4: MD5-based Unique Session ID Generator by stvn (Monsignor) on Aug 19, 2004 at 15:15 UTC
Re^5: MD5-based Unique Session ID Generator by MidLifeXis (Monsignor) on Aug 19, 2004 at 17:40 UTC
Re^3: MD5-based Unique Session ID Generator by Anonymous Monk on Jun 07, 2005 at 20:19 UTC
If we're talking about getting entropy, why don't we go with a better entropy source than the minor disparity between the two calls to time, which at MOST will vary by one digit and is not very entropic? Why don't you just call hotbits and grab some radioactive decay data in hex format, break it apart and loop over it to give us some real entropy? That WILL decidedly minimize the chance of collisions. Since you're already acting on data returned by Sys::Hostname, this should be right up the alley of what you're doing.


	PerlMonks