![]() |
Random Seeding
If you only use a 32 bit seed for a random number generator,
does that mean you can only ever produce a maximum of 2^32 (approx 4 billion) different sequences? What about the Mersenne Twister, with it's massive period of 2^19937-1. Will you only ever have access to a tiny portion of this ring of numbers, if you only use a 32-bit seed? Will this set of sequences be confined to the beginning of the period, ie, your sequence will have to start at some place between index number 1 and 4 billion of the period (ignores all the possible data after this)?. Or have I misunderstood the concept of a seed? (I am using the original c code from... http://www.math.sci.hiroshima-u.ac.j...at/MT/emt.html) |
Re: Random Seeding
I would suggest that your possibly get 2^32 sequences with 2^19937-1
number of elements in each. |
Re: Random Seeding
> (ignores all the possible data after this)?.
this should have read: (ignores all the possible "starting points" after this)?. >I would suggest that your possibly get 2^32 sequences with 2^19937-1 >number of elements in each. That's kind of what I thought. It seems fairly limited. However, I just realised the Mersenne Twister c code provided by it's inventors allows for longer seeds. "Those who need initial seed with more than 32-bit length may use init_by_array() for initialization which admits an array of arbitrary length as seeds" I guess you could use an array seeded with the time, the process ID, and some compressed info from /dev/random or /dev/audio. Still I imagine it would still be very difficult to guarantee starting points distibuted uniformly around the ring of numbers? |
Re: Random Seeding
>If you only use a 32 bit seed for a random number generator,
>does that mean you can only ever produce a maximum of >2^32 (approx 4 billion) different sequences? Yes, assuming you always start a sequence with the first number generated after seeding. It may be possible to use a "secondary seed" by getting and throwing away N pseudo-random numbers before using one, which, depending on the generator, may or may not improve anything. >What about the Mersenne Twister, with it's massive period >of 2^19937-1. Will you only ever have access to a tiny >portion of this ring of numbers, if you only use a 32-bit seed? It is quite possible to cripple a good pseudo-random number generator with a poor seed. >Will this set of sequences be confined to the beginning of the >period, ie, your sequence will have to start at some place >between index number 1 and 4 billion of the period (ignores >all the possible data after this)?. I don't think there's any guarantee of that, and it would depend on the generator. Gordon L. Burditt |
Re: Random Seeding
In article <1147697987.020547.116350@i39g2000cwa.googlegroups .com> porterboy76@yahoo.com writes:
> If you only use a 32 bit seed for a random number generator, > does that mean you can only ever produce a maximum of > 2^32 (approx 4 billion) different sequences? Depends on the generator. With some you may generate entirely different sequences, with others you just generate different starting points in a single sequence. > What about the Mersenne Twister, with it's massive period > of 2^19937-1. Will you only ever have access to a tiny > portion of this ring of numbers, if you only use a 32-bit seed? No, you will ultimately get the whole range. If I understand it well, MT uses a linear congruence method, and they inherently have only a single loop as result. > Will this set of sequences be confined to the beginning of the > period, ie, your sequence will have to start at some place > between index number 1 and 4 billion of the period (ignores > all the possible data after this)?. Not at all. In the Mersenne Twister you have a state of 19937 bits. The state loops around all those different states, but when the state has the appropriate 19905 bits equal to 0 (representing the 32 bit starting points) is not clear. So you do indeed jump in a different position of the generator, but those positions are not regularly spaced around the loop. -- dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131 home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/ |
Re: Random Seeding
porterboy76@yahoo.com writes:
>> (ignores all the possible data after this)?. > > this should have read: > > (ignores all the possible "starting points" after this)?. > >>I would suggest that your possibly get 2^32 sequences with 2^19937-1 >>number of elements in each. > > That's kind of what I thought. It seems fairly limited. > However, I just realised the Mersenne Twister c code > provided by it's inventors allows for longer seeds. > > "Those who need initial seed with more than 32-bit > length may use init_by_array() for initialization which > admits an array of arbitrary length as seeds" > > I guess you could use an array seeded with the time, > the process ID, and some compressed info from > /dev/random or /dev/audio. Still I imagine it would > still be very difficult to guarantee starting points > distibuted uniformly around the ring of numbers? <OT> If you have /dev/random, why would you bother with the time, the process ID, or /dev/audio? For that matter, if you have /dev/random, it's not entirely clear you need to bother with a Mersenne Twister (but the question is beyond my expertise). </OT> -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst> We must do something. This is something. Therefore, we must do this. |
Re: Random Seeding
Well, that kind of leads into the next question I
was going to ask. How can I make the seeding work on all platforms? I have /dev/random on my linux box, but not on Windows (and I havent checked if its on my MAC, which is BSD based). Does Windows have an Entropy Pool like /dev/random that can be accessed by C. (OK, I guess that's slightly off topic). More on topic, do I have to use #ifdef statements to provide for entropy pools on different platforms? I was just going to use the functions provided by #include <time.h>, but I'm not happy that it gives me a wide enough range of seeds. (BTW, /dev/random may or may not be "random enough" compared to the MT, but it more than likely is not fast enough, since the disk must be accessed). |
Re: Random Seeding
Keith Thompson wrote On 05/15/06 16:13,: > porterboy76@yahoo.com writes: > >>>(ignores all the possible data after this)?. >> >>this should have read: >> >>(ignores all the possible "starting points" after this)?. >> >> >>>I would suggest that your possibly get 2^32 sequences with 2^19937-1 >>>number of elements in each. >> >>That's kind of what I thought. It seems fairly limited. >>However, I just realised the Mersenne Twister c code >>provided by it's inventors allows for longer seeds. >> >>"Those who need initial seed with more than 32-bit >>length may use init_by_array() for initialization which >>admits an array of arbitrary length as seeds" >> >>I guess you could use an array seeded with the time, >>the process ID, and some compressed info from >>/dev/random or /dev/audio. Still I imagine it would >>still be very difficult to guarantee starting points >>distibuted uniformly around the ring of numbers? > > > <OT> > If you have /dev/random, why would you bother with the time, the > process ID, or /dev/audio? > > For that matter, if you have /dev/random, it's not entirely clear you > need to bother with a Mersenne Twister (but the question is beyond my > expertise). > </OT> <OT> A reproducible seed (and a subsequent reproducible sequence) can be of value. Consider testing a module by throwing pseudo-random data at it; when a failure turns up you'd like to be able to regenerate the exact same data as part of verifying your fix. For the second matter, data rate is generally the issue. It takes a significant amount of time for a physical source to generate random data and pass it through assorted bias- removing filters. If you try to read a zillion random numbers from such a source, you may need to wait quite a while. It's more common to use /dev/random or whatever to concoct a "truly random" seed for a deterministic pseudo-random generator that can generate subsequent numbers at computer speeds. For even less predictability, the "truly random" source can also be used to perturb the deterministic generator now and then. Oddly enough, D.H. Lehmer used something very like this perturbation technique in his original (and oft-criticized) linear congruential generator. According to Knuth: It is not fair to blame Lehmer for using a "bad" random-number generator in 1948, since his actual use of the generator was quite valid. The ENIAC computer was a highly parallel machine, programmed by means of a plugboard; Lehmer set it up so that one of its accumulators was repeatedly multiplying its own contents by 23, mod 10^8+1, yielding a new value every few microseconds. Since this multiplier 23 is too small, we know that each value obtained by this process was too strongly related to the preceding value to be considered sufficiently random; but the durations of time between actual uses of the values in the special accumulator by the accompanying program were comparatively long and subject to some fluctuation. So the effective multiplier was 23^k for large, varying values of k! -- "The Art of Computer Programming, Volume II: Seminumerical Algorithms," section 3.3.1 Similar techniques can be found nowadays in games, where it is common to generate and discard a pseudo-random value each time the program polls its input devices; the eventual sequence used in the main part of the program thus depends in part on the delays in keystroke, mouse, or joystick timings, generally considered unpredictable if not downright twitchy. </OT> -- Eric.Sosman@sun.com |
Re: Random Seeding
In article <1147727418.413356.213640@j73g2000cwa.googlegroups .com> porterboy76@yahoo.com writes:
> (BTW, /dev/random may or may not be "random enough" > compared to the MT, but it more than likely is not fast > enough, since the disk must be accessed). No, /dev/random does *not* imply disk access. It is a pseudo device, just like /dev/zero. The random numbers are generated within the kernel, but I can find no place where it is stated how. It appears that most systems that implement it use statistics from kernel interrupts or whatever to get the information. In general it is cryptographically secure (which MT is not), but there is no way to ascertain whether it complies to standard randomness tests. So regarding your earlier question: > Does Windows have an Entropy Pool like > /dev/random that can be accessed by C. /dev/random does not come from an "entropy pool", it just generates random numbers within the kernel. When you need various subsequent random numbers /dev/random is a very bad idea. It appears to be good to generate just a single random number. When you need more, you can use the output as a seed for a standard random number generator (e.g. MT). When you are so bothered about the randomness of your numbers, you really should study the way the generators work. When you do that you will also appreciate that there are no generators that cover all possibilities. -- dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131 home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/ |
Re: Random Seeding
> No, /dev/random does *not* imply disk access. It is a pseudo device,
> just like /dev/zero. Thanks, I didn't know that.... > When you need various subsequent random numbers /dev/random is a > very bad idea. It appears to be good to generate just a single > random number. When you need more, you can use the output as a > seed for a standard random number generator (e.g. MT). ....whereas I am well aware of this. I will certainly be sticking to the Mersenne Twister for random sequence generation. My itch is random seeding, and I want to scratch it in a safe way. With a 32 bit seed, I will only ever generate 4 billion different sequences, which I think emasculates the MT. By allowing seeding by an arbitrary array of 32-bit numbers in their c-code, the inventors have overcome this limitation, but I still need to generate this array of 32-bit numbers in a way that is unpredictable, and which works on all platforms. <time.h> tools certainly work on all platforms of interest to me (MAC, M$ & *NIX), but I am looking for one or two other orthogonal seeds, to fill my array (hence the query about /dev/random and PID). |
| All times are GMT. The time now is 11:33 PM. |
Powered by vBulletin®. Copyright ©2000 - 2013, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.