Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > KISS4691, a potentially top-ranked RNG.

Reply
Thread Tools

KISS4691, a potentially top-ranked RNG.

 
 
Gib Bogle
Guest
Posts: n/a
 
      07-26-2010
Geoff wrote:

> I am no FORTRAN hacker but I think there's a difference between
> sign(x) and sign(1,x).


sign(1,x) returns the sign of x.
 
Reply With Quote
 
 
 
 
nmm1@cam.ac.uk
Guest
Posts: n/a
 
      07-26-2010
In article <(E-Mail Removed)>,
Harald Anlauf <(E-Mail Removed)> wrote:
>On Jul 24, 3:02=A0pm, geo <(E-Mail Removed)> wrote:
>> This RNG uses two seeds to fill the Q[4691] array by
>> means of CNG+XS mod 2^32. Users requiring more than two
>> seeds will need to randomly seed Q[] in main().
>> By itself, the MWC() routine passes all tests in the
>> Diehard Battery of Tests, but it is probably a good
>> idea to include it in the KISS combination.

>
>Does this mean that using different seeds will lead to
>streams that are always statistically independent
>(as long as one does not exhaust the RNG's period)?
>Or are there restrictions on the possible combinations
>of seeds?


No. Without checking it more carefully, I can't say definitely,
but it looks as if you would be very unlikely to notice a PRACTICAL
association with two different seeds, provided that you throw
away the first 10,000 or so numbers - and even that qualification
may be unnecessary. But, unless I have some missed some subtlety,
the sequences cannot be guaranteed to be pseudo-independent.

The only two methods I know of of guaranteeing pseudo-independence
are using coprime sequences and by choosing them using the spectral
test or equivalent. Even then, there are some qualifications on
what is meant by pseudo-independence. However, in practice, it's
rare to have trouble with high-quality generators.

>I am currently using D.E.Knuth's generator from TAOCP,
>which IIRC allows for 2^30-2 independent streams, and
>which are asserted to be independent, but being able to
>extend the "limit" would be nice.


Eh? Which version? There was assuredly nothing with those
properties in edition 2. The first papers on the topic were later.

You need to be very careful to distinguish separate (i.e. disjoint)
sequences from pseudo-independent ones, and FAR too many papers
written by people who ought to know better confuse the two. Doing
that is a common cause of seriously wrong answers in some types of
calculation.


Regards,
Nick Maclaren.
 
Reply With Quote
 
 
 
 
Harald Anlauf
Guest
Posts: n/a
 
      07-26-2010
On Jul 26, 10:09*pm, (E-Mail Removed) wrote:
> In article <(E-Mail Removed)>,
> Harald Anlauf *<(E-Mail Removed)> wrote:
>
> >Does this mean that using different seeds will lead to
> >streams that are always statistically independent
> >(as long as one does not exhaust the RNG's period)?
> >Or are there restrictions on the possible combinations
> >of seeds?

>
> No. *Without checking it more carefully, I can't say definitely,
> but it looks as if you would be very unlikely to notice a PRACTICAL
> association with two different seeds, provided that you throw
> away the first 10,000 or so numbers - and even that qualification
> may be unnecessary. *But, unless I have some missed some subtlety,
> the sequences cannot be guaranteed to be pseudo-independent.
>
> The only two methods I know of of guaranteeing pseudo-independence
> are using coprime sequences and by choosing them using the spectral
> test or equivalent. *Even then, there are some qualifications on
> what is meant by pseudo-independence. *However, in practice, it's
> rare to have trouble with high-quality generators.


I was shown funny experiences with simple generators... =

> >I am currently using D.E.Knuth's generator from TAOCP,
> >which IIRC allows for 2^30-2 independent streams, and
> >which are asserted to be independent, but being able to
> >extend the "limit" would be nice.

>
> Eh? *Which version? *There was assuredly nothing with those
> properties in edition 2. *The first papers on the topic were later.


http://www-cs-faculty.stanford.edu/~uno/news02.html#rng

> You need to be very careful to distinguish separate (i.e. disjoint)
> sequences from pseudo-independent ones, and FAR too many papers
> written by people who ought to know better confuse the two. *Doing
> that is a common cause of seriously wrong answers in some types of
> calculation.


For an application which needs to distribute a large problem over many
processors it is very useful to have many (pseudo-)independent
streams,
at least they should be practically indistinguishable from independent
ones. An accidental correlation is likely worse than a short period.

Harald
 
Reply With Quote
 
glen herrmannsfeldt
Guest
Posts: n/a
 
      07-26-2010
In comp.lang.fortran http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> In article <(E-Mail Removed)>,
> Harald Anlauf <(E-Mail Removed)> wrote:

(snip)

>>Does this mean that using different seeds will lead to
>>streams that are always statistically independent
>>(as long as one does not exhaust the RNG's period)?
>>Or are there restrictions on the possible combinations
>>of seeds?


> No. Without checking it more carefully, I can't say definitely,
> but it looks as if you would be very unlikely to notice a PRACTICAL
> association with two different seeds, provided that you throw
> away the first 10,000 or so numbers - and even that qualification
> may be unnecessary. But, unless I have some missed some subtlety,
> the sequences cannot be guaranteed to be pseudo-independent.


My biggest complaint about the current standard RANDOM_SEED
is that it doens't provide a way to get a reliably good
seed from a default (likely 32 bit) integer.

There are many generators with extrememly long periods,
and correspondingly long state. As the designers of the RNG
are the ones likely to know how to choose a good seed, it
would seem they would be the best ones to supply a good
seed generator.

> The only two methods I know of of guaranteeing pseudo-independence
> are using coprime sequences and by choosing them using the spectral
> test or equivalent. Even then, there are some qualifications on
> what is meant by pseudo-independence. However, in practice, it's
> rare to have trouble with high-quality generators.


Now, one can supply an array of the appropriate length to
RANDOM_SEED(PUT=...), but how to generate such an array
from a smaller seed? There is no way to know.

(snip)

> You need to be very careful to distinguish separate (i.e. disjoint)
> sequences from pseudo-independent ones, and FAR too many papers
> written by people who ought to know better confuse the two. Doing
> that is a common cause of seriously wrong answers in some types of
> calculation.


-- glen
 
Reply With Quote
 
nmm1@cam.ac.uk
Guest
Posts: n/a
 
      07-26-2010
In article <(E-Mail Removed)>,
Harald Anlauf <(E-Mail Removed)> wrote:
>>
>> The only two methods I know of of guaranteeing pseudo-independence
>> are using coprime sequences and by choosing them using the spectral
>> test or equivalent. =A0Even then, there are some qualifications on
>> what is meant by pseudo-independence. =A0However, in practice, it's
>> rare to have trouble with high-quality generators.

>
>I was shown funny experiences with simple generators... =3D


Yup. Very common.

>> >I am currently using D.E.Knuth's generator from TAOCP,
>> >which IIRC allows for 2^30-2 independent streams, and
>> >which are asserted to be independent, but being able to
>> >extend the "limit" would be nice.

>>
>> Eh? =A0Which version? =A0There was assuredly nothing with those
>> properties in edition 2. =A0The first papers on the topic were later.

>
>http://www-cs-faculty.stanford.edu/~uno/news02.html#rng


Ah. Thanks. It's new, then. I haven't been active in this area
ina couple of decades, so haven't been keeping track.

>> You need to be very careful to distinguish separate (i.e. disjoint)
>> sequences from pseudo-independent ones, and FAR too many papers
>> written by people who ought to know better confuse the two. =A0Doing
>> that is a common cause of seriously wrong answers in some types of
>> calculation.

>
>For an application which needs to distribute a large problem over many
>processors it is very useful to have many (pseudo-)independent
>streams,
>at least they should be practically indistinguishable from independent
>ones. An accidental correlation is likely worse than a short period.


Right. That's precisely why I got involved.


Regards,
Nick Maclaren.
 
Reply With Quote
 
robin
Guest
Posts: n/a
 
      07-27-2010
"Geoff" <(E-Mail Removed)> wrote in message news:(E-Mail Removed)...
| On Mon, 26 Jul 2010 23:32:27 +1000, "robin" <(E-Mail Removed)>
| wrote:
|
| >"Gib Bogle" <(E-Mail Removed)> wrote in message news:i2ij74$kd6$(E-Mail Removed)...
| >| geo wrote:
| >| > Thanks very much for the Fortran version.
| >| > I made a mistake in the comment on versions
| >| > for signed integers. This:
| >| >
| >| > Languages requiring signed integers should use equivalents
| >| > of the same operations, except that the C statement:
| >| > c=(t<x)+(x>>19);
| >| > can be replaced by that language's version of
| >| > if sign(x<<13+c)=sign(x) then c=sign(x)+(x>>19)
| >| > else c=1-sign(x<<13+c)+(x>>19)
| >| >
| >| > Sorry for that error.
| >|
| >| That produces different c values from those generated by the method based on the
| >| value of (t<x), therefore it deviates from the C code. This is what I used:
| >|
| >| m = shiftl(x,13) + c
| >| if (sign(1,m) == sign(1,x)) then
| >| c = sign(1,x) + shiftr(x,19)
| >| else
| >| c = 1 - sign(1,m) + shiftr(x,19)
| >| endif
| >
| >Maybe I missed something,
| >but isn't this exactly equivalent to what George wrote?
| >Just substitute x<<13+c for m in your last two assignments ...
|
| I am no FORTRAN hacker but I think there's a difference between
| sign(x) and sign(1,x).

George gave general advice on how to do it.
That advice wasn't specific to Fortran.
It's necessary to use the appropriate Fortran construct --
and sign(1,x) is the only way to do that.


 
Reply With Quote
 
robin
Guest
Posts: n/a
 
      07-27-2010
"geo" <(E-Mail Removed)> wrote in message news:(E-Mail Removed)...
|I have been asked to recommend an RNG
| (Random Number Generator) that ranks
| at or near the top in all of the categories:
| performance on tests of randomness,
| length of period, simplicity and speed.
| The most important measure, of course, is
| performance on extensive tests of randomness, and for
| those that perform well, selection may well depend
| on those other measures.

I have already posted a PL/I version using unsigned arithmetic.

Here is another version, this time using signed arithmetic :--

(NOSIZE, NOFOFL):
RNG: PROCEDURE OPTIONS (MAIN, REORDER);

declare (xs initial (521288629), xcng initial (362436069),
Q(0:4690) ) static fixed binary (31);

MWC: procedure () returns (fixed binary (31));
declare (t,x,i) fixed binary (31);
declare (c initial (0), j initial (4691) ) fixed binary (31) static;
declare (t1, t2, t3) fixed binary (31);

if j < hbound(Q,1) then j = j + 1; else j = 0;
x = Q(j);
t = isll(x,13)+c+x;
t1 = iand(x, 3) - iand(t, 3);
t2 = isrl(x, 2) - isrl(t, 2);
if t2 = 0 then t2 = t1;
if t2 > 0 then t3 = 1; else t3 = 0;
c = t3 + isrl(x, 19);
Q(j)=t;
return (t);
end MWC;

CNG: procedure returns (fixed binary (31));
xcng=bin(69069)*xcng+bin(123);
return (xcng);
end CNG;

XXS: procedure returns (fixed binary (31));
xs = ieor (xs, isll(xs, 13) );
xs = ieor (xs, isrl(xs, 17) );
xs = ieor (xs, isll(xs, 5) );
return (xs);
end XXS;

KISS: procedure returns (fixed binary (31));
return ( MWC()+CNG+XXS );
end KISS;

declare (i,x) fixed binary (31);
declare y fixed decimal (11);

Q = CNG+XXS; /* Initialize. */
do i = 1 to 1000000000; x=MWC(); end;
put skip edit (" Expected MWC result = 3740121002", 'computed =', x)
(a, skip, x(12), a, f(11));
y = iand(x, 2147483647);
if x < 0 then y = y + 2147483648;
put skip edit (y) (x(11), f(22)); put skip;
do i = 1 to 1000000000; x=KISS; end;
put skip edit ("Expected KISS result = 2224631993", 'computed =', x)
(a, skip, x(12), a, f(11));
y = iand(x, 2147483647);
if x < 0 then y = y + 2147483648;
put skip edit (y) (x(11), f(22));

end RNG;


 
Reply With Quote
 
nmm1@cam.ac.uk
Guest
Posts: n/a
 
      07-27-2010
In article <(E-Mail Removed)>,
Geoff <(E-Mail Removed)> wrote:
>>|
>>| I am no FORTRAN hacker but I think there's a difference between
>>| sign(x) and sign(1,x).
>>
>>George gave general advice on how to do it.
>>That advice wasn't specific to Fortran.
>>It's necessary to use the appropriate Fortran construct --
>>and sign(1,x) is the only way to do that.
>>

>Thanks for that clarification. I have not done any Fortran code since
>1975 and it looked a whole lot different than it does today.


Actually, that aspect hasn't changed since SIGN(X) always was
an implementation-dependent extension (now called processor-dependent).


Regards,
Nick Maclaren.
 
Reply With Quote
 
robin
Guest
Posts: n/a
 
      07-27-2010
"jacob navia" <(E-Mail Removed)> wrote in message news:i2fir2$op4$(E-Mail Removed)...

| This doesn't work with systems that have unsigned long as a 64 bit quantity.
|
| I obtain:
|
| MWC result=3740121002 ?
| 4169348530
| KISS result=2224631993 ?
| 1421918629

For a 64-bit machine (using 64-bit integer arithmetic),
you'd need to truncate each result to 32 bits. That not
only applies to the multiplication, it also applies to addition, etc.
On a 32-bit machine, these extra bits are discarded,
but in 64-bit arithmetic, they are retained,
and unless they are similarly discarded,
you won't get the same results.
I suggest using IAND(k, 2*2147483647+1)
for the truncation.

With such modifications in the program,
it should then produce the same results on both 32-bit and
64-bit machines.

P.S. the product 2*2... is best obtained using ISHFT.

| Compiling with 32 bit machine yields:
| MWC result=3740121002 ?
| 3740121002
| KISS result=2224631993 ?
| 2224631993



 
Reply With Quote
 
Uno
Guest
Posts: n/a
 
      07-29-2010
glen herrmannsfeldt wrote:
> In comp.lang.fortran (E-Mail Removed) wrote:
>> In article <(E-Mail Removed)>,
>> Harald Anlauf <(E-Mail Removed)> wrote:

> (snip)
>
>>> Does this mean that using different seeds will lead to
>>> streams that are always statistically independent
>>> (as long as one does not exhaust the RNG's period)?
>>> Or are there restrictions on the possible combinations
>>> of seeds?

>
>> No. Without checking it more carefully, I can't say definitely,
>> but it looks as if you would be very unlikely to notice a PRACTICAL
>> association with two different seeds, provided that you throw
>> away the first 10,000 or so numbers - and even that qualification
>> may be unnecessary. But, unless I have some missed some subtlety,
>> the sequences cannot be guaranteed to be pseudo-independent.

>
> My biggest complaint about the current standard RANDOM_SEED
> is that it doens't provide a way to get a reliably good
> seed from a default (likely 32 bit) integer.
>
> There are many generators with extrememly long periods,
> and correspondingly long state. As the designers of the RNG
> are the ones likely to know how to choose a good seed, it
> would seem they would be the best ones to supply a good
> seed generator.
>
>> The only two methods I know of of guaranteeing pseudo-independence
>> are using coprime sequences and by choosing them using the spectral
>> test or equivalent. Even then, there are some qualifications on
>> what is meant by pseudo-independence. However, in practice, it's
>> rare to have trouble with high-quality generators.

>
> Now, one can supply an array of the appropriate length to
> RANDOM_SEED(PUT=...), but how to generate such an array
> from a smaller seed? There is no way to know.


So for the put= values in fortran, you need a vector of pseudorandom
integers, which is as good as it gets without truly random devices,
making--one hopes-a period that is large with respect to the interval
you're interested in.

It doesn't seem like a problem with epistemology as much a mathematical
ceiling on how much randomness you can create by a handful of values.
--
Uno
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
A potentially dangerous querystring ... [ValidateRequest] Boris ASP .Net 5 04-17-2004 05:22 PM
A potentially dangerous Request.Form value was detected from the client amit ASP .Net 1 02-26-2004 09:47 PM
Why Getting 'A Potentially Dangerous Request...' Error? Anil Kripalani ASP .Net 2 02-25-2004 06:39 PM
A potentially dangerous Request.Form Alex Munk ASP .Net 2 12-17-2003 09:11 AM
Potentially Massive Internet Attack Starts Today =?iso-8859-1?Q?Frisbee=AE_MCNGP?= MCSE 14 08-26-2003 02:12 AM



Advertisments