Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > KISS4691, a potentially top-ranked RNG.

Reply
Thread Tools

KISS4691, a potentially top-ranked RNG.

 
 
glen herrmannsfeldt
Guest
Posts: n/a
 
      07-29-2010
In comp.lang.fortran Uno <> wrote:
(snip, I wrote)

>> My biggest complaint about the current standard RANDOM_SEED
>> is that it doens't provide a way to get a reliably good
>> seed from a default (likely 32 bit) integer.


>> There are many generators with extrememly long periods,
>> and correspondingly long state. As the designers of the RNG
>> are the ones likely to know how to choose a good seed, it
>> would seem they would be the best ones to supply a good
>> seed generator.

(snip)

> So for the put= values in fortran, you need a vector of pseudorandom
> integers, which is as good as it gets without truly random devices,
> making--one hopes-a period that is large with respect to the interval
> you're interested in.


In a large fraction of the cases, 2 billion different seeds
are enough, but one can still desire the appropriate randomness
from those different seeds.

> It doesn't seem like a problem with epistemology as much
> a mathematical ceiling on how much randomness you can create
> by a handful of values.


Given a default integer, one might fill an array with that
integer and use that as a seed. That might be good for some,
not so good for others. Even more, for standard Fortran such
should be done without knowing the range of values of an integer
variable.

R has two seeding functions, one that takes a full length state
array, and the other takes a single integer. That makes sense
to me.

-- glen

 
Reply With Quote
 
 
 
 
Uno
Guest
Posts: n/a
 
      07-30-2010
Gib Bogle wrote:
> geo wrote:
>> Thanks very much for the Fortran version.
>> I made a mistake in the comment on versions
>> for signed integers. This:
>>
>> Languages requiring signed integers should use equivalents
>> of the same operations, except that the C statement:
>> c=(t<x)+(x>>19);
>> can be replaced by that language's version of
>> if sign(x<<13+c)=sign(x) then c=sign(x)+(x>>19)
>> else c=1-sign(t)+(x>>19)
>>
>> should have been:
>>
>>
>> Languages requiring signed integers should use equivalents
>> of the same operations, except that the C statement:
>> c=(t<x)+(x>>19);
>> can be replaced by that language's version of
>> if sign(x<<13+c)=sign(x) then c=sign(x)+(x>>19)
>> else c=1-sign(x<<13+c)+(x>>19)
>>
>> Sorry for that error.

>
> That produces different c values from those generated by the method
> based on the value of (t<x), therefore it deviates from the C code.
> This is what I used:
>
> m = shiftl(x,13) + c
> if (sign(1,m) == sign(1,x)) then
> c = sign(1,x) + shiftr(x,19)
> else
> c = 1 - sign(1,m) + shiftr(x,19)
> endif


Gib, can you post your entire fortran version?
--
Uno
 
Reply With Quote
 
 
 
 
Gib Bogle
Guest
Posts: n/a
 
      07-30-2010
Uno wrote:

> Gib, can you post your entire fortran version?


OK, here's the Fortran. I've removed almost all the comments to save space.
If you uncomment the line in MWC():
c = c_this_works
you'll see how to reproduce the C code results.

Gib

module kiss_mod
implicit none

integer :: xs = 521288629
integer :: xcng = 362436069
integer :: Q(0:4690)

contains

integer function MWC()
integer :: m, x, i
integer, save :: c = 0, j = 4691
integer :: t, tLTx, c_this_works

if (j < 4690) then
j = j + 1
else
j = 0
endif
x = Q(j)
m = SHIFTL(x,13) + c
Q(j) = m + x
t = Q(j)

! This is the laborious determination of c
if ((t >= 0 .and. x >= 0) .or. (t < 0 .and. x < 0)) then
if (t < x) then
tLTx = 1
else
tLTx = 0
endif
elseif (x < 0) then
tLTx = 1
elseif (t < 0) then
tLTx = 0
endif
c_this_works = tLTx + SHIFTR(x,19)

! This c gives results inconsistent with the C code
if (sign(1,m) == sign(1,x)) then
c = sign(1,x) + SHIFTR(x,19)
else
c = 1 - sign(1,m) + SHIFTR(x,19)
endif

! c = c_this_works
MWC = Q(j)
end function

subroutine test_RNG4691
integer :: i, x, unsigned_result
integer :: n = 1000000000
do i = 0, 4690
xcng = 69069 * xcng + 123
xs = XOR(xs,SHIFTL(xs,13))
xs = XOR(xs,SHIFTR(xs,17))
xs = XOR(xs,SHIFTL(xs,5))
Q(i) = xcng + xs
enddo
do i = 1, n
x = MWC()
enddo
unsigned_result = 3740121002
write(*,*) "MWC result=3740121002 ?: ",unsigned_result,x
do i = 1,n
xcng = 69069 * xcng + 123
xs = XOR(xs,SHIFTL(xs,13))
xs = XOR(xs,SHIFTR(xs,17))
xs = XOR(xs,SHIFTL(xs,5))
x = MWC() + xcng + xs
enddo
unsigned_result = 2224631993
write(*,*) "KISS result=2224631993 ?: ",unsigned_result,x
end subroutine

end module

program main
use kiss_mod
call test_RNG4691
end program

 
Reply With Quote
 
Uno
Guest
Posts: n/a
 
      07-30-2010
Gib Bogle wrote:

> OK, here's the Fortran. I've removed almost all the comments to save
> space.
> If you uncomment the line in MWC():
> c = c_this_works
> you'll see how to reproduce the C code results.


Alright, thx, gib, looks like there's a couple rough corners:

$ gfortran geo1.f90 -o out
geo1.f90:63.32:

unsigned_result = 3740121002
1
Error: Integer too big for its kind at (1). This check can be disabled
with the option -fno-range-check

So, do I ask for something bigger than the default integer, use the
iso_c_binding, or disable the warning?

I think of 3.5 billion as a value that busts types in C. I would have
thought that was going on except for the second error:

geo1.f90:72.32:

unsigned_result = 2224631993
1
Error: Integer too big for its kind at (1). This check can be disabled
with the option -fno-range-check


Then there's this. It looks like the bitshifting capabilities that
fortran and C have, but neither MR&C nor my slightly-inappropriate C
reference have a SHIFTL. Do you have a definition for it?
geo1.f90:55.16:

xs = XOR(xs,SHIFTL(xs,13))
1
Error: Function 'shiftl' at (1) has no IMPLICIT type

Beautiful night with the cicadas humming. I don't know that I've ever
seen one which makes them, aesthetically, my favorite bug. Cheers,
--
Uno
 
Reply With Quote
 
Ron Shepard
Guest
Posts: n/a
 
      07-30-2010
In article <i2ssst$nfi$>,
glen herrmannsfeldt <> wrote:

> In a large fraction of the cases, 2 billion different seeds
> are enough, but one can still desire the appropriate randomness
> from those different seeds.


My understanding is that pseudorandom number generators return a
sequence of values with some period. The numbers in that sequence
eventually repeat with some, hopefully long, cycle. The put=array
argument gives the starting point within that sequence, but it doesn't
affect the cycle length or the "randomness" of the values, those things
are determined by the underlying algorithm, not by the initial seed.

The situation that needs to be avoided is to run the code once with one
seed, and then run it again with another seed that results in an overlap
of the sequences of values for the two runs. In some applications this
is unimportant, but in other applications it would be bad for two
supposedly independent runs to have a long sequence of identical values.
It would be nice if there were some prescription to generate initial
seeds that would avoid this situation. However, I don't really know how
this could be specified in the fortran standard without specifying the
algorithm itself (which is not done, apparently on purpose). Anyone
have any ideas how this could be done?

$.02 -Ron Shepard
 
Reply With Quote
 
glen herrmannsfeldt
Guest
Posts: n/a
 
      07-30-2010
In comp.lang.fortran Ron Shepard <ron-> wrote:
> In article <i2ssst$nfi$>,
> glen herrmannsfeldt <> wrote:


>> In a large fraction of the cases, 2 billion different seeds
>> are enough, but one can still desire the appropriate randomness
>> from those different seeds.


> My understanding is that pseudorandom number generators return a
> sequence of values with some period. The numbers in that sequence
> eventually repeat with some, hopefully long, cycle. The put=array
> argument gives the starting point within that sequence, but it doesn't
> affect the cycle length or the "randomness" of the values, those things
> are determined by the underlying algorithm, not by the initial seed.


> The situation that needs to be avoided is to run the code once with one
> seed, and then run it again with another seed that results in an overlap
> of the sequences of values for the two runs. In some applications this
> is unimportant, but in other applications it would be bad for two
> supposedly independent runs to have a long sequence of identical values.
> It would be nice if there were some prescription to generate initial
> seeds that would avoid this situation. However, I don't really know how
> this could be specified in the fortran standard without specifying the
> algorithm itself (which is not done, apparently on purpose). Anyone
> have any ideas how this could be done?


Here is the suggestion. Add to RANDOM_SEED a form that allows
a single integer variable to be supplied as a seed source.

As far as I know, there is no requirement on period or otherwise
on the quality of the PRNG used, but the standard could suggest
that good seeds be selected from that integer. For example,
they could be chosen such that they have a fairly long sequence
before they overlap the sequence generated by another input
value. If the RNG only has 32 bits of state, or maybe less, then
it likely takes it directly.

I don't know the math of the newer PRNGs enough to say, but I
think it is possible to do that.

With the current standard there is no way to even suggest to
RANDOM_SEED that you want a good seed.

OK, here is another suggestion that could be implemented without
any change in the standard. In the case where the supplied
seed array has all except the first element zero, then select
a good seed based on that first element.

-- glen
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      07-30-2010
On Jul 29, 8:13*pm, Ron Shepard <ron-shep...@NOSPAM.comcast.net>
wrote:
> In article <i2ssst$nf...@speranza.aioe.org>,
> *glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
> > In a large fraction of the cases, 2 billion different seeds
> > are enough, but one can still desire the appropriate randomness
> > from those different seeds. *

>
> My understanding is that pseudorandom number generators return a
> sequence of values with some period. *The numbers in that sequence
> eventually repeat with some, hopefully long, cycle. *The put=array
> argument gives the starting point within that sequence, but it doesn't
> affect the cycle length or the "randomness" of the values, those things
> are determined by the underlying algorithm, not by the initial seed.
>
> The situation that needs to be avoided is to run the code once with one
> seed, and then run it again with another seed that results in an overlap
> of the sequences of values for the two runs. *In some applications this
> is unimportant, but in other applications it would be bad for two
> supposedly independent runs to have a long sequence of identical values. *
> It would be nice if there were some prescription to generate initial
> seeds that would avoid this situation. *However, I don't really know how
> this could be specified in the fortran standard without specifying the
> algorithm itself (which is not done, apparently on purpose). *Anyone
> have any ideas how this could be done?
>
> $.02 -Ron Shepard


On Jul 29, 8:13 pm, Ron Shepard <ron-shep...@NOSPAM.comcast.net>
wrote:
> In article <i2ssst$nf...@speranza.aioe.org>,
> glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
> > In a large fraction of the cases, 2 billion different seeds
> > are enough, but one can still desire the appropriate randomness
> > from those different seeds.

>
> My understanding is that pseudorandom number generators return a
> sequence of values with some period. The numbers in that sequence
> eventually repeat with some, hopefully long, cycle. The put=array
> argument gives the starting point within that sequence, but it doesn't
> affect the cycle length or the "randomness" of the values, those things
> are determined by the underlying algorithm, not by the initial seed.
>
> The situation that needs to be avoided is to run the code once with one
> seed, and then run it again with another seed that results in an overlap
> of the sequences of values for the two runs. In some applications this
> is unimportant, but in other applications it would be bad for two
> supposedly independent runs to have a long sequence of identical values.
> It would be nice if there were some prescription to generate initial
> seeds that would avoid this situation. However, I don't really know how
> this could be specified in the fortran standard without specifying the
> algorithm itself (which is not done, apparently on purpose). Anyone
> have any ideas how this could be done?
>
> $.02 -Ron Shepard


If an RNG has a large number of possible states (say, 2^192) then it's
unlikely for any two runs to EVER reach identical states by any means
other than starting from the same seed. This RNG has a period vastly
in excess of "large", so for practical purposes it just won't
happen.

However, because this is a combination of multiple component RNGs
there is the related issue of overlapping within the period of one or
more component RNGs. MWC has essentially zero chance of being in an
identical state for any reason other than identical seeding, but both
XS and CNG will frequently have identical states in two different
runs. There does not seem to be any detectable correlation however
unless (either MWC or (both XS and CNG)) have identical states between
two different runs. Which is rather uncommon, though it does happen
often enough to see it in the real world on occasion.
 
Reply With Quote
 
Gib Bogle
Guest
Posts: n/a
 
      07-30-2010
Uno wrote:
> Gib Bogle wrote:
>
>> OK, here's the Fortran. I've removed almost all the comments to save
>> space.
>> If you uncomment the line in MWC():
>> c = c_this_works
>> you'll see how to reproduce the C code results.

>
> Alright, thx, gib, looks like there's a couple rough corners:
>
> $ gfortran geo1.f90 -o out
> geo1.f90:63.32:
>
> unsigned_result = 3740121002
> 1
> Error: Integer too big for its kind at (1). This check can be disabled
> with the option -fno-range-check
>
> So, do I ask for something bigger than the default integer, use the
> iso_c_binding, or disable the warning?
>
> I think of 3.5 billion as a value that busts types in C. I would have
> thought that was going on except for the second error:
>
> geo1.f90:72.32:
>
> unsigned_result = 2224631993
> 1
> Error: Integer too big for its kind at (1). This check can be disabled
> with the option -fno-range-check
>
>
> Then there's this. It looks like the bitshifting capabilities that
> fortran and C have, but neither MR&C nor my slightly-inappropriate C
> reference have a SHIFTL. Do you have a definition for it?
> geo1.f90:55.16:
>
> xs = XOR(xs,SHIFTL(xs,13))
> 1
> Error: Function 'shiftl' at (1) has no IMPLICIT type
>
> Beautiful night with the cicadas humming. I don't know that I've ever
> seen one which makes them, aesthetically, my favorite bug. Cheers,


Uno, the only reason for putting those lines in with unsigned_integer is to show
that the result that you see when x is written is actually the same as the
C-generated number. The Intel Fortran compiler is happy with the code, not even
issuing a warning. I suggest using -fno-range-check, or alternatively subtract
2^31 from the big numbers (I think). Or you could just take my word for it.

I didn't realize that SHIFTL and SHIFTR were not standard Fortran. You can use
ISHFT:

result = ISHFT (i,shift)

i (Input) Must be of type integer.

shift (Input) Must be of type integer. The absolute value for shift must be less
than or equal to BIT_SIZE( i).

Results
The result type is the same as i. The result has the value obtained by shifting
the bits of i by shift positions. If shift is positive, the shift is to the
left; if shift is negative, the shift is to the right. If shift is zero, no
shift is performed.

Bits shifted out from the left or from the right, as appropriate, are lost.
Zeros are shifted in from the opposite end.

ISHFT with a positive shift can also be specified as LSHIFT (or LSHFT). ISHFT
with a negative shift can also be specified as RSHIFT (or RSHFT)with | shift |.

I'm very surprised to hear you say you've never seen a cicada. In the summer
they are impossible to miss here. Sparrows catch them on the wing. But what
you are hearing at night are not cicadas, I think. As far I know they are
active only in daylight hours. The night shift (here) is taken by crickets,
whose song I find very musical. You don't get to see them much. They tend to
hide in cracks in the ground, which open up in the dry season. Sometimes
crickets come inside, black ones here.

 
Reply With Quote
 
glen herrmannsfeldt
Guest
Posts: n/a
 
      07-30-2010
In comp.lang.fortran Gib Bogle <> wrote:
(snip)

>> unsigned_result = 3740121002

1
>> Error: Integer too big for its kind at (1). This check can be disabled
>> with the option -fno-range-check


You can subtract 2**32, or find two values to IOR together.
(And remember that this is all twos-complement specific.)

-- glen
 
Reply With Quote
 
glen herrmannsfeldt
Guest
Posts: n/a
 
      07-30-2010
In comp.lang.fortran orz <> wrote:
(snip)

> If an RNG has a large number of possible states (say, 2^192) then it's
> unlikely for any two runs to EVER reach identical states by any means
> other than starting from the same seed. This RNG has a period vastly
> in excess of "large", so for practical purposes it just won't
> happen.


That is true, but there are some assumptions. Since the
standard doesn't say much at all about the underlying algorithm,
it isn't so reliable a statement as it seems.

For one, many PRNGs have poor performance with some starting seeds,
or poor performance for some cycles, especially as viewed from
the low bits.

> However, because this is a combination of multiple component RNGs
> there is the related issue of overlapping within the period of one or
> more component RNGs. MWC has essentially zero chance of being in an
> identical state for any reason other than identical seeding, but both
> XS and CNG will frequently have identical states in two different
> runs. There does not seem to be any detectable correlation however
> unless (either MWC or (both XS and CNG)) have identical states between
> two different runs. Which is rather uncommon, though it does happen
> often enough to see it in the real world on occasion.


My previous comments were regarding the Fortran standard
RANDOM_SEED routine. I haven't thought about the seeding of
the RNG recently mentioned here.

-- glen
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
A potentially dangerous querystring ... [ValidateRequest] Boris ASP .Net 5 04-17-2004 05:22 PM
A potentially dangerous Request.Form value was detected from the client amit ASP .Net 1 02-26-2004 09:47 PM
Why Getting 'A Potentially Dangerous Request...' Error? Anil Kripalani ASP .Net 2 02-25-2004 06:39 PM
A potentially dangerous Request.Form Alex Munk ASP .Net 2 12-17-2003 09:11 AM
Potentially Massive Internet Attack Starts Today =?iso-8859-1?Q?Frisbee=AE_MCNGP?= MCSE 14 08-26-2003 02:12 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57