Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > ideal interface for Random Number Generators?

Reply
Thread Tools

ideal interface for Random Number Generators?

 
 
Öö Tiib
Guest
Posts: n/a
 
      06-08-2010
On Jun 8, 11:40*am, orz <cdh...@gmail.com> wrote:
>
> template-heavy:
> I suspect we just have different standards of template-heavy.


Oh. Yes, everybody have. I am also worried when heavy template
wizardry is used. In Boost.Random it seems just to be straightforward
for to parametrize and combine distribution and engine classes compile
time. That is classical usage of templates and i feel comfortable with
such. The major effect of such templates is that it produces lighter
results especially at low end and is still efficient.

Typical user is usally fine with maybe one or two combinations (so he
#includes and gets binary code only for these). For a power user who
simulates natural processes the virtual calls (or even callbacks)
between distribution and engine may be too slow.

> adding a generator to Boost:
> I suppose I could propose a generator or two to Boost, but there's no
> way they would be interested in most of this package, as it has lots
> of code for doing things Boost has no interest in, like statistical
> tests.


Boost libraries have tests, but usually against bugs and not for
statistics. With your experience you probably would be of great
contributor even if you simply evaluated the efficiency of the
templates and compared it with your functions.

> interface features missing from Boost / TR1:
> I expect they'll support all the basic features. *Although... I can't
> actually find any kind of serialization / save & restore type
> functionality in their interface... probably an omission they'll fix
> sometime soon, or maybe it's there and I'm just not seeing it.


The pseudo-random-generators are all seedable and streamable in Boost.
"streamable" means these have "ostream << engine" for serialization
and "istream >> engine" for deserialization. These ara usually enough
to save/restore, copy/paste or undo/redo.
 
Reply With Quote
 
 
 
 
Jorgen Grahn
Guest
Posts: n/a
 
      06-08-2010
["Followup-To:" header set to comp.lang.c++.]

On Mon, 2010-06-07, aruzinsky wrote:

[top-posting fixed]

> On Jun 7, 4:03*am, orz <cdh...@gmail.com> wrote:
>> I am currently in the process of putting together a C++ library that
>> includes a variety of (pseudo-)random number generators (RNGs) and
>> related functionality, and I am interested in what people think the
>> best interface to an individual RNG instance would be. *The current
>> interface has each RNG implemented as a class or template class with
>> the following methods:

....

> Too verbose. It would be easier for me to write my own random number
> generator than read your instructions on how to use yours. And, then
> I would also know whom to blame when something goes wrong.


Uh, you *do* realize that creating high-quality PRNGs is a very tricky
task which requires lots of different skills, right?

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
 
 
 
Jorgen Grahn
Guest
Posts: n/a
 
      06-08-2010
["Followup-To:" header set to comp.lang.c++.]

On Mon, 2010-06-07, orz wrote:
> I am currently in the process of putting together a C++ library that
> includes a variety of (pseudo-)random number generators (RNGs) and
> related functionality, and I am interested in what people think the
> best interface to an individual RNG instance would be. The current
> interface has each RNG implemented as a class or template class with
> the following methods:
>
> //====== random number generating methods


....

> Uint32 randi ( Uint32 max );
> Uint32 randi ( Uint32 min, Uint32 max ) {return randi(max-min) + min;}
> //random integer - returns a uniform random integer less than "max"
> and no less than "min".


That is what I want in 99.9% of the cases. Except:

- Don't call it "max" if it's actually not in the range of the function.
If randi(4) can return 0--3, then 3 is max, not 4.
- Don't use home-made types! That Uint32 is a certain showstopper for
me. uint32_t has been semi-standard for over ten years.

Also, the objects should be copyable.

....
> I'm not at all sure about having the maximum values for randi
> exclusive but the minimum values inclusive. Possibly randi (and
> randi_fast and randli) should be changed to treat both max and min as
> inclusive.


Oh, here you discussed that. I'm so used to the [begin, end) idiom that
exclusive seems like the sane choice.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      06-09-2010
On Jun 8, 5:00*am, Öö Tiib <oot...@hot.ee> wrote:
> On Jun 8, 11:40*am, orz <cdh...@gmail.com> wrote:
> > template-heavy:
> > I suspect we just have different standards of template-heavy.

>
> Oh. Yes, everybody have. I am also worried when heavy template
> wizardry is used. In Boost.Random it seems just to be straightforward
> for to parametrize and combine distribution and engine classes compile
> time. That is classical usage of templates and i feel comfortable with
> such. The major effect of such templates is that it produces lighter
> results especially at low end and is still efficient.
>
> Typical user is usally fine with maybe one or two combinations (so he
> #includes and gets binary code only for these). For a power user who
> simulates natural processes the virtual calls (or even callbacks)
> between distribution and engine may be too slow.


I downloaded the latest Boost and glanced through it, and maybe I
spoke too soon on that being too template-heavy. The way they handle
distributions is lighter-weight than I thought from glancing at the
docs. All in all it's not really that different than my setup aside
from the detail that all distributions (except that native to the RNG
engine) are separate Boost, while only non-uniform ones are in mine,
and they're willing to incur a tiny bit of runtime overhead to allow
persistent RNG-distribution pairings.

> > adding a generator to Boost:
> > I suppose I could propose a generator or two to Boost, but there's no
> > way they would be interested in most of this package, as it has lots
> > of code for doing things Boost has no interest in, like statistical
> > tests.

>
> Boost libraries have tests, but usually against bugs and not for
> statistics. With your experience you probably would be of great
> contributor even if you simply evaluated the efficiency of the
> templates and compared it with your functions.


They could start by flipping through the paper L'Ecuyer published with
TestU01 to the list of results near the end and arbitrarily picking a
few RNGs listed as both passing all statistical tests and taking less
than 1.5 seconds on an Athlon 64 2.4 GHrz. Admittedly I think all of
those are large-state RNGs (there are relatively few good small-state
RNGs, and they tend to be more obscure too), but asside from that,
there's a dozen RNGs there that meet those criteria and have no major
drawbacks.

> > interface features missing from Boost / TR1:
> > I expect they'll support all the basic features. *Although... I can't
> > actually find any kind of serialization / save & restore type
> > functionality in their interface... probably an omission they'll fix
> > sometime soon, or maybe it's there and I'm just not seeing it.

>
> The pseudo-random-generators are all seedable and streamable in Boost.
> "streamable" means these have "ostream << engine" for serialization
> and "istream >> engine" for deserialization. These ara usually enough
> to save/restore, copy/paste or undo/redo.


Ah, thanks. I see the serialization code now that I've downloaded the
package instead of browsing the online documentation. Though... it
seems oriented to serializing to a string of text describing RNG state
rather than to a binary representation as I do in my interface. Their
text serialization & deserialization will thus be slower and take more
memory, but will be (slightly) human-readable if you wanted to print a
serialized state out on the users screen. Still, 95% of users won't
use serialization, and 95% of the ones that do won't care about its
efficiency.
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      06-09-2010
On Jun 8, 5:51*am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> ["Followup-To:" header set to comp.lang.c++.]
>
> On Mon, 2010-06-07, orz wrote:
> > I am currently in the process of putting together a C++ library that
> > includes a variety of (pseudo-)random number generators (RNGs) and
> > related functionality, and I am interested in what people think the
> > best interface to an individual RNG instance would be. *The current
> > interface has each RNG implemented as a class or template class with
> > the following methods:

>
> > //====== random number generating methods

>
> ...
>
> > Uint32 randi ( Uint32 max );
> > Uint32 randi ( Uint32 min, Uint32 max ) {return randi(max-min) + min;}
> > //random integer - returns a uniform random integer less than "max"
> > and no less than "min".

>
> That is what I want in 99.9% of the cases. Except:
>
> - Don't call it "max" if it's actually not in the range of the function.
> * If randi(4) can return 0--3, then 3 is max, not 4.
> - Don't use home-made types! *That Uint32 is a certain showstopper for
> * me. *uint32_t has been semi-standard for over ten years.
>
> Also, the objects should be copyable.
>
> ...
>
> > I'm not at all sure about having the maximum values for randi
> > exclusive but the minimum values inclusive. *Possibly randi (and
> > randi_fast and randli) should be changed to treat both max and min as
> > inclusive.

>
> Oh, here you discussed that. I'm so used to the [begin, end) idiom that
> exclusive seems like the sane choice.
>
> /Jorgen
>
> --
> * // Jorgen Grahn <grahn@ *Oo *o. * . *.
> \X/ * * snipabacken.se> * O *o * .


I don't have a good name besides "max". I'm currently leaning towards
[min..max] on integers (to match the Boost / TR1 standard and because
the corner cases get a little less confusing to document if it's don
that way), but [min..max) on floating point (I prefer it that way for
technical reasons), despite the small inconsistency.
Most (all?) of the RNG objects are copyable by either copy constructor
or serialization.
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      06-09-2010
Oh, on the subject or RNGs in Boost, their MT19937 implementation
looks rather odd. It basically includes two copies of the RNG state.
It looks like a clever optimization, but I don't see anything that it
actually optimizes.
 
Reply With Quote
 
Thomas J. Gritzan
Guest
Posts: n/a
 
      06-09-2010
Am 09.06.2010 17:52, schrieb orz:
> Oh, on the subject or RNGs in Boost, their MT19937 implementation
> looks rather odd. It basically includes two copies of the RNG state.
> It looks like a clever optimization, but I don't see anything that it
> actually optimizes.


From the source file:
// The goal is to always have x(i-n) ... x(i-1) available for
// operator== and save/restore.
(with i = <next word to output> and n = <# of words in state>)

You could store x(i)...x(i+n-1) instead, but you'ld had to precalculate
them each time you do a operator==.

(f'up to c.l.c++)
--
Thomas
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      06-10-2010
On Jun 9, 1:37*pm, "Thomas J. Gritzan" <phygon_antis...@gmx.de> wrote:
> Am 09.06.2010 17:52, schrieb orz:
>
> > Oh, on the subject or RNGs in Boost, their MT19937 implementation
> > looks rather odd. *It basically includes two copies of the RNG state.
> > It looks like a clever optimization, but I don't see anything that it
> > actually optimizes.

>
> From the source file:
> * // The goal is to always have x(i-n) ... x(i-1) available for
> * // operator== and save/restore.
> (with i = <next word to output> and n = <# of words in state>)
>
> You could store x(i)...x(i+n-1) instead, but you'ld had to precalculate
> them each time you do a operator==.
>
> (f'up to c.l.c++)
> --
> Thomas


It is true that that makes efficient full equality checks a bit
easier. But...
1. Equality check is not a common operation on MT.
2. Even without the state double, full equality checks on MT instances
are not inefficient.
3. You could do mostly the same thing for equality checks with just a
single extra integer instead of 624 extras.
4. The naive approach to MT equality checks mostly works okay anyway.
Not perfectly, but I think the only ways to trip it up require either
running an instance of MT through an entire period (functionally
impossible to do without a skip-ahead type method, which is not
provided), or comparing two instances of MT originally seeded with
different seeds but then moved to the exact same position (on average
requires going through half of a full cycle, though it is possible to
do so with slightly less by way of bad luck, or a lot less if you pick
your seeds with great care).
5. The Boost random docs list MT as taking up the normal amount of
memory, not the doubled amount of memory.

Okay, now I'm just whining.
 
Reply With Quote
 
orz
Guest
Posts: n/a
 
      06-10-2010
On Jun 9, 1:50*pm, Pete Becker <p...@versatilecoding.com> wrote:
> The design of RNG serialization for TR1 includes use on non-homogeneous
> systems. Binary representations just don't work for that.


If you're talking about endianness, I was presuming that endianness
would be taken care of in the serialization code. If you're talking
about floating point formats that's a bit harder but it could
certainly be taken care of more efficiently than conversion from and
to text. Why would binary RNG state serialization be harder?
 
Reply With Quote
 
Keith H Duggar
Guest
Posts: n/a
 
      06-10-2010
On Jun 10, 1:22*pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2010-06-09 19:52:14 -1000, orz said:
>
> > On Jun 9, 1:50*pm, Pete Becker <p...@versatilecoding.com> wrote:
> >> The design of RNG serialization for TR1 includes use on non-homogeneous
> >> systems. Binary representations just don't work for that.

>
> > If you're talking about endianness, I was presuming that endianness
> > would be taken care of in the serialization code. *If you're talking
> > about floating point formats that's a bit harder but it could
> > certainly be taken care of more efficiently than conversion from and
> > to text. *Why would binary RNG state serialization be harder?

>
> Because text transfer already deals with all the issues that you're
> going to have to ferret out and manage with your hypothetical binary
> protocol.


So it seems more a choice to utilize an existing subsystem (ie
using the text serialization code rather than designing second
(binary) format serialization code) rather than a "binary just
doesn't work" situation?

After all, text is just binary with a certain format. And text
is designed not only to be portable but notably to be readable
to humans. Surely if one removes the human readable constraint
one can design a more efficient format that is still portable
across modes, machines, etc.

And indeed such schemes have been designed before and deployed
widely for example xdr.

But then why should the C++ committee be concerned with this?
If there are already libraries or we are given the ability to
overload/override as we see fit then all is well.

Related question, will floating point special values (NaN, Inf,
etc) have standardized text serialization/deserialization?

KHD
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Math.random() and Math.round(Math.random()) and Math.floor(Math.random()*2) VK Javascript 15 05-02-2010 03:43 PM
How do I get a random number between two random numbers? Alex Untitled Ruby 11 11-16-2009 09:45 AM
random.random(), random not defined!? globalrev Python 4 04-20-2008 08:12 AM
Ideal number of threads for CPU intensive task Kenneth P. Turvey Java 8 10-07-2005 02:39 PM
My random number is only random for the first run??? xeys_00 C++ 12 04-11-2005 03:58 PM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57