Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Random string of digits?

Reply
Thread Tools

Random string of digits?

 
 
Chris Angelico
Guest
Posts: n/a
 
      12-25-2011
On Mon, Dec 26, 2011 at 2:46 AM, Steven D'Aprano
<(E-Mail Removed)> wrote:
> Use the Source, Luke, er, Chris
>
> If I've read the source correctly, randint() will generate sufficient
> bits of randomness to ensure that the entire int is random.
>
> http://hg.python.org/cpython/file/default/Lib/random.py


I prefer not to rely on the source. That tells me what happens, not
what's guaranteed to happen. However... bit of poking around can't
hurt. That file doesn't actually justify anything, because
random.Random() does not define getrandbits() - that, it seems, comes
from _random(); turns out that getrandbits is actually doing pretty
much the same thing I suggested:

http://hg.python.org/cpython/file/74...mmodule.c#l371

Need a 64-bit random number? Take two 32-bit numbers and concatenate.
So, it's going to be easier and clearer to just take the simple
option, since it's actually doing the same thing underneath anyway.

ChrisA
 
Reply With Quote
 
 
 
 
Steven D'Aprano
Guest
Posts: n/a
 
      12-25-2011
On Mon, 26 Dec 2011 03:11:56 +1100, Chris Angelico wrote:

> On Mon, Dec 26, 2011 at 2:46 AM, Steven D'Aprano
> <(E-Mail Removed)> wrote:
>> Use the Source, Luke, er, Chris
>>
>> If I've read the source correctly, randint() will generate sufficient
>> bits of randomness to ensure that the entire int is random.
>>
>> http://hg.python.org/cpython/file/default/Lib/random.py

>
> I prefer not to rely on the source. That tells me what happens, not
> what's guaranteed to happen.


In this case, the source explicitly tells you that the API includes
support for arbitrary large ranges if you include a getrandbits() method:

Optionally, implement a getrandbits() method so that randrange()
can cover arbitrarily large ranges.

I call that a pretty strong guarantee.


> However... bit of poking around can't hurt.
> That file doesn't actually justify anything, because random.Random()
> does not define getrandbits() - that, it seems, comes from _random();
> turns out that getrandbits is actually doing pretty much the same thing
> I suggested:
>
> http://hg.python.org/cpython/file/745f9fd9856d/Modules/

_randommodule.c#l371
>
> Need a 64-bit random number? Take two 32-bit numbers and concatenate.
> So, it's going to be easier and clearer to just take the simple option,
> since it's actually doing the same thing underneath anyway.


Um, I'm not sure what you consider "the simple option" in this context. I
would hope you mean to use the high level API of randint:

# need a random number with exactly 20 decimal digits
random.randint(10**20, 10**21-1)

rather than manually assembling a 20 digit number from smaller pieces.


--
Steven
 
Reply With Quote
 
 
 
 
Serhiy Storchaka
Guest
Posts: n/a
 
      12-25-2011
25.12.11 15:48, Steven D'Aprano написав(ла):
> On Sun, 25 Dec 2011 08:30:46 -0500, Roy Smith wrote:
>> I want to create a string of 20 random digits (I'm OK with leading
>> zeros). The best I came up with is:
>> ''.join(str(random.randint(0, 9)) for i in range(20))
>> Is there something better?

> '%20d' % random.randint(0, 10**20-1)


'%020d' % random.randrange(10**20)

 
Reply With Quote
 
Roy Smith
Guest
Posts: n/a
 
      12-25-2011
On Mon, 26 Dec 2011 03:11:56 +1100, Chris Angelico wrote:
> > I prefer not to rely on the source. That tells me what happens, not
> > what's guaranteed to happen.


Steven D'Aprano <(E-Mail Removed)> wrote:
> In this case, the source explicitly tells you that the API includes
> support for arbitrary large ranges if you include a getrandbits() method:
>
> Optionally, implement a getrandbits() method so that randrange()
> can cover arbitrarily large ranges.
>
> I call that a pretty strong guarantee.


I think you mis-understood Chris's point. The documentation is the
specification of how something behaves. If the documentation doesn't
say it, you can't rely on it. The user should never have to read the
source to know how to use a function, or what they can depend on. Now,
I'm not saying that reading the source isn't useful for a deeper
understanding, but it should be understood that any insights you glean
from doing that are strictly implementation details.

If you're saying that there are guarantees made by the implementation of
getrandbits() which are not documented, then one of two things are true:

1) It is intended that users can depend on that behavior, in which case
it's a bug in the docs, and the docs should be updated.

or

2) It is not intended that users can depend on that behavior, in which
case they would be foolish to do so.
 
Reply With Quote
 
88888 Dihedral
Guest
Posts: n/a
 
      12-25-2011
Roy Smith於 2011年12月26日星期一UTC+8上午1時41分29 寫道:
> On Mon, 26 Dec 2011 03:11:56 +1100, Chris Angelico wrote:
> > > I prefer not to rely on the source. That tells me what happens, not
> > > what's guaranteed to happen.

>
> Steven D'Aprano <(E-Mail Removed)> wrote:
> > In this case, the source explicitly tells you that the API includes
> > support for arbitrary large ranges if you include a getrandbits() method:
> >
> > Optionally, implement a getrandbits() method so that randrange()
> > can cover arbitrarily large ranges.
> >
> > I call that a pretty strong guarantee.

>
> I think you mis-understood Chris's point. The documentation is the
> specification of how something behaves. If the documentation doesn't
> say it, you can't rely on it. The user should never have to read the
> source to know how to use a function, or what they can depend on. Now,
> I'm not saying that reading the source isn't useful for a deeper
> understanding, but it should be understood that any insights you glean
> from doing that are strictly implementation details.
>
> If you're saying that there are guarantees made by the implementation of
> getrandbits() which are not documented, then one of two things are true:
>
> 1) It is intended that users can depend on that behavior, in which case
> it's a bug in the docs, and the docs should be updated.
>
> or
>
> 2) It is not intended that users can depend on that behavior, in which
> case they would be foolish to do so.


Random bit generations for RSA2048 encoding and cryptography applications
in python is simple and elegant.
 
Reply With Quote
 
Steven D'Aprano
Guest
Posts: n/a
 
      12-26-2011
On Sun, 25 Dec 2011 12:41:29 -0500, Roy Smith wrote:

> On Mon, 26 Dec 2011 03:11:56 +1100, Chris Angelico wrote:
>> > I prefer not to rely on the source. That tells me what happens, not
>> > what's guaranteed to happen.

>
> Steven D'Aprano <(E-Mail Removed)> wrote:
>> In this case, the source explicitly tells you that the API includes
>> support for arbitrary large ranges if you include a getrandbits()
>> method:
>>
>> Optionally, implement a getrandbits() method so that randrange()
>> can cover arbitrarily large ranges.
>>
>> I call that a pretty strong guarantee.

>
> I think you mis-understood Chris's point.


And I'm afraid that you have missed my point. The above comment from the
source is from a docstring: it *is* public, official documentation. See
help(random.Random).


> The documentation is the
> specification of how something behaves. If the documentation doesn't
> say it, you can't rely on it.


A nice platitude, but not true. Documentation is often incomplete or even
inaccurate. We rely on many things that aren't documented anywhere. For
example, we can rely on the fact that

x = 3917
print x+1

will print 3918, even though that specific fact isn't documented
anywhere. Nevertheless, we can absolutely bank on it -- if it happened to
do something else, we would report it as a bug and not expect to be told
"implementation detail, will not fix". We make a number of undocumented
assumptions:

* we assume that when the documentation talks about "adding" two
numbers, it means the standard mathematical definition of addition
and not some other meaning;

* we assume that the result of such addition must be *correct*,
without that assumption being guaranteed anywhere;

* we assume that addition of two ints will return an int, as opposed
to some other numerically equal value such as a float or Fraction;

* we assume that not only will it be an int, but it will be *exactly*
an int, and not a subclass of int;

and no doubt there are others.

And by the way, in case you think I'm being ridiculously pedantic,
consider the first assumption listed above: the standard mathematical
definition of addition. That assumption is violated for floats.

py> 0.7 + 0.1 == 0.8
False


It is rather amusing the trust we put in documentation, when
documentation can be changed just as easily as code. Just because the
Fine Manual says that x+1 performs addition today, doesn't mean that it
will continue to say the same thing tomorrow. So if you trust the
language designers not to arbitrarily change the documentation, why not
trust them not to arbitrarily change the code?

Hint: as far as I can tell, nowhere does Python promise that printing
3918 will show those exact digits, but it's a pretty safe bet that Python
won't suddenly change to displaying integers in balanced-ternary notation
instead of decimal.



> The user should never have to read the
> source to know how to use a function, or what they can depend on.


Well, that's one school of thought. Another school of thought is that
documentation and comments lie, and the only thing that you can trust is
the code itself.

At the end of the day, the only thing that matters is the mindset of the
developer(s) of the software. You must try to get into their head and
determine whether they are the sort of people whose API promises can be
believed.


> Now,
> I'm not saying that reading the source isn't useful for a deeper
> understanding, but it should be understood that any insights you glean
> from doing that are strictly implementation details.


I couldn't care less about the specific details of *how* getrandbits is
used to put together an arbitrarily large random number. That "how" is
what people mean when they talk about "mere implementation details".

But the fact that getrandbits is used by randrange (as opposed to how it
is used) is not a mere implementation detail, but a public part of the
interface.


> If you're saying that there are guarantees made by the implementation of
> getrandbits() which are not documented, then one of two things are true:


The implementation of getrandbits is not documented at all. You would
have to read the C source of the _random module to find out how the
default pseudo-random number generator produces random bits. But note the
leading underscore: _random is a private implementation detail.

However the existence and use of random.Random.getrandbits is public,
documented, and guaranteed.


--
Steven
 
Reply With Quote
 
Chris Angelico
Guest
Posts: n/a
 
      12-26-2011
On Mon, Dec 26, 2011 at 2:00 PM, Steven D'Aprano
<(E-Mail Removed)> wrote:
> The implementation of getrandbits is not documented at all. You would
> have to read the C source of the _random module to find out how the
> default pseudo-random number generator produces random bits. But note the
> leading underscore: _random is a private implementation detail.
>
> However the existence and use of random.Random.getrandbits is public,
> documented, and guaranteed.


It's that last line where I find disagreement between documentation
and source code.

"""Class Random can also be subclassed if you want to use a different
basic generator of your own devising: in that case, override the
random(), seed(), getstate(), and setstate() methods. Optionally, a
new generator can supply a getrandbits() method this allows
randrange() to produce selections over an arbitrarily large range."""

My reading of this is that, if I were to write my own generator, I
could provide that method and gain perfection. It's not until you dig
somewhat that you find out that the default generator actually does
provide getrandbits, meaning that the default randrange can actually
be used for arbitrarily large ranges. Yes, it IS documented, but in
separate places and somewhat as an afterthought; in the 2.x
documentation it's shown as "new in 2.4", which explains this.

Unfortunately my brain isn't really running on all cylinders at the
moment and I can't come up with a better wording, but is there some
way this could be stated up in the same top paragraph that explains
about 53-bit precision and 2**19937-1 period? That's where I got my
original information from; I didn't go down to the individual function
definitions, which is the only place that it's stated that the
Mersenne Twister does include getrandbits (and which assumes that
you've read the top section that states that Mersenne Twister is the
default implementation).

ChrisA
 
Reply With Quote
 
Roy Smith
Guest
Posts: n/a
 
      12-26-2011
In article <4ef7e337$0$29973$c3e8da3$(E-Mail Removed) om>,
Steven D'Aprano <(E-Mail Removed)> wrote:

> And I'm afraid that you have missed my point. The above comment from the
> source is from a docstring: it *is* public, official documentation. See
> help(random.Random).


When you wrote, "the source explicitly tells you..." the natural
assumption I made was "something in the source that's not part of the
documentation" (i.e. some comment).

> > The documentation is the specification of how something behaves.
> > If the documentation doesn't say it, you can't rely on it.


> A nice platitude, but not true. Documentation is often incomplete or even
> inaccurate.


Of course it is. Those things constitute doc bugs which need to get
fixed. The fact that the specification is flawed does not change the
fact that it *is* the specification.
 
Reply With Quote
 
Chris Angelico
Guest
Posts: n/a
 
      12-26-2011
On Mon, Dec 26, 2011 at 3:17 PM, Roy Smith <(E-Mail Removed)> wrote:
> Of course it is. *Those things constitute doc bugs which need to get
> fixed. *The fact that the specification is flawed does not change the
> fact that it *is* the specification.


Also, the specification is what can be trusted across multiple
implementations of Python - the source can't. I don't intend to get
the source code to five Pythons to ascertain whether getrandbits() is
provided in the default random.Random() of each of them. Now, in this
particular instance, it IS stated in the docs - just down in the
individual function descriptions, which I hadn't read - so I almost
certainly CAN trust that across all platforms. But in the general
case, I would not want to make firm statements on the basis of
implementation details.

ChrisA
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Math.random() and Math.round(Math.random()) and Math.floor(Math.random()*2) VK Javascript 15 05-02-2010 03:43 PM
random.random(), random not defined!? globalrev Python 4 04-20-2008 08:12 AM
Random "The IListSource does not contain any datasources" and more (Crashing a live site at random, twice a week or so) Lars-Erik Aabech ASP .Net 8 04-28-2005 07:52 AM
Random not really random... Maziar Aflatoun ASP .Net 4 08-05-2004 01:26 AM
Random NOt random? Darren Clark ASP .Net 3 06-24-2004 05:23 PM



Advertisments