# RE: PEP 327: Decimal Data Type

Batista, Facundo

 02-03-2004
cookedm wrote:

#- What we need for this is an interval type. 1.80 m shouldn't be stored
#- as '1.80', but as '1.80 +/- 0.005', and operations such as addition
#- and multiplication should propagate the intervals.

I think this kind of math is beyond a pure numeric data type. 1.80 is to be
represented as a numeric data type. And also 0.005.

But '1.80 +/- 0.005' should be worked in another object. Hey! These are the
benefits of OOP!
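
As a rough illustration of that separate object, here is a minimal sketch in modern Python 3 using the `decimal` module. The class name `Interval` and the helper `from_measurement` are hypothetical and are not part of PEP 327. The sketch also assumes the endpoints fit within the default Decimal precision; a rigorous version would round the lower endpoint down and the upper endpoint up.

```python
from decimal import Decimal


class Interval:
    """Closed interval [lo, hi] with Decimal endpoints (illustrative only)."""

    def __init__(self, lo, hi):
        self.lo, self.hi = Decimal(lo), Decimal(hi)
        if self.lo > self.hi:
            raise ValueError("lower endpoint exceeds upper endpoint")

    @classmethod
    def from_measurement(cls, value, half_width):
        # '1.80 +/- 0.005' becomes [1.795, 1.805]
        v, h = Decimal(value), Decimal(half_width)
        return cls(v - h, v + h)

    def __add__(self, other):
        # Endpoints add pairwise.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The product interval is bounded by the extreme endpoint products.
        products = [a * b for a in (self.lo, self.hi)
                    for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

    def __repr__(self):
        return "Interval(%s, %s)" % (self.lo, self.hi)


height = Interval.from_measurement("1.80", "0.005")
total = height + height
area = height * height
```

Addition widens the interval to [3.590, 3.610], while the plain numbers 1.80 and 0.005 remain ordinary Decimal values.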

.. Facundo

Bengt Richter

 02-06-2004
On Tue, 3 Feb 2004 09:33:26 -0300, "Batista, Facundo" <(E-Mail Removed)> wrote:

>cookedm wrote:
>
>#- What we need for this is an interval type. 1.80 m shouldn't be stored
>#- as '1.80', but as '1.80 +/- 0.005', and operations such as addition
>#- and multiplication should propagate the intervals.
>
>I think this kind of math is beyond a pure numeric data type. 1.80 is to be
>represented as a numeric data type. And also 0.005.
>
>But '1.80 +/- 0.005' should be worked in another object. Hey! These are the
>benefits of OOP!
>

The key concern is _exactly_ representing the limits of an interval that is
_guaranteed to contain_ the exact value of interest. One hopes to represent
very narrow intervals, but the principle is the same whatever computer
states are available to represent the end points.

E.g., integer intervals can reliably enclose 1.8 and 0.005
(with [1,2] and [0,1] respectively). Of course, [1,2] +- [0,1]
=> [0,3] gets you something less than useful for 1.8+-0.005.
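
The integer case can be checked with a few lines of Python 3. This is only a sketch: the helper names `integer_enclosure` and `plus_minus` are made up for illustration, and `Fraction` is used so that 1.8 and 0.005 are held exactly before being enclosed.

```python
import math
from fractions import Fraction


def integer_enclosure(text):
    """Smallest integer interval guaranteed to contain the decimal string."""
    exact = Fraction(text)
    return math.floor(exact), math.ceil(exact)


def plus_minus(value_iv, error_iv):
    """Enclose every v - e and v + e for v in value_iv and e in error_iv."""
    (vlo, vhi), (elo, ehi) = value_iv, error_iv
    return vlo - ehi, vhi + ehi


value = integer_enclosure("1.8")     # (1, 2)
error = integer_enclosure("0.005")   # (0, 1)
result = plus_minus(value, error)    # (0, 3)
```

The result is guaranteed to contain the true range, but it is far too wide to be useful, which is the point of moving to narrower representable end points.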

But choosing from available IEEE-754 floating point double states
gets you some really narrow intervals, where e.g. 1.8 can be guaranteed to
be in the closed interval including the two nearest available
exactly-representable floating point numbers, namely

[1.79999999999999982236431605997495353221893310546875,
1.8000000000000000444089209850062616169452667236328125]

I'll leave it as an exercise to work out the exactly representable value
interval limits for 0.005 and 1.8+-0.005.
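
For anyone who wants to check such limits mechanically, the following Python 3 sketch finds the two adjacent doubles that enclose a decimal string. It assumes Python 3.9 or later for `math.nextafter`; the function name `enclosing_doubles` is hypothetical. `Decimal(float)` converts a double exactly, so the comparison against the decimal string is exact.

```python
import math
from decimal import Decimal


def enclosing_doubles(text):
    """Return (lo, hi): adjacent doubles, as exact Decimals, enclosing text."""
    exact = Decimal(text)
    nearest = float(text)            # correctly rounded nearest double
    if Decimal(nearest) >= exact:
        # Nearest double is at or above the value: step down for the lower end.
        lo, hi = math.nextafter(nearest, -math.inf), nearest
    else:
        # Nearest double is below the value: step up for the upper end.
        lo, hi = nearest, math.nextafter(nearest, math.inf)
    return Decimal(lo), Decimal(hi)


lo, hi = enclosing_doubles("1.8")
print(lo)
print(hi)
```

The same call with "0.005" gives the limits for the error term.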

The _meaning_ of numbers that are guaranteed to fall into known exact intervals
in terms of representing measurements, measurement errors, statistics of the
errors, etc. is a separate matter from keeping track of exact intervals during
computation. These concerns should not be confused, IMO, though they inevitably
arise together in thinking about computing with real-life measurement values.

Regards,
Bengt Richter

Anton Vredegoor

 02-06-2004
On 6 Feb 2004 17:03:57 GMT, (E-Mail Removed) (Bengt Richter) wrote:

>The _meaning_ of numbers that are guaranteed to fall into known exact intervals
>in terms of representing measurements, measurement errors, statistics of the
>errors, etc. is a separate matter from keeping track of exact intervals during
>computation. These concerns should not be confused, IMO, though they inevitably
>arise together in thinking about computing with real-life measurement values.

(Warning, naive hobbyist input, practicality: undefined)

One possible option would be to provide for some kind of random
rounding routine for some of the least significant bits of a floating
point value. The advantage would be that this would also be usable for
DSP-like computations that are used in music programming (volume
adjustments) or in digital video (image rotation).

I agree with the idea that exact interval tracking is important, but
perhaps this exact interval tracking should be used only during
testing and development of the code.

It could be that it would be possible to produce code with a fixed
number of least significant bits that are randomly rounded each time
some specific operation makes this necessary (not *all* computations!)
and that the floating point data would stay accurate enough for long
enough to be useable in 99.9 percent of the use cases.

Maybe we need a DSP-float instead of a decimal data type? Decimals
could be used for testing DSP-float implementations.

Anton

Tim Roberts

 02-08-2004
(E-Mail Removed) (Anton Vredegoor) wrote:
>
>One possible option would be to provide for some kind of random
>rounding routine for some of the least significant bits of a floating
>point value.

answer with 15 decimal places, but now you have non-determinism. The real
answer, I think, is getting people to understand how much of their
real-world measurements are garbage.

>The advantage would be that this would also be usable for
>DSP-like computations that are used in music programming (volume
>adjustments) or in digital video (image rotation).

Interesting. I know you were kind of talking off the top of your head, but
can you tell me what leads you to thinking that some low-order randomness
would be helpful in those particular applications?

>Maybe we need a DSP-float instead of a decimal data type? Decimals
>could be used for testing DSP-float implementations.

Can you describe what you mean by DSP-float? I'm not sure why a DSP should
treat floats any differently than an ordinary processor.
--
- Tim Roberts, (E-Mail Removed)
Providenza & Boekelheide, Inc.

Anton Vredegoor

 02-09-2004
Tim Roberts <(E-Mail Removed)> wrote:

>(E-Mail Removed) (Anton Vredegoor) wrote:
>>
>>One possible option would be to provide for some kind of random
>>rounding routine for some of the least significant bits of a floating
>>point value.

>
>answer with 15 decimal places, but now you have non-determinism. The real
>answer, I think, is getting people to understand how much of their
>real-world measurements are garbage.

Yes, but this is not a simple matter. There is some kind of order long
after strict methods become unwieldy. An intelligent rounding scheme
could harness some of this partial order to keep the computations more
accurate over a wider range of manipulations on real world data.

I'm providing some code below to show that there is order beyond
determinism. It's not very helpful in an explicit way, but it should
serve to prove the point for anyone willing to look at it for long
enough, check the code for some exact deterministic explanation, and
find themselves unable to formalize it.

Also it's not bad to look at even for those not wanting to
investigate, so it might help to prevent possible tension in this
discussion a bit.

>>The advantage would be that this would also be usable for
>>DSP-like computations that are used in music programming (volume
>>adjustments) or in digital video (image rotation).

>
>Interesting. I know you were kind of talking off the top of your head, but
>can you tell me what leads you to thinking that some low-order randomness
>would be helpful in those particular applications?

There are high end digital mixers that use some kind of random
rounding to the least significant bits of their sample data in order
to make the sounds "survive" more manipulations before the effect of
the manipulations becomes audible.

In digital video with image rotation there is the problem of
determining where an object exactly is after it is rotated, because
all of its coordinate points have been rounded. A statistical approach
seems to work well here.

On a more cosmic scale the universe seems to use the same trick of
indeterminism, at least according to quantum theory and the Heisenberg
uncertainty principle. Some think that because of that the universe
itself must be a computer simulation. I guess I'd better stop here
before someone mentions Douglas Adams ...

>>Maybe we need a DSP-float instead of a decimal data type? Decimals
>>could be used for testing DSP-float implementations.

>
>Can you describe what you mean by DSP-float? I'm not sure why a DSP should
>treat floats any differently than an ordinary processor.

You are right, a DSP is just like an ordinary processor, except that
it is specialized for digital signal processing operations. I guess I
got a bit carried away by thinking about a datatype that has builtin
random rounding for the least significant bits. For example by using
the Mersenne Twister random generator, it could compute a lot of
rounding bytes at once and just use them up as needed. This way it
would not slow down the computations too much.
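
One way to read "random rounding" is what is usually called stochastic rounding: round up with a probability equal to the fractional part, so that the expected result equals the input. The Python 3 sketch below is an illustration under that assumption, not a proposal for an actual data type; `stochastic_round` and its `step` parameter are hypothetical names. Python's `random` module is itself based on the Mersenne Twister.

```python
import math
import random


def stochastic_round(x, step=1.0, rng=random):
    """Round x to a multiple of step, going up with probability equal
    to the fractional position of x between the two neighbours."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower            # in [0, 1)
    if rng.random() < frac:
        lower += 1
    return lower * step


rng = random.Random(12345)           # seeded only to make the demo repeatable
samples = [stochastic_round(1.3, 1.0, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
print(mean)                          # close to 1.3, although each sample is 1.0 or 2.0
```

Each individual result is less accurate than round-to-nearest, but the rounding errors do not accumulate in one direction over many operations.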

Anton

from __future__ import division
from Tkinter import *
from random import random,choice

class Scaler:
    # Maps world coordinates onto viewport (canvas) coordinates.

    def __init__(self, world, viewport):
        (a,b,c,d), (e,f,g,h) = world, viewport
        xf,yf = self.xf,self.yf = (g-e)/(c-a),(h-f)/(d-b)
        wxc,wyc = (a+c)/2, (b+d)/2
        vxc,vyc = (e+g)/2, (f+h)/2
        self.xc,self.yc = vxc-xf*wxc,vyc-yf*wyc

    def scalepoint(self, a, b):
        xf,yf,xc,yc = self.xf,self.yf,self.xc,self.yc
        return xf*a+xc,yf*b+yc

    def scalerect(self, a, b, c, d):
        xf,yf,xc,yc = self.xf,self.yf,self.xc,self.yc
        return xf*a+xc,yf*b+yc,xf*c+xc,yf*d+yc

class RandomDot:

    def __init__(self, master, n):
        self.master = master
        self.n = n
        self.world = (0,0,1,1)
        c = self.canvas = Canvas(master, bg = 'black',
            width = 380, height = 380)
        c.pack(fill = BOTH, expand = YES)
        master.bind("<Configure>", self.configure)
        master.bind("<Escape>", lambda
            event ='ignored', m=master: m.destroy())
        self.canvas.bind("<Button-1>", self.click)
        # Each color pulls its dots toward one corner: the x and y
        # coordinates are the min or max of n random numbers.
        self.colorfuncs = {'red':(min,min),'green':(min,max),
            'blue':(max,min), 'white':(max,max)}
        self.polling = False

    def poll(self):
        # Move one dot every 10 ms.
        self.wriggle()
        self.master.after(10, self.poll)

    def click(self, event):
        self.draw()

    def configure(self,event):
        self.scale = Scaler(self.world, self.getviewport())
        self.draw()
        if not self.polling:
            self.polling = True
            self.poll()

    def draw(self):
        # Scatter 1000 dots uniformly, with random colors.
        c,sp = self.canvas,self.scale.scalepoint
        c.delete('all')
        funcs = self.colorfuncs
        colors = funcs.keys()
        for i in xrange(1000):
            color = choice(colors)
            a,b = sp(random(), random())
            c.create_oval(a,b,a+5,b+5,fill=color,
                outline = '')

    def wriggle(self):
        # Pick a random dot and move it according to its color's functions.
        c,sp = self.canvas,self.scale.scalepoint
        funcs = self.colorfuncs
        x = choice(c.find_all())
        color = c.itemcget(x,"fill")
        f1,f2 = funcs[color]
        a = f1([random() for i in xrange(self.n)])
        b = f2([random() for i in xrange(self.n)])
        a,b = sp(a,b)
        c.coords(x,a,b,a+5,b+5)

    def getviewport(self):
        c = self.canvas
        return (0, 0, c.winfo_width(),c.winfo_height())

if __name__=='__main__':
    root = Tk()
    root.title('randomdot')
    app = RandomDot(root,3)
    root.mainloop()

Bengt Richter

 02-09-2004
On Fri, 06 Feb 2004 20:25:21 +0100, (E-Mail Removed) (Anton Vredegoor) wrote:

>On 6 Feb 2004 17:03:57 GMT, (E-Mail Removed) (Bengt Richter) wrote:
>
>>The _meaning_ of numbers that are guaranteed to fall into known exact intervals
>>in terms of representing measurements, measurement errors, statistics of the
>>errors, etc. is a separate matter from keeping track of exact intervals during
>>computation. These concerns should not be confused, IMO, though they inevitably
>>arise together in thinking about computing with real-life measurement values.

>
>(Warning, naive hobbyist input, practicality: undefined)
>
>One possible option would be to provide for some kind of random
>rounding routine for some of the least significant bits of a floating
>point value. The advantage would be that this would also be usable for
>DSP-like computations that are used in music programming (volume
>adjustments) or in digital video (image rotation).
>

I can't spend a lot of time on this right now, but this reminds me of
a time when I tried (successfully IMO) to explain why feeding a simulation
system with very low noise data got more accurate results than feeding it
exact data.

The reason has to do with quantization (which was part of the system being
simulated, and which could be fed with highly accurate world-sim values plus
noise). I.e., measurements are always represented digitally with some least
significant bit representing some defined amount of a measured quantity.
This means measurement information below that is lost (or at least one bit
below that, depending on the device).

The result is that a statistical mean (or other integrating process) of samples
will not be affected by the bits lost in quantizing. In the case of feeding a
simulator with accurate values multiple times, this results in the identical
biased quantized values, whereas if you add a small amount of noise, you will
get a few neighboring quantized values in some proportion, and the mean will
be a better estimate of the true (unquantized) value than a mean of quantized
values with no noise -- where all the quantized values are exactly equal and
all biased. The effect can be amplified if the input is feeding a sensitive
calculation such as the inversion of a near-singular matrix, and can make the
difference between usable and useless results.

An example using int as the quantization function:

>>> import random
>>> def simval(val, noise=1.0):
...     return val + noise*random.random()
...
>>> def simulator(val, noise, trials=1000):
...     return sum([int(simval(val, noise)) for i in xrange(trials)])/float(trials)
...
>>> for i in xrange(10): print simulator(1.3, 0.0),
...
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>> for i in xrange(10): print simulator(1.3, 1.0),
...
1.295 1.293 1.284 1.307 1.3 1.292 1.322 1.291 1.322 1.315

I suspect that the ear integrates/averages some when presented with 44.1k samples/sec,
so if uniform noise is added in below the quantization lsb of a CD, that may enhance
the perceived output sound, but some audiophile can provide the straight scoop on that.

>I agree with the idea that exact interval tracking is important, but
>perhaps this exact interval tracking should be used only during
>testing and development of the code.
>
>It could be that it would be possible to produce code with a fixed
>number of least significant bits that are randomly rounded each time
>some specific operation makes this necessary (not *all* computations!)
>and that the floating point data would stay accurate enough for long
>enough to be useable in 99.9 percent of the use cases.
>

I think you have to be careful when you do your rounding, and note
the effect on values vs populations of values and how that feeds the
next stage of processing or use.

>Maybe we need a DSP-float instead of a decimal data type? Decimals
>could be used for testing DSP-float implementations.
>

I'm not sure what DSP-float really means yet.
HTH, gotta go.

Regards,
Bengt Richter