Skewed Random Number

 
 
shuchalle@hotmail.com
 
      09-19-2005
Hello,

I am writing a program in Java and have the following requirements.

We have a large set of data points whose values range from 100 to 1500.

We need to select 10% of the data points at random. So if there are
40000 data points, we need to select 4000 of them at random.

Now you say: well, that's easy. Well, here is the twist.

We need to "skew" the randomness so that more points are selected
toward the high end (near 1500) and fewer toward the low end (near
100). All in all, though, 10% of the data points (4000 out of 40000)
should still be selected.

We could use some sort of "logarithmic skewage", if there is such a
word.

Any clever ideas or hints would be much appreciated!

Regards,

AZXML

 
 
 
 
 
Oliver Wong
 
      09-19-2005

<(E-Mail Removed)> wrote:
> Hello,
>
> I am writing a program in Java. I have following requirements.
>
> We have large data set points whose value will range from 100 to 1500.
>
> We need to select 10% of dataset points randomly. So if there were
> 40000 data points - we need to select 4000 points on random basis.
>
> Now you say - well that's easy. Well - here is the twist.
>
> We need to "skew" the randomness so that more points are selected
> towards higher number as in near to 1500 and less points are selected
> toward lower end of spectrum that is 100. But all in all -still 10% (or
> 4000 out of 40000 dataset points) of total points out of data points
> should be selected.
>
> We can use some sort of "logarithmic skewage" - if there is such a
> word.
>
> Any clever ideas or hints would be much appreciated!


Umm... what's the problem exactly? You seem to be under the assumption
that all random distributions are uniform; that's not the case.

I don't know what kind of distribution you want, but the Poisson
distribution, the Beta distribution with A=1;B=3, the Gamma distribution
with (k=1;theta=2), the exponential distribution, and many others all
have the property that one end of the spectrum is more likely than the
other.
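
As one illustration of sampling from such a distribution, here is a minimal sketch that draws from an exponential distribution truncated to the unit interval by inverting its CDF, then mirrors the result so the crowded end is the high one. The class name, the method name and the rate of 3.0 are assumptions made for the example, not anything from the original question; the 100 and 1500 bounds come from it.

```java
import java.util.Random;

// Hypothetical helper: inverse-transform sampling of an exponential
// distribution truncated to [0,1], mirrored so that the dense end is the top.
final class TruncatedExpSkew {
    static double next(Random rng, double rate) {
        double u = rng.nextDouble();                 // uniform in [0,1)
        // Inverse CDF of the truncated exponential: dense near 0, sparse near 1.
        double x = -Math.log(1.0 - u * (1.0 - Math.exp(-rate))) / rate;
        double mirrored = 1.0 - x;                   // dense near 1 instead
        return 100.0 + 1400.0 * mirrored;            // mapped onto [100,1500]
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(next(rng, 3.0));      // rate 3.0 is an arbitrary choice
        }
    }
}
```

A larger rate crowds the values more tightly against 1500; a rate close to 0 approaches a uniform distribution.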

Why don't you take a look at
http://en.wikipedia.org/wiki/Categor..._distributions

- Oliver


 
 
 
 
 
Thomas Fritsch
 
      09-19-2005
<(E-Mail Removed)> wrote:
> We have large data set points whose value will range from 100 to 1500.
>
> We need to select 10% of dataset points randomly. So if there were
> 40000 data points - we need to select 4000 points on random basis.
>
> Now you say - well that's easy. Well - here is the twist.
>
> We need to "skew" the randomness so that more points are selected
> towards higher number as in near to 1500 and less points are selected
> toward lower end of spectrum that is 100. But all in all -still 10% (or
> 4000 out of 40000 dataset points) of total points out of data points
> should be selected.
>
> We can use some sort of "logarithmic skewage" - if there is such a
> word.
>
> Any clever ideas or hints would be much appreciated!

A simple method for generating a random number that favors large values a
bit could be:
    double x = Math.random(); // uniformly distributed in [0,1)
    x = Math.pow(x, 0.9);     // skewed toward 1, still in [0,1)
    x = 1400 * x + 100;       // skewed toward 1500, in [100,1500)
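
Note that the snippet above generates new skewed numbers rather than choosing among existing data points. To pick exactly 10% of a given data set with a bias toward high values, one option is weighted sampling without replacement. Below is a minimal sketch, assuming each point's weight is simply its value (any increasing function of the value would also work). The class and method names are made up for illustration. It gives every point the key u^(1/w), with u uniform and w the weight, and keeps the k points with the largest keys (the Efraimidis-Spirakis scheme); the code compares log(u)/w instead, which orders the points identically.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper: selects exactly k distinct points, favouring large values.
final class SkewedSelection {
    static int[] select(double[] values, int k, Random rng) {
        int n = values.length;
        double[] keys = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            double u = 1.0 - rng.nextDouble();   // uniform in (0,1], avoids log(0)
            double weight = values[i];           // assumption: weight = value
            keys[i] = Math.log(u) / weight;      // same ordering as u^(1/weight)
            order[i] = i;
        }
        // Sort indices by key, largest key first.
        Arrays.sort(order, (a, b) -> Double.compare(keys[b], keys[a]));
        int[] chosen = new int[k];
        for (int j = 0; j < k; j++) {
            chosen[j] = order[j];
        }
        return chosen;                           // indices into values
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double[] values = new double[40000];
        for (int i = 0; i < values.length; i++) {
            values[i] = 100 + 1400 * rng.nextDouble();   // made-up demo data
        }
        int[] chosen = select(values, values.length / 10, rng);
        double sum = 0;
        for (int i : chosen) {
            sum += values[i];
        }
        System.out.println("selected " + chosen.length + " points, mean " + sum / chosen.length);
    }
}
```

Every point can still be chosen, but a point worth 1500 is considerably more likely to make the cut than one worth 100, and the sample size is always exactly k.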

--
"TFritsch$t-online:de".replace(':','.').replace('$','@')


 
 
Roedy Green
 
      09-19-2005
On 19 Sep 2005 08:56:11 -0700, (E-Mail Removed) wrote or quoted:

>Any clever ideas or hints would be much appreciated!


You need a course in elementary probability and statistics.

Here are some hints.

See http://mindprod.com/jgloss/randomnumbers.htmls

Here is how nextGaussian works to produce a normal (bell-shaped curve)
distribution:

    synchronized public double nextGaussian() {
        if (haveNextNextGaussian) {
            // a value left over from the previous call is available
            haveNextNextGaussian = false;
            return nextNextGaussian;
        } else {
            double v1, v2, s;
            do {
                v1 = 2 * nextDouble() - 1; // between -1.0 and 1.0
                v2 = 2 * nextDouble() - 1; // between -1.0 and 1.0
                s = v1 * v1 + v2 * v2;
            } while (s >= 1 || s == 0);
            double multiplier = Math.sqrt(-2 * Math.log(s)/s);
            nextNextGaussian = v2 * multiplier;
            haveNextNextGaussian = true;
            return v1 * multiplier;
        }
    }

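It works by drawing pairs of uniform random doubles until one pair falls
inside the unit circle, then transforming that pair into two independent
Gaussian values (the polar method). One is returned and the other is
cached for the next call.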

Another common distribution is called Poisson.

You need to be more precise about just how the elements are skewed
more toward the high end before you can come up with a formula to skew
them.

Here is the general idea of how you can do this.

1. Scale your random number 0..1 over a more interesting domain of a
function with a simple multiplication.

2. Crank it through some non-linear formula, e.g. x squared, sqrt,
exp, log, log base n, x^n, a polynomial, a Chebyshev polynomial,
a parabola, ... Using exp(x), for example, will result in points
being dense at the low end and sparse at the high end.

3. Scale it back into a suitable range with a multiplication.

Different formulae will give you different skewings. If you don't
have a particular mathematical model you need, just pick a formula
that satisfies you intuitively. Graph the function and the
distribution.
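
The three steps above can be sketched as follows. This is only one possible choice of formula, picked because the original question mentions a logarithmic skew: log is concave, so it crowds points toward the high end, the opposite of exp. The class name, the method name and the base parameter are assumptions made for the example.

```java
import java.util.Random;

// Hypothetical helper following the scale / non-linear formula / rescale recipe.
final class LogSkew {
    static double next(Random rng, double lo, double hi, double base) {
        // Step 1: scale the uniform number from [0,1) onto [1,base).
        double t = 1.0 + rng.nextDouble() * (base - 1.0);
        // Step 2: non-linear formula; log base "base" maps [1,base) onto [0,1).
        double y = Math.log(t) / Math.log(base);
        // Step 3: scale back into the wanted range.
        return lo + (hi - lo) * y;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(next(rng, 100.0, 1500.0, 10.0)); // base 10 is arbitrary
        }
    }
}
```

A larger base skews harder toward the high end; a base just above 1 is close to uniform. Plotting a histogram of a few thousand draws is a quick way to check that the shape matches what you had in mind.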
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
 