
# Riddle me this

Sharp Tool

 11-06-2005
Hi

Consider this list of numbers:

12.0
5.0
1.0
-0.1
-2.1
-124.0

What algorithm should I use to remove large negative values such as -124.0?
How do I determine a cutoff value that is statistically meaningful?

So far I have:

cutoff = smallest positive - smallest difference in negative pairs
= 1.0 - (2.1 - 0.1)
= 1.0 - 2.0
= -1.0

The problem is that this would also eliminate -2.1!
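
As a rough illustration, the calculation above can be sketched in Java. The class and method names are made up for this example, and the code assumes the list contains at least one positive value and at least two negative values.

```java
import java.util.Arrays;

// Sketch of the cutoff heuristic described above; names are hypothetical.
class HeuristicCutoff {

    // cutoff = smallest positive value - smallest gap between negative values
    static double cutoff(double[] values) {
        double smallestPositive = Double.POSITIVE_INFINITY;
        for (double v : values) {
            if (v > 0 && v < smallestPositive) {
                smallestPositive = v;
            }
        }

        // After sorting, the smallest pairwise gap is always between neighbours.
        double[] negatives = Arrays.stream(values).filter(v -> v < 0).sorted().toArray();
        double smallestGap = Double.POSITIVE_INFINITY;
        for (int i = 1; i < negatives.length; i++) {
            smallestGap = Math.min(smallestGap, negatives[i] - negatives[i - 1]);
        }
        return smallestPositive - smallestGap;
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        double limit = cutoff(data);
        System.out.println("cutoff = " + limit);
        for (double v : data) {
            if (v < limit) {
                System.out.println("would be removed: " + v);
            }
        }
    }
}
```

Running it on the list above reports both -2.1 and -124.0 as removed, which is exactly the unwanted behaviour.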

Help appreciated.
Sharp Tool

Roedy Green

 11-06-2005
On Sun, 06 Nov 2005 08:46:17 GMT, "Sharp Tool"
<(E-Mail Removed)> wrote, quoted or indirectly quoted someone
who said :

>what algorithm to use to remove large negative values such as -124.0?
>how to determine a cutoff value that is statistically meaningful?

That is not usually a statistical question but a plausibility
question. If you are scanning temperature data for Honolulu, you
would look at the history, give yourself a safety factor, and chop
anything below or above a given range.

Readings for human temperatures would have a narrower range unless you
included corpses.
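
A minimal sketch of such a plausibility check in Java might look like the following. The class name, method name, and the bounds of -50 and 50 are placeholders for illustration only; real bounds would come from knowledge of the data.

```java
import java.util.Arrays;

// Hypothetical helper: keeps only readings inside a plausible range.
class PlausibilityFilter {

    // min and max are chosen from domain knowledge, not computed from the data.
    static double[] keepPlausible(double[] readings, double min, double max) {
        return Arrays.stream(readings)
                .filter(r -> r >= min && r <= max)
                .toArray();
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        // -50 and 50 are made-up bounds for this example.
        System.out.println(Arrays.toString(keepPlausible(data, -50.0, 50.0)));
    }
}
```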

If your numbers fit a normal, bell-shaped curve, you can compute the
mean and standard deviation. Then you could throw out numbers more
than n standard deviations from the mean.
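
A sketch of that idea, restricted to the low side so that no large positive value is ever dropped, could look like this. The names are hypothetical, and the choice of the sample standard deviation (dividing by n - 1) is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper: drops values more than n standard deviations below the mean.
class DeviationFilter {

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // Only the low tail is trimmed; high values are always kept.
    static double[] dropLowOutliers(double[] xs, double n) {
        double threshold = mean(xs) - n * sampleStdDev(xs);
        return Arrays.stream(xs).filter(x -> x >= threshold).toArray();
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        System.out.println(Arrays.toString(dropLowOutliers(data, 2.0)));
    }
}
```

On the six numbers from the original post, n = 2 removes only -124.0, while n = 3 removes nothing, because the outlier itself inflates the standard deviation. That sensitivity is worth keeping in mind with small samples.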

--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Thomas Hawtin

 11-06-2005
Sharp Tool wrote:
>
> what algorithm to use to remove large negative values such as -124.0?
> how to determine a cutoff value that is statistically meaningful?

This newsgroup probably isn't the best place to find statisticians
(although I guess there are a few).

You could google for "outliers" or similar. "Grubbs' Test for Outliers"
seems like a step in the right direction.
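
For orientation, the one-sided Grubbs statistic for the smallest value is G = (mean - min) / s, where s is the sample standard deviation. The sketch below only computes G. The critical value it is compared against depends on the sample size and significance level and has to be looked up in a published table, so it is passed in as a parameter here. The test also assumes roughly normal data. All names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: one-sided Grubbs statistic for the minimum of a sample.
class GrubbsLow {

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // G = (mean - min) / s
    static double statisticForMinimum(double[] xs) {
        double min = Arrays.stream(xs).min().orElse(Double.NaN);
        return (mean(xs) - min) / sampleStdDev(xs);
    }

    // criticalValue must come from a Grubbs table for the given n and alpha.
    static boolean minimumIsOutlier(double[] xs, double criticalValue) {
        return statisticForMinimum(xs) > criticalValue;
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        System.out.println("G = " + statisticForMinimum(data));
    }
}
```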

Tom Hawtin
--
Unemployed English Java programmer
http://jroller.com/page/tackline/

SDB

 11-06-2005
"Sharp Tool" <(E-Mail Removed)> wrote in message
news:tpjbf.9940\$(E-Mail Removed)...

: Consider this list of numbers:
:
: 12.0
: 5.0
: 1.0
: -0.1
: -2.1
: -124.0

: what algorithm to use to remove large negative values such as -124.0?
: how to determine a cutoff value that is statistically meaningful?

: So far I have:

: cutoff = smallest positive - smallest difference in negative pairs
: = 1.0 - (2.1 - 0.1)
: = 1.0 - 2.0
: = -1.0

How sophisticated do you need to be? Consider using the absolute value so
you don't need to worry about positive or negative numbers.

If the numbers you gave are just an example and the problem you are trying
to solve is more generic, look at a statistics value called the 'Z-Score', also
sometimes called the 'Z-Value'. It is computed by subtracting the mean from
the number, then dividing the result by the standard deviation of the set. You can
throw out values outside a range of Z-scores.

From your set, the standard deviation is 52.15.

The z-Score of the second one, 5.0, is about 0.44
The z-Score of the last one, -124, is about -2.03

In stats, the z-Score is your friend.
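
A minimal Java sketch of the z-score calculation follows, assuming the usual definition z = (x - mean) / s with the sample standard deviation (dividing by n - 1), which is the convention that gives roughly 52.15 for this set. Class and method names are made up for the example.

```java
import java.util.Arrays;

// Hypothetical helper: prints the z-score of each value in a small data set.
class ZScores {

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // Signed distance from the mean, measured in standard deviations.
    static double zScore(double x, double mean, double stdDev) {
        return (x - mean) / stdDev;
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        double m = mean(data);
        double sd = sampleStdDev(data);
        System.out.printf("mean = %.4f, sd = %.4f%n", m, sd);
        for (double x : data) {
            System.out.printf("%8.1f  z = %.4f%n", x, zScore(x, m, sd));
        }
    }
}
```

Because the z-score keeps its sign, a filter can test only for strongly negative scores and leave the positive side alone.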

Sharp Tool

 11-07-2005

> Sharp Tool wrote:
> >
> > what algorithm to use to remove large negative values such as -124.0?
> > how to determine a cutoff value that is statistically meaningful?
>
> This newsgroup probably isn't the best place to find statisticians
> (although I guess there are a few).
>
> You could google for "outliers" or similar. "Grubbs' Test for Outliers"
> seems like a step in the right direction.
>
> Tom Hawtin

Grubbs' test is only suitable for data that has a normal distribution; mine
does not.

Cheers
Sharp

Sharp Tool

 11-07-2005

"SDB" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "Sharp Tool" <(E-Mail Removed)> wrote in message
> news:tpjbf.9940\$(E-Mail Removed)...
>
> : Consider this list of numbers:
> :
> : 12.0
> : 5.0
> : 1.0
> : -0.1
> : -2.1
> : -124.0
>
> : what algorithm to use to remove large negative values such as -124.0?
> : how to determine a cutoff value that is statistically meaningful?
>
> : So far I have:
>
> : cutoff = smallest positive - smallest difference in negative pairs
> : = 1.0 - (2.1 - 0.1)
> : = 1.0 - 2.0
> : = -1.0
>
> How sophisticated do you need to be? Consider using the absolute value so
> you don't need to worry about positive or negative numbers.
>
> If the numbers you gave are just an example and the problem you are trying
> to solve is more generic, look at a statistics value called the 'Z-Score', also
> sometimes called the 'Z-Value'. It is computed by subtracting the mean from
> the number, then dividing the result by the standard deviation of the set. You can
> throw out values outside a range of Z-scores.
>
> From your set, the standard deviation is 52.15.
>
> The z-Score of the second one, 5.0, is about 0.44
> The z-Score of the last one, -124, is about -2.03
>
> In stats, the z-Score is your friend.

My data does not fit a normal distribution.
I do not want to eliminate any positive values.
I only want to eliminate large negative values.
Z-scores work only with absolute values.
So what's the best way to go now? I'm not a statistician.
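
One option that does not assume a normal distribution and can be applied to the low side only is a lower Tukey fence: anything below Q1 - k * IQR is dropped, with k = 1.5 as the conventional choice. This technique was not proposed in the thread. The sketch below is a rough illustration with hypothetical names, and it assumes quartiles taken as medians of the lower and upper halves of the sorted data (other quartile conventions give slightly different fences).

```java
import java.util.Arrays;

// Hypothetical helper: one-sided (lower) Tukey fence. Assumes at least two values.
class LowerFence {

    // Median of sorted[from..to), where the array is already sorted.
    static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Q1 and Q3 are the medians of the lower and upper halves.
    static double lowerFence(double[] xs, double k) {
        double[] s = xs.clone();
        Arrays.sort(s);
        int n = s.length;
        double q1 = median(s, 0, n / 2);
        double q3 = median(s, (n + 1) / 2, n);
        return q1 - k * (q3 - q1);
    }

    // Values below the fence are dropped; nothing on the high side is touched.
    static double[] dropBelowFence(double[] xs, double k) {
        double fence = lowerFence(xs, k);
        return Arrays.stream(xs).filter(x -> x >= fence).toArray();
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        System.out.println("fence = " + lowerFence(data, 1.5));
        System.out.println(Arrays.toString(dropBelowFence(data, 1.5)));
    }
}
```

For the six numbers in question this puts the fence at about -12.75, so -124.0 is dropped and -2.1 is kept.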

Cheers
Sharp Tool

Roedy Green

 11-07-2005
On Mon, 07 Nov 2005 08:42:24 GMT, "Sharp Tool"
<(E-Mail Removed)> wrote, quoted or indirectly quoted someone
who said :

>My data does not fit a normal distribution.
>I do not want to eliminate any positive values.
>I only want to eliminate large negative values.
>Z scores work only with absolute values.
>So what's the best way to go now? I'm not a statistician.

What distribution do they conform to?
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Andrew Thompson

 11-07-2005
Sharp Tool wrote:

> My data does not fit a normal distribution.

What distribution/pattern/logic does it fit? Because...

> I only want to eliminate large negative values.

...knowing that will get us a lot closer to defining
(pinning down, and putting a value to) 'large'.

Beyond the hypothetical though, does this describe
an actual problem, or is it purely a mental exercise?

Sharp Tool

 11-07-2005
> Sharp Tool wrote:
>
> > My data does not fit a normal distribution.
>
> What distribution/pattern/logic does it fit? Because...
>
> > I only want to eliminate large negative values.
>
> ...knowing that will get us a lot closer to defining
> (pinning down, and putting a value to) 'large'.

A large value is one that is an obvious outlier.
I only want to eliminate large negative values.
By eyeballing the list of numbers, you can see that -124.0
doesn't 'fit in'. I am wondering if there is a statistical method for this.

> Beyond the hypothetical though, does this describe
> an actual problem, or is it purely a mental exercise?

Mental exercise, but I think it could be useful for removing
negative outliers.
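
For removing negative outliers without relying on the mean, one robust option is a modified z-score built on the median and the median absolute deviation (MAD). This technique was not mentioned in the thread. The sketch below uses hypothetical names; the constant 0.6745 and the threshold of 3.5 are a commonly cited convention rather than anything derived from this data, and the code assumes the MAD is not zero.

```java
import java.util.Arrays;

// Hypothetical helper: one-sided outlier filter using the median and MAD.
class ModifiedZ {

    static double median(double[] xs) {
        double[] s = xs.clone();
        Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    // Median absolute deviation from the median.
    static double mad(double[] xs) {
        double med = median(xs);
        double[] deviations = Arrays.stream(xs).map(x -> Math.abs(x - med)).toArray();
        return median(deviations);
    }

    // Signed modified z-score; undefined if madValue is zero.
    static double modifiedZ(double x, double med, double madValue) {
        return 0.6745 * (x - med) / madValue;
    }

    // Drops only values whose modified z-score is below -threshold.
    static double[] dropLowOutliers(double[] xs, double threshold) {
        double med = median(xs);
        double madValue = mad(xs);
        return Arrays.stream(xs)
                .filter(x -> modifiedZ(x, med, madValue) >= -threshold)
                .toArray();
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        System.out.println(Arrays.toString(dropLowOutliers(data, 3.5)));
    }
}
```

Unlike the mean and standard deviation, the median and MAD are barely moved by a single extreme value, so -124.0 stands out clearly while -2.1 stays well inside the threshold.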

Sharp Tool

Sharp Tool

 11-07-2005
> <(E-Mail Removed)> wrote, quoted or indirectly quoted someone
> who said :
>
> >My data does not fit a normal distribution.
> >I do not want to eliminate any positive values.
> >I only want to eliminate large negative values.
> >Z scores work only with absolute values.
> >So what's the best way to go now? I'm not a statistician.

>
> What distribution do they conform to?

Random, I believe.

Sharp Tool
