Velocity Reviews - Computer Hardware Reviews

Riddle me this

 
 
Sharp Tool
Guest
Posts: n/a
 
      11-06-2005
Hi

Consider this list of numbers:

12.0
5.0
1.0
-0.1
-2.1
-124.0

What algorithm should I use to remove large negative values such as -124.0?
How do I determine a cutoff value that is statistically meaningful?

So far I have:

cutoff = smallest positive - smallest difference in negative pairs
= 1.0 - (2.1 - 0.1)
= 1.0 - 2.0
= -1.0

The problem is that this would eliminate -2.1!

Help appreciated.
Sharp Tool

 
 
 
 
 
Roedy Green
Guest
Posts: n/a
 
      11-06-2005
On Sun, 06 Nov 2005 08:46:17 GMT, "Sharp Tool"
<(E-Mail Removed)> wrote, quoted or indirectly quoted someone
who said :

>what algorithm to use to remove large negative values such as -124.0?
>how to determine a cutoff value that is statistically meaningful?


That is not usually a statistical question but a plausibility
question. If you are scanning data for temperatures of Honolulu you
would look at history, give yourself a safety factor, and chop below
and above a given range.

Readings for human temperatures would have a narrower range unless you
included corpses.
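
As a rough illustration of the range check described above, here is a minimal Java sketch. The class name PlausibilityFilter, the method keepInRange, and the bounds of -50 and 50 are all made up for this example; real bounds would come from what is known about the data, not from the data itself.

```java
import java.util.ArrayList;
import java.util.List;

final class PlausibilityFilter {

    /** Keeps only the readings inside [min, max]; both bounds are chosen by the caller. */
    static List<Double> keepInRange(List<Double> readings, double min, double max) {
        List<Double> kept = new ArrayList<>();
        for (double r : readings) {
            if (r >= min && r <= max) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> data = List.of(12.0, 5.0, 1.0, -0.1, -2.1, -124.0);
        // Hypothetical bounds: assume domain knowledge says -50..50 is plausible.
        System.out.println(keepInRange(data, -50.0, 50.0)); // [12.0, 5.0, 1.0, -0.1, -2.1]
    }
}
```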

If your numbers fit a normal, bell-shaped curve, you can compute the
mean and standard deviation. Then you could throw out numbers more
than n standard deviations from the mean.
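
The mean-and-deviation rule could be sketched in Java roughly as follows. DeviationFilter and dropBeyond are hypothetical names, the sample (n - 1) standard deviation is an assumed choice, and the rule is only as trustworthy as the bell-curve assumption behind it.

```java
import java.util.ArrayList;
import java.util.List;

final class DeviationFilter {

    /** Drops values more than n sample standard deviations from the mean. */
    static List<Double> dropBeyond(List<Double> values, double n) {
        double sum = 0.0;
        for (double v : values) sum += v;
        double mean = sum / values.size();

        double squares = 0.0;
        for (double v : values) squares += (v - mean) * (v - mean);
        // Sample standard deviation (assumed choice); needs at least two values.
        double sd = Math.sqrt(squares / (values.size() - 1));

        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            if (Math.abs(v - mean) <= n * sd) kept.add(v);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> data = List.of(12.0, 5.0, 1.0, -0.1, -2.1, -124.0);
        System.out.println(dropBeyond(data, 2.0)); // -124.0 is dropped
        System.out.println(dropBeyond(data, 3.0)); // nothing is dropped
    }
}
```

On the six numbers from the original post, -124.0 lies about 2.03 sample standard deviations below the mean, so n = 2 removes it and n = 3 removes nothing. With so few values the outlier itself drags the mean down and inflates the deviation, which is worth keeping in mind when picking n.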




--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
 
 
 
 
 
Thomas Hawtin
Guest
Posts: n/a
 
      11-06-2005
Sharp Tool wrote:
>
> what algorithm to use to remove large negative values such as -124.0?
> how to determine a cutoff value that is statistically meaningful?


This newsgroup probably isn't the best place to find statisticians
(although I guess there are a few).

You could google for "outliers" or similar. "Grubbs' Test for Outliers"
seems like a step in the right direction.

Tom Hawtin
--
Unemployed English Java programmer
http://jroller.com/page/tackline/
 
 
SDB
Guest
Posts: n/a
 
      11-06-2005
"Sharp Tool" <(E-Mail Removed)> wrote in message
news:tpjbf.9940$(E-Mail Removed)...

: Consider this list of numbers:
:
: 12.0
: 5.0
: 1.0
: -0.1
: -2.1
: -124.0

: what algorithm to use to remove large negative values such as -124.0?
: how to determine a cutoff value that is statistically meaningful?

: So far I have:

: cutoff = smallest positive - smallest difference in negative pairs
: = 1.0 - (2.1 - 0.1)
: = 1.0 - 2.0
: = -1.0

How sophisticated do you need to be? Consider using the absolute value so
you don't need to worry about positive or negative numbers.

If the numbers you gave are just an example and the problem you are trying
to solve is more generic, look at a statistic called the 'Z-Score', also
sometimes called the 'Z-Value'. It is computed by subtracting the mean from
the number, then dividing the result by the standard deviation of the set.
You can throw out values outside a range of Z-scores.

From your set, the mean is -18.03 and the sample standard deviation is 52.15.

The z-Score of the second one, 5.0, is about 0.44.
The z-Score of the last one, -124, is about -2.03.

In stats, the z-Score is your friend.
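
To make the arithmetic concrete, here is a small Java sketch that prints the signed z-score of each number in the list. The class and method names are invented for illustration, and it assumes the sample standard deviation (dividing by n - 1), which is what gives 52.15 for this set.

```java
final class ZScores {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1); needs at least two values. */
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double ss = 0.0;
        for (double x : xs) ss += (x - m) * (x - m);
        return Math.sqrt(ss / (xs.length - 1));
    }

    /** Signed z-score: a negative result means the value is below the mean. */
    static double zScore(double x, double mean, double sd) {
        return (x - mean) / sd;
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        double m = mean(data);
        double sd = sampleStdDev(data);
        System.out.printf("mean = %.2f, sd = %.2f%n", m, sd);
        for (double x : data) {
            System.out.printf("%8.1f  z = %6.2f%n", x, zScore(x, m, sd));
        }
    }
}
```

Because the score keeps its sign, a one-sided rule such as "drop anything with z below -2" leaves every value above the mean alone.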




 
 
Sharp Tool
Guest
Posts: n/a
 
      11-07-2005

>> Sharp Tool wrote:
> >
> > what algorithm to use to remove large negative values such as -124.0?
> > how to determine a cutoff value that is statistically meaningful?

>
> This newsgroup probably isn't the best place to find statisticians
> (although I guess there are a few).
>
> You could google for "outliers" or similar. "Grubbs' Test for Outliers"
> seems like a step in the right direction.
>
> Tom Hawtin


Grubbs' test is only suitable for data that has a normal distribution, and
mine does not.

Cheers
Sharp



 
 
Sharp Tool
Guest
Posts: n/a
 
      11-07-2005

"SDB" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "Sharp Tool" <(E-Mail Removed)> wrote in message
> news:tpjbf.9940$(E-Mail Removed)...
>
> : Consider this list of numbers:
> :
> : 12.0
> : 5.0
> : 1.0
> : -0.1
> : -2.1
> : -124.0
>
> : what algorithm to use to remove large negative values such as -124.0?
> : how to determine a cutoff value that is statistically meaningful?
>
> : So far I have:
>
> : cutoff = smallest positive - smallest difference in negative pairs
> : = 1.0 - (2.1 - 0.1)
> : = 1.0 - 2.0
> : = -1.0
>
> How sophisticated do you need to be? Consider using the absolute value so
> you don't need to worry about positive or negative numbers.
>
> If the numbers you gave are just an example and the problem you are trying
> to solve is more generic, look at a statistic called the 'Z-Score', also
> sometimes called the 'Z-Value'. It is computed by subtracting the mean from
> the number, then dividing the result by the standard deviation of the set.
> You can throw out values outside a range of Z-scores.
>
> From your set, the mean is -18.03 and the sample standard deviation is 52.15.
>
> The z-Score of the second one, 5.0, is about 0.44.
> The z-Score of the last one, -124, is about -2.03.
>
> In stats, the z-Score is your friend.


My data does not fit a normal distribution.
I do not want to eliminate any positive values.
I only want to eliminate large negative values.
Z-scores work only with absolute values.
So what's the best way to go now? I'm not a statistician.

Cheers
Sharp Tool



 
 
Roedy Green
Guest
Posts: n/a
 
      11-07-2005
On Mon, 07 Nov 2005 08:42:24 GMT, "Sharp Tool"
<(E-Mail Removed)> wrote, quoted or indirectly quoted someone
who said :

>My data does not fit a normal distribution.
>I do not want to eliminate any positive values.
>I only want to eliminate large negative values.
>Z-scores work only with absolute values.
>So what's the best way to go now? I'm not a statistician.


What distribution do they conform to?
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
 
 
Andrew Thompson
Guest
Posts: n/a
 
      11-07-2005
Sharp Tool wrote:

> My data does not fit a normal distribution.


What distribution/pattern/logic does it fit, because..

> I only want to eliminate large negative values.


...knowing that will get us a lot closer to defining
(pinning down, and putting a value to) 'large'.

Beyond the hypothetical though, does this describe
an actual problem, or is it purely a mental exercise?
 
 
Sharp Tool
Guest
Posts: n/a
 
      11-07-2005
> Sharp Tool wrote:
>
> > My data does not fit a normal distribution.

>
> What distribution/pattern/logic does it fit, because..
>
> > I only want to eliminate large negative values.

>
> ...knowing that will get us a lot closer to defining
> (pinning down, and putting a value to) 'large'.


A large value is one that is an obvious outlier.
I only want to eliminate large negative values.
By eyeballing the list of numbers, you can see that -124.0
doesn't 'fit in'. I am wondering whether there is a statistical method for this.

> Beyond the hypothetical though, does this describe
> an actual problem, or is it purely a mental exercise?


A mental exercise, but I think it could be useful for removing
negative outliers.
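
For data that is not bell-shaped, one option nobody in the thread mentioned is a quartile-based lower fence (Tukey's rule): drop anything below Q1 - 1.5 * IQR. The sketch below is only an illustration under stated assumptions: the names LowerFence and lowerFence are made up, the multiplier 1.5 is the conventional default rather than anything derived from this data, quartiles are taken as medians of the lower and upper halves (other quartile conventions give slightly different fences), and the input needs at least two values.

```java
import java.util.Arrays;

final class LowerFence {

    /** Median of the sorted slice [from, to); the slice must not be empty. */
    static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Lower fence Q1 - k * IQR, with quartiles taken as medians of the two halves. */
    static double lowerFence(double[] values, double k) {
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;
        double q1 = median(s, 0, n / 2);       // lower half (middle value excluded when n is odd)
        double q3 = median(s, (n + 1) / 2, n); // upper half
        return q1 - k * (q3 - q1);
    }

    public static void main(String[] args) {
        double[] data = {12.0, 5.0, 1.0, -0.1, -2.1, -124.0};
        double fence = lowerFence(data, 1.5);
        for (double x : data) {
            if (x >= fence) System.out.println(x); // only low values are ever dropped
        }
    }
}
```

For the list in the original post this gives Q1 = -2.1, Q3 = 5.0 and a fence of -12.75, so only -124.0 is dropped, and a rule of this shape never removes a value at or above Q1.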

Sharp Tool




 
 
Sharp Tool
Guest
Posts: n/a
 
      11-07-2005
> <(E-Mail Removed)> wrote, quoted or indirectly quoted someone
> who said :
>
> >My data does not fit a normal distribution.
> >I do not want to eliminate any positive values.
> >I only want to eliminate large negative values.
> >Z-scores work only with absolute values.
> >So what's the best way to go now? I'm not a statistician.

>
> What distribution do they conform to?


Random, I believe.

Sharp Tool



 