Bayesian Formula - probability calculation

Discussion in 'NZ Computing' started by will, Jan 25, 2004.

  1. will

    will Guest

    hi,

    can i please have some pointers on calculating the 'spamicity'?

    i've read a couple of websites, but most of them express it in a
    programming language rather than as a formula.

    eg this from www.paulgraham.com

    (let ((g (* 2 (or (gethash word good) 0)))
          (b (or (gethash word bad) 0)))
      (unless (< (+ g b) 5)
        (max .01
             (min .99 (float (/ (min 1 (/ b nbad))
                                (+ (min 1 (/ g ngood))
                                   (min 1 (/ b nbad)))))))))


    what does it mean in simple english?

    thanks

    will.
     
    will, Jan 25, 2004
    #1

  2. Warwick

    Warwick Guest

    On Sun, 25 Jan 2004 17:34:44 +1300, will wrote:

    > hi,
    >
    > can i please have some pointers to calculate the 'spamicity'?
    >
    > i've read a couple of websites, but most of them are using it in a
    > programming language way, rather than a formulae.
    >
    > eg this from www.paulgraham.com
    >
    > (let ((g (* 2 (or (gethash word good) 0)))
    >       (b (or (gethash word bad) 0)))
    >   (unless (< (+ g b) 5)
    >     (max .01
    >          (min .99 (float (/ (min 1 (/ b nbad))
    >                             (+ (min 1 (/ g ngood))
    >                                (min 1 (/ b nbad)))))))))
    >
    >
    > what does it mean in simple english?
    >
    > thanks
    >
    > will.


    Bayes' formula calculates the new probability of something being true (p')
    given an old probability (p) plus some new evidence. The new evidence is
    expressed as probabilities as well. In the formula these are Py and Pn:
    the probability of seeing the evidence if the hypothesis is true (yes)
    and if it is false (no). All values are percentages.

    The formula is p' = 100 * Py * p / (Py * p + Pn * (100 - p)).

    Worked example: the probability of Mainlander being a miserable bastard is
    estimated at 10% (p).
    Ten new posts then suggest the probability of Mainlander being a miserable
    bastard is 90%.
    Therefore Py = 90 and Pn = 10.

    Our revised probability of the hypothesis is then
    100 * 90 * 10 / (90 * 10 + 10 * (100 - 10)) = 50

    If you had another indicator, you would use p' as the new p and apply the
    formula again.
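
    The update rule above can be sketched in a few lines of Python. This is
    only an illustration: the function name bayes_update and its parameter
    names are made up for this example, and all values are percentages as in
    the worked example.

```python
def bayes_update(p, py, pn):
    """Return the revised probability p' (in percent).

    p  -- prior probability of the hypothesis, in percent
    py -- probability of the evidence if the hypothesis is true, in percent
    pn -- probability of the evidence if the hypothesis is false, in percent
    """
    return 100 * py * p / (py * p + pn * (100 - p))


# The worked example: prior of 10%, evidence with Py = 90 and Pn = 10.
revised = bayes_update(p=10, py=90, pn=10)
print(revised)  # 50.0

# A second, equally strong indicator: feed p' back in as the new p.
print(bayes_update(p=revised, py=90, pn=10))  # 90.0
```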


    HTH

    Warwick
     
    Warwick, Jan 25, 2004
    #2

  3. Adam Warner

    Adam Warner Guest

    Hi will,

    > hi,
    >
    > can i please have some pointers to calculate the 'spamicity'?
    >
    > i've read a couple of websites, but most of them are using it in a
    > programming language way, rather than a formulae.
    >
    > eg this from www.paulgraham.com
    >
    > (let ((g (* 2 (or (gethash word good) 0)))
    >       (b (or (gethash word bad) 0)))
    >   (unless (< (+ g b) 5)
    >     (max .01
    >          (min .99 (float (/ (min 1 (/ b nbad))
    >                             (+ (min 1 (/ g ngood))
    >                                (min 1 (/ b nbad)))))))))
    >
    >
    > what does it mean in simple english?


    Will, that is an extract of Common Lisp. (gethash word good) is looking up
    the word in the group of good words. If it's not found the second part of
    the OR is computed which then returns 0. Ditto for (gethash word bad) except
    the word is looked up in the group of bad words.

    UNLESS means to do something unless the test is true.

    MAX and MIN find the maximum or minimum of a set of numbers.

    FLOAT converts a number into a floating point representation.

    / is division.

    + is addition.

    Everything is in prefix format, e.g. (+ 1 (* 2 3)) = (+ 1 6) = 7. It's like
    function calls where the opening bracket is before the function name instead
    of directly after the function name, e.g. sqrt(x) in many other languages
    would be written as (sqrt x) in Lisp.

    This pseudocode translation may help:

    let good-entry = the value stored for the word in the group of good words, if any
        bad-entry  = the value stored for the word in the group of bad words, if any
        g = 2 * (if good-entry then good-entry else 0)
        b = if bad-entry then bad-entry else 0
    unless (g + b) < 5
        max(0.01,
            min(0.99, float[          min(1, b/nbad)
                             ----------------------------------
                              min(1, g/ngood) + min(1, b/nbad) ]))
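
    For readers more comfortable with Python, the same calculation might look
    like the sketch below. It assumes that good and bad are dictionaries
    mapping each word to a count, and that ngood and nbad are positive totals
    for the good and bad mail. The function name and the example data are
    invented for illustration.

```python
def word_spam_probability(word, good, bad, ngood, nbad):
    """Translate the Lisp fragment: return a value in [0.01, 0.99], or None."""
    g = 2 * good.get(word, 0)   # (* 2 (or (gethash word good) 0))
    b = bad.get(word, 0)        # (or (gethash word bad) 0)
    if g + b < 5:               # (unless (< (+ g b) 5) ...) yields nothing here
        return None
    bad_ratio = min(1.0, b / nbad)
    good_ratio = min(1.0, g / ngood)
    # Clamp the ratio so that it is never exactly 0 or 1.
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


# Hypothetical counts, purely for illustration.
good = {"meeting": 10}
bad = {"viagra": 20, "meeting": 1}

print(word_spam_probability("viagra", good, bad, ngood=100, nbad=100))
print(word_spam_probability("meeting", good, bad, ngood=100, nbad=100))
print(word_spam_probability("rare", good, bad, ngood=100, nbad=100))
```

    A word that appears only in the bad group is capped at 0.99, and a word
    seen fewer than five times in total (after doubling the good count) gets
    no value at all.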

    Regards,
    Adam
     
    Adam Warner, Jan 25, 2004
    #3
  4. will

    will Guest


    > This pseudocode translation may help:
    >
    > let good-entry = is the word in the group of good words?
    > bad-entry = is the word in the group of bad words?
    > g = if good-entry then choose the good-entry else 0
    > b = if bad-entry then choose the bad-entry else 0
    > unless (g+b) < 5
    > max(0.01
    > min(0.99 float[ min(1 b/nbad)
    > --------------------------------
    > min(1 g/ngood) + min(1 b/nbad) ]))




    hmmm.. thanks. so how is the result related to the Bayesian
    formula? is the above code doing the same job as the Bayesian formula?

    thanks

    will.
     
    will, Jan 25, 2004
    #4
  5. Julian Visch

    Julian Visch Guest

    will wrote:

    >>This pseudocode translation may help:
    >>
    >>let good-entry = is the word in the group of good words?
    >> bad-entry = is the word in the group of bad words?
    >> g = if good-entry then choose the good-entry else 0
    >> b = if bad-entry then choose the bad-entry else 0
    >> unless (g+b) < 5
    >> max(0.01
    >> min(0.99 float[ min(1 b/nbad)
    >> --------------------------------
    >> min(1 g/ngood) + min(1 b/nbad) ]))
    >>

    >
    >
    >
    > #### hmmm.. thanks. so how is the result related to the Bayesian
    > formula? is the above code doing the same job as the Bayesian formula?


    P(a|b) = P(b|a) P(a) / P(b)

    is the Bayesian formula in its simplest form; it gets a lot more
    complicated.
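
    One way to see the connection to the code quoted earlier in the thread:
    if a message is assumed to be equally likely to be spam or not (a prior
    of 0.5, which is an assumption of this sketch), the formula above
    collapses to the ratio bad / (good + bad) used in that code, leaving
    aside its doubling of the good count and its 0.01/0.99 limits. The
    numbers below are invented purely to check the arithmetic.

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(a|b) = P(b|a) * P(a) / P(b)."""
    return p_b_given_a * p_a / p_b


# Hypothetical figures: the word appears in 20% of spam and 5% of good mail.
p_word_given_spam = 0.20
p_word_given_ham = 0.05
p_spam = 0.5  # assumed prior: spam and good mail equally likely

# Total probability of seeing the word in any message.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes(p_word_given_spam, p_spam, p_word)

# With equal priors the 0.5 factors cancel, leaving a simple ratio.
ratio_form = p_word_given_spam / (p_word_given_spam + p_word_given_ham)

print(p_spam_given_word, ratio_form)
```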
     
    Julian Visch, Jan 25, 2004
    #5
