Junk mail adaptive filter transferability

Discussion in 'Firefox' started by Jowah, Nov 29, 2004.

  1. Jowah Guest

    Hey everybody,

    I just had an idea but I am not sure if it will work. I want to install
    Thunderbird on my wife's workstation because it does such a wonderful
    job of filtering out spam. She doesn't have the patience to train the
    filtering software, so I was wondering, is there a file on my system
    that I can copy into her directory that will update the filter on her
    instance of TB to work as well as mine does? Thanks in advance.

    Jowah, Nov 29, 2004
  2. Look in your profile for training.dat. Copy that from your profile to
    your wife's.

    Leonidas Jones, Nov 29, 2004
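For anyone who would rather script the copy suggested above, here is a minimal sketch. The helper name `copy_training_data` and the two folder arguments are made up for illustration; the thread only says that the file is called training.dat and lives in the profile, so you have to supply the actual profile folders yourself. Keeping a `.bak` copy of the destination file is an added precaution, not something the thread asks for, and it is probably safest to run this while Thunderbird is closed.

```python
import shutil
from pathlib import Path


def copy_training_data(src_profile, dst_profile, filename="training.dat"):
    """Copy the junk-filter training file from one profile folder to another.

    Hypothetical helper: both folder paths are supplied by the caller.
    Any existing file in the destination is kept as '<filename>.bak'.
    Returns the path of the copied file.
    """
    src = Path(src_profile) / filename
    dst = Path(dst_profile) / filename
    if not src.is_file():
        raise FileNotFoundError(f"no {filename} in {src_profile}")
    if dst.exists():
        # Preserve whatever the destination profile had already learned.
        shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
    shutil.copy2(src, dst)
    return dst
```

Call it with the two profile folders, for example `copy_training_data(my_profile_dir, her_profile_dir)`, where both variables are placeholders for the real locations on your machine.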
  3. "training.dat" is the file you want, but be aware that one of the great advantages
    of the adaptive filter is that it learns to recognize what *YOU* feel to be Junk
    mail, which may not be the same as what somebody else feels is Junk mail. By using
    someone else's training.dat you give up some of this advantage.
    John Thompson, Nov 30, 2004
  4. :: BRIAN :: Guest

    You mean some people actually like reading email messages about how to
    make their penis large and satisfy her, but are surfing for hot teens or
    MILFs alone at home with their webcams instead, but found a nice girl
    who benefited from the "green card lottery" and got one of those "50,000
    VISAS TO THE USA! Get it now." visas, and now "needs a master", and just
    in time because the "36 hours of freedom." has hit and "the best moments
    in their life happen suddenly" even when surfing porn anonymously with
    that "do you want to meet a girl" girl they met, and are wiping their
    tracks clean on that copy of Windows XP Pro they bought for $9, which
    they could have afforded to pay full price for anyway had they invested
    in that micro-cap stock because "we all want stocks that perform", but
    they didn't bother earning "that salary you know you deserve", and so
    now they are just trying to earn a buck to help their brother who is in
    pain and enjoy vicodin (err... I mean V'I_C,O:D+I:N) but all this
    doesn't really matter because they verified their information for the
    application that was processed and approved and are eligible for
    $400,000 with a rate of 2.1% to buy a nice house somewhere or spend at
    that website "your friend recommended you" to download "Un limited Music
    and MP 3 songs" and then listen on your iPOD you got from that "no shit
    this really works, look at my picture with it" website?

    Those email messages? ;-)
    :: BRIAN ::, Nov 30, 2004
  5. Darn it Brian, you hacked into my Inbox again, didn't you? ;)

    Leonidas Jones, Nov 30, 2004
  6. Jowah Guest

    Works like a charm! Thanks.
    Jowah, Dec 1, 2004
  7. If you're a doctor, you might need to get messages that contain the words
    VICODIN or VIAGRA. If you're a loan officer, you may need to get messages
    about loans and mortgages. If you're an immigration lawyer you may need to
    get messages about visas and green cards.

    John Thompson, Dec 1, 2004
  8. :: BRIAN :: Guest

    Possibly. But not in the combinations that spammers use those words, and
    definitely not when spelled V'I_C,O:D+I:N, V.1.C.O.D.1.N 75O m''''g, or
    Viccodin. The Junk Mail Filter doesn't delete a message just because it
    contains a word...

    I'm also pretty certain immigration lawyers aren't getting legitimate
    information about visas from an email that ends in:

    emittance procyonemperor busboy popularlafayette copious
    mississippianbathroom kelvin bindleconducive brake rinehartpervert
    buckaroo draftsmancull nikko ferretlaredo inkling mansionamuse pend
    :: BRIAN ::, Dec 1, 2004
  9. Yes, if you take proper advantage of the adaptive filter. But if you
    import somebody else's filter that dumps VICODIN as well as
    V.I.C.O.D.I.N, you could have a problem. The Bayesian filter doesn't reject
    messages on specific words; it assigns each word a probability of
    spamminess, and when a message's combined score exceeds a certain
    threshold, the message gets moved into the Junk folder. Most people would
    probably reject both VICODIN and V.I.C.O.D.I.N, and these tokens would
    likely receive pretty much the same score from the Bayesian filter. A
    doctor or pharmacist would reject V.I.C.O.D.I.N but retain VICODIN,
    resulting in different scores for each token. But by using somebody else's
    training.dat file, you may find that both VICODIN and V.I.C.O.D.I.N are
    being rejected. For that reason, I think it is better to train your own
    filter.
    John Thompson, Dec 1, 2004
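
To make the point about per-token scores concrete, here is a toy sketch of a Bayesian-style junk filter. It is not Thunderbird's actual implementation: the class `ToyJunkFilter`, the whitespace tokenizer, the add-one smoothing, the log-odds combination, and the 0.9 threshold are all assumptions picked to keep the example short. It only illustrates how the same token can end up with different scores depending on who trained the filter.

```python
import math
from collections import Counter


class ToyJunkFilter:
    """Toy per-token Bayesian scorer (illustrative only)."""

    def __init__(self):
        self.junk = Counter()   # number of junk messages containing each token
        self.good = Counter()   # number of good messages containing each token
        self.n_junk = 0
        self.n_good = 0

    def train(self, text, is_junk):
        tokens = set(text.lower().split())
        if is_junk:
            self.junk.update(tokens)
            self.n_junk += 1
        else:
            self.good.update(tokens)
            self.n_good += 1

    def token_prob(self, token):
        # Smoothed estimate of P(junk | token), assuming equal priors.
        p_junk = (self.junk[token] + 1) / (self.n_junk + 2)
        p_good = (self.good[token] + 1) / (self.n_good + 2)
        return p_junk / (p_junk + p_good)

    def score(self, text):
        # Combine the per-token probabilities by summing their log-odds.
        log_odds = 0.0
        for token in set(text.lower().split()):
            p = self.token_prob(token)
            log_odds += math.log(p) - math.log(1 - p)
        return 1 / (1 + math.exp(-log_odds))

    def is_junk(self, text, threshold=0.9):
        return self.score(text) > threshold


# A "typical" user marks both spellings as junk.
typical = ToyJunkFilter()
typical.train("cheap vicodin online", True)
typical.train("v.i.c.o.d.i.n no prescription", True)
typical.train("lunch on friday", False)
typical.train("meeting notes attached", False)

# A doctor marks the obfuscated spelling as junk but keeps the plain one.
doctor = ToyJunkFilter()
doctor.train("v.i.c.o.d.i.n no prescription", True)
doctor.train("cheap pills online", True)
doctor.train("patient needs vicodin refill", False)
doctor.train("lunch on friday", False)

for name, f in (("typical", typical), ("doctor", doctor)):
    print(name, f.token_prob("vicodin"), f.token_prob("v.i.c.o.d.i.n"))
```

With these made-up training messages, the typical user's filter gives both spellings the same junk probability, while the doctor's filter scores the plain spelling below 0.5 and the obfuscated one above it. Copying the typical user's data onto the doctor's machine would carry the first set of scores along with it.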
