python spam filter: random words?

 
 
revyakin
 
      08-11-2003
I know fighting spam is like fighting global worming, but still..
50% of spam I get these days contains a random combination of letters
at the end of the subject line. Has anyone tried using that feature in
antispam filters? Since Python is the only language I am more or less
fluent in as an amateur scripter, I was wondering if anyone in this
group has comments on this idea.
Also, is it trivial to make a Python script filter executable from a
generic mail program like OE or NS Messenger?
I am also wondering why spammers add that stuff to their subject lines
anyway.
 
 
 
 
 
Ben Finney
 
      08-11-2003
On 10 Aug 2003 18:13:53 -0700, revyakin wrote:
> I know fighting spam is like fighting global worming, but still..

                                        ^^^^^^^^^^^^^^
Given that some spam contains e-mail worms, the typo is appropriate.

> 50% of spam I get these days contains a random combination of letters
> at the end of the subject line. Has anyone tried using that feature in
> antispam filters?


My experience has been that this practice is dropping off, since
Bayesian statistical-analysis filters will glide right by random words
as "not statistically significant".

What I'm seeing now is spam with words taken straight from the "likely
good" word lists of Bayesian filters.

> I am also wondering why spammers add that stuff to their subject lines
> anyway.


To defeat spam filters that check for the occurrence of a known spam
message they've seen before. As noted above, though, these are being
superseded by Bayesian word metric analysis.

--
Ben Finney
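
A minimal sketch of why a random suffix defeats a "seen this exact message before" filter. The fingerprinting scheme here (SHA-256 of the normalized subject line) and the names `fingerprint`, `seen`, and `is_known_spam` are assumptions made for illustration, not a description of any particular filter:

```python
import hashlib

def fingerprint(subject: str) -> str:
    """Hash a subject line after lowercasing and collapsing whitespace."""
    normalized = " ".join(subject.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical store of fingerprints for subjects already reported as spam.
seen = {fingerprint("Cheap meds online")}

def is_known_spam(subject: str) -> bool:
    """Exact-match check: true only if this normalized subject was seen before."""
    return fingerprint(subject) in seen

print(is_known_spam("Cheap  MEDS online"))        # same subject after normalization
print(is_known_spam("Cheap meds online xkqzjw"))  # random suffix changes the hash
```

Any change to the subject, however small, produces a different fingerprint, so the second message no longer matches the stored one.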
 
 
 
 
 
Sean 'Shaleh' Perry
 
      08-11-2003
On Sunday 10 August 2003 18:28, Ben Finney wrote:
> What I'm seeing now is spam with words taken straight from the "likely
> good" word lists of Bayesian filters
>


This was recently discussed on the spambayes list (the nifty Python
implementation of Paul Graham's ideas).

Apparently there are not enough uses of the word to make it statistically
interesting so spambayes ignores it. Or something like that. See the thread
there for full details.
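
A rough sketch of the effect described here, loosely modeled on the scoring in Paul Graham's "A Plan for Spam" essay rather than on spambayes' actual code. The constants `UNKNOWN` and `MIN_COUNT`, the function names, and the toy counts are all assumptions for illustration; both message totals are assumed to be greater than zero:

```python
from math import prod  # Python 3.8+

UNKNOWN = 0.4   # near-neutral score for tokens seen too rarely to judge
MIN_COUNT = 5   # below this many sightings a token counts as unknown

def token_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how spammy a token is from per-corpus counts."""
    s = spam_counts.get(token, 0)
    h = 2 * ham_counts.get(token, 0)  # ham counted double to limit false positives
    if s + h < MIN_COUNT:
        return UNKNOWN  # a random string lands here
    ps = min(1.0, s / n_spam)
    ph = min(1.0, h / n_ham)
    return max(0.01, min(0.99, ps / (ps + ph)))

def spam_score(tokens, spam_counts, ham_counts, n_spam, n_ham, top=15):
    """Combine the `top` tokens whose scores lie furthest from 0.5."""
    probs = [token_prob(t, spam_counts, ham_counts, n_spam, n_ham)
             for t in set(tokens)]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top]
    spammy = prod(probs)
    hammy = prod(1 - p for p in probs)
    return spammy / (spammy + hammy)

# Toy counts, invented for the example.
spam_counts = {"viagra": 50}
ham_counts = {"meeting": 40}

print(spam_score(["viagra"], spam_counts, ham_counts, 100, 100))
print(spam_score(["viagra", "xkqzjw"], spam_counts, ham_counts, 100, 100))
```

The never-seen token gets the near-neutral score, so it barely moves the result, and in a longer message it would usually not make the top 15 at all.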


 
 
Terry Reedy
 
      08-11-2003

"Marc Wilson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> In comp.lang.python, (E-Mail Removed) (revyakin) wrote in
> <(E-Mail Removed)>::
>
> |I know fighting spam is like fighting global worming, but still..
> |50% of spam I get these days contains a random combination of letters
> |at the end of the subject line. Has anyone tried using that feature in
> |antispam filters?
>
> How do you detect "random" letters? You can only (programmatically)
> determine that a character sequence is "random" if it doesn't appear in some
> sort of dictionary, and even there you have the risk of false positives due
> to typos, acronyms etc.


Looking at successive letter pairs would go a long way. Out of the
(26+space)**2 = 729 combinations, perhaps half occur in real words
(e.g., 'qx' is a giveaway). Using triples would allow inclusion of
common three-letter acronyms as legal.

Terry J. Reedy
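
One way the letter-pair idea could be sketched, assuming a word list to learn "legal" pairs from. The helper names, the 0.3 threshold, and the tiny word list are hypothetical; a real filter would train on a large corpus:

```python
import string

ALPHABET = string.ascii_lowercase + " "  # 27 symbols, so 27**2 == 729 pairs

def train_bigrams(words):
    """Collect every letter pair, including word-boundary spaces, seen in words."""
    seen = set()
    for word in words:
        padded = f" {word.lower()} "
        for a, b in zip(padded, padded[1:]):
            if a in ALPHABET and b in ALPHABET:
                seen.add(a + b)
    return seen

def unseen_fraction(token, seen):
    """Fraction of the token's letter pairs that never occurred in training."""
    if not token:
        return 0.0
    padded = f" {token.lower()} "
    pairs = [a + b for a, b in zip(padded, padded[1:])]
    return sum(p not in seen for p in pairs) / len(pairs)

def looks_random(token, seen, threshold=0.3):
    """Flag a token when too many of its letter pairs are unfamiliar."""
    return unseen_fraction(token, seen) > threshold

# Toy training list, far too small for real use.
words = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
         "python", "filter", "spam", "subject", "line"]
seen = train_bigrams(words)

print(looks_random("python", seen))
print(looks_random("xqzvkj", seen))
```

With a training set this small, acronyms and brand names would also be flagged, which is the false-positive risk raised in the quoted question.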


 
 
Marc Wilson
 
      08-12-2003
In comp.lang.python, "Terry Reedy" <(E-Mail Removed)> (Terry Reedy) wrote
in <(E-Mail Removed)>::

|> How do you detect "random" letters? You can only (programmatically)
|> determine that a character sequence is "random" if it doesn't appear in some
|> sort of dictionary, and even there you have the risk of false positives due
|> to typos, acronyms etc.
|
|Looking at successive letter pairs would go a long way. Out of the
|(26+space)**2 combinations, perhaps half occur in real words (ie, 'qx'
|is a giveaway). Using triples would allow inclusion of common
|three-letter acronyms as legal.

For sale today on QXL.com....
--
Marc Wilson

Cleopatra Consultants Limited - IT Consultants
2 The Grange, Cricklade Street, Old Town, Swindon SN1 3HG
Tel: (44/0) 70-500-15051 Fax: (44/0) 870 164-0054
Mail: (E-Mail Removed) Web: http://www.cleopatra.co.uk
 