Re: Can JavaMail detect a non-existent email address?

 
 
David Segall
 
      11-09-2003
I would like to check incoming mail to determine if the sender's
address is reachable without sending an email to the address. Is this
possible using JavaMail? If so, some sample code would be very much
appreciated. False positives are no problem and a small percentage of
false negatives is tolerable.

In case it has some bearing on the answer the application is a spam
filter that treats as spam anything that cannot be replied to.

My apologies for the cross post. It appears that nobody reads
comp.lang.java.misc.
 
Harald Hein
 
      11-09-2003
"David Segall" wrote:

> I would like to check incoming mail to determine if the sender's
> address is reachable without sending an email to the address. Is this
> possible using JavaMail?


It is not possible with ANY mailing package, due to the limitations of
the SMTP protocol.

You could check if there is an MX record for the sender domain - but
this would not guarantee that there is such a user at the domain. You
could use VRFY to check if a user exists - but you would have to
connect to the domain's mail server, and many admins have turned this
feature off because of privacy issues and abuse by spammers for
address verification.
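
To make the VRFY caveat concrete, here is a minimal sketch of how a client might interpret the reply line. The class name VrfyReply and the three-way Verdict are made up for illustration, and only the reply codes 250, 251, 252, 502 and 550 are assumed. Nothing here opens a connection; it only classifies a line the server has already sent.

```java
class VrfyReply {

    /** Possible interpretations of a server's answer to VRFY (illustrative names). */
    enum Verdict { EXISTS, UNKNOWN, REJECTED }

    /**
     * Classify the first line of an SMTP reply to a VRFY command.
     * Only the leading three-digit reply code is inspected.
     */
    static Verdict classify(String replyLine) {
        if (replyLine == null || replyLine.length() < 3) {
            return Verdict.UNKNOWN;
        }
        int code;
        try {
            code = Integer.parseInt(replyLine.substring(0, 3));
        } catch (NumberFormatException e) {
            return Verdict.UNKNOWN;
        }
        switch (code) {
            case 250: // requested action completed: the mailbox is known
            case 251: // user not local, but the server will forward
                return Verdict.EXISTS;
            case 550: // mailbox unavailable
                return Verdict.REJECTED;
            default:
                // 252 ("cannot VRFY user"), 502 ("not implemented") and
                // every other code say nothing about the mailbox.
                return Verdict.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("250 Fred Smith <fred@example.com>"));
        System.out.println(classify("252 Cannot VRFY user"));
        System.out.println(classify("550 No such user here"));
    }
}
```

A server that has VRFY turned off will typically land in the UNKNOWN branch, so a filter built on this can only treat REJECTED as evidence and must let everything else through.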

> In case it has some bearing on the answer the application is a spam
> filter that treats as spam anything that cannot be replied to.


You are building a spam filter, but have no idea how e-mail works? Bad
idea.

 
Sudsy
 
      11-09-2003
David Segall wrote:
> I would like to check incoming mail to determine if the sender's
> address is reachable without sending an email to the address. Is this
> possible using JavaMail? If so, some sample code would be very much
> appreciated. False positives are no problem and a small percentage of
> false negatives is tolerable.
>
> In case it has some bearing on the answer the application is a spam
> filter that treats as spam anything that cannot be replied to.
>
> My apologies for the cross post. It appears that nobody reads
> comp.lang.java.misc.


There's no guaranteed way to validate the username of the claimed
sender. There's the VRFY command in RFC821 but it's no longer
reliable, primarily because of the spammers. What you CAN do is
verify the originating domain using JNDI. Following is a small
program which effectively does 'nslookup -type=MX <host> | wc -l'
but in Java. Note that an exception will be thrown if the lookup
fails (no DNS records for hostname).

import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class MXLookup {

    public static void main( String args[] ) {
        if( args.length == 0 ) {
            System.err.println( "Usage: MXLookup host [...]" );
            System.exit( 12 );
        }
        for( int i = 0; i < args.length; i++ ) {
            try {
                System.out.println( args[i] + " has " +
                    doLookup( args[i] ) + " mail servers" );
            }
            catch( Exception e ) {
                e.printStackTrace();
            }
        }
    }

    // Returns the number of MX records found for hostName (0 if none).
    // Throws NamingException if the DNS lookup itself fails.
    static int doLookup( String hostName ) throws NamingException {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put("java.naming.factory.initial",
            "com.sun.jndi.dns.DnsContextFactory");
        DirContext ictx = new InitialDirContext( env );
        Attributes attrs = ictx.getAttributes( hostName,
            new String[] { "MX" });
        Attribute attr = attrs.get( "MX" );
        if( attr == null )
            return( 0 );
        return( attr.size() );
    }
}

 
 
GaryM
 
      11-09-2003
Sudsy <(E-Mail Removed)> wrote in news:3FAE7D10.8000702@hotmail.com:

> There's no guaranteed way to validate the username of the claimed
> sender. There's the VRFY command in RFC821 but it's no longer
> reliable, primarily because of the spammers. What you CAN do is
> verify the originating domain using JNDI. Following is a small
> program which effectively does 'nslookup -type=MX <host> | wc -l'
> but in Java. Note that an exception will be thrown if the lookup
> fails (no DNS records for hostname).


One thing to remember about this approach is that some companies write
to you from a domain that has no MX record. I think Fidelity
Investments does this (or did). Consequently, if a spam test is based on
an MX record not existing, you may get unwanted false positives. Better
to also include a test for an 'A' record. By doing this you are
effectively rejecting falsified hosts.
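
A minimal sketch of that MX-then-A check, reusing the JNDI DNS provider from the earlier post. The class name MailDomainCheck and the RecordCounter interface are hypothetical; the interface exists only so the decision logic can be tried with a stub instead of a live DNS server. A failed lookup is treated here as "no records", which is an assumption, not something the thread specifies.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

class MailDomainCheck {

    /** Hypothetical abstraction: how many records of a given type a domain has. */
    interface RecordCounter {
        int count(String domain, String recordType) throws NamingException;
    }

    /** Counts records through the JNDI DNS provider (needs network access). */
    static final RecordCounter DNS = (domain, recordType) -> {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attributes attrs = ctx.getAttributes(domain, new String[] { recordType });
            Attribute attr = attrs.get(recordType);
            return attr == null ? 0 : attr.size();
        } finally {
            ctx.close();
        }
    };

    /** True if the domain has an MX record or, failing that, an A record. */
    static boolean canReceiveMail(String domain, RecordCounter counter) {
        return hasRecord(domain, "MX", counter) || hasRecord(domain, "A", counter);
    }

    /** A lookup that throws is treated as "no records of that type". */
    private static boolean hasRecord(String domain, String type, RecordCounter counter) {
        try {
            return counter.count(domain, type) > 0;
        } catch (NamingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String domain : args) {
            System.out.println(domain + ": " + canReceiveMail(domain, DNS));
        }
    }
}
```

Run it as 'java MailDomainCheck somedomain.example' to query real DNS. Only a domain with neither record type comes back false.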

HTH.
 
 
Roedy Green
 
      11-09-2003
On Sun, 09 Nov 2003 12:09:09 GMT, David Segall
<(E-Mail Removed)> wrote or quoted :

>I would like to check incoming mail to determine if the sender's
>address is reachable without sending an email to the address. Is this
>possible using JavaMail?


I validate email addresses with some regexes and then check the
domains' MX records to see if they exist.

The following code is somewhat stricter than the RFC.

package com.mindprod.bulk;

import java.util.HashSet;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

/**
* Validate syntax of email addresses.
* Does not probe to see if mailserver exists in DNS or online.
* See MailProber for that.
* See ValidateEmailFile for an example of how to use this class.
*
* @author Roedy Green, Canadian Mind Products
* @version 1.0
* to do: check validity of & in first part of email address. Appears
* in practice.
*/
public class EmailSyntaxValidator
{
private static boolean debugging = false;

/**
* Check how likely an email address is to be valid.
* The higher the number returned, the more likely the address is valid.
* This method does not probe the internet in any way to see if
* the corresponding mail server or domain exists.
*
* @param email bare computer email address.
* e.g. (E-Mail Removed)
* No "Roedy Green" <(E-Mail Removed)> style addresses.
* No local addresses, e.g. roedy.
*
* @return 0 = email address is definitely malformed, e.g. missing @,
* or ends in .invalid
* <br>
* 1 = address does not meet one of the valid patterns below.
* It still might be ok according to some obscure rule in RFC 822;
* Java InternetAddress accepts it as valid.
* <br>
* 2 = unknown top level domain.
* <br>
* 3 = dots at beginning or end, doubled in name.
* <br>
* 4 = address of form xxx@[209.139.205.2] using IP
* <br>
* 5 = address of form (E-Mail Removed) Dots _ or - in first part of name
* <br>
* 6 = address of form (E-Mail Removed) rare, but known, domain
* <br>
* 7 = address of form (E-Mail Removed) or any national suffix.
* <br>
* 8 = address of form (E-Mail Removed) matching this national suffix,
* e.g. .ca in Canada, .de in Germany
* <br>
* 9 = address of form (E-Mail Removed) .org .net .edu .gov .biz -- official domains
*/
public static int howValid(String email)
{
    if ( email == null )
    {
        return 0;
    }
    email = email.trim().toLowerCase();
    int dotPlace = email.lastIndexOf('.');
    if ( 0 < dotPlace && dotPlace < email.length()-1 )
    {
        /* have at least x.y */
        String tld = email.substring(dotPlace+1);
        if ( badTLDs.contains(tld) )
        {
            /* deliberate invalid address */
            return 0;
        }
        // make sure none of fragments start or end in _ or -
        String[] fragments = splitter.split(email);
        boolean clean = true;
        for ( int i=0; i<fragments.length; i++ )
        {
            if ( fragments[i].startsWith("_") ||
                 fragments[i].endsWith("_") ||
                 fragments[i].startsWith("-") ||
                 fragments[i].endsWith("-") )
            {
                clean = false;
                break;
            }
        } // end for
        if ( clean )
        {
            Matcher m9 = p9.matcher(email);
            if ( m9.matches() )
            {
                if ( officialTLDs.contains(tld) ) return 9;
                else if ( thisCountry.equals(tld) ) return 8;
                else if ( nationalTLDs.contains(tld) ) return 7;
                else if ( rareTLDs.contains(tld) ) return 6;
                else return 3; /* unknown tld */
            }
            // allow dots in name
            Matcher m5 = p5.matcher(email);
            if ( m5.matches() )
            {
                if ( officialTLDs.contains(tld) ) return 5;
                else if ( thisCountry.equals(tld) ) return 5;
                else if ( nationalTLDs.contains(tld) ) return 5;
                else if ( rareTLDs.contains(tld) ) return 5;
                else return 2; /* unknown tld */
            }

            // IP
            Matcher m4 = p4.matcher(email);
            if ( m4.matches() ) return 4; /* can't tell TLD */

            // allow even lead/trail dots in name, except at start of domain
            Matcher m3 = p3.matcher(email);
            if ( m3.matches() )
            {
                if ( officialTLDs.contains(tld) ) return 3;
                else if ( thisCountry.equals(tld) ) return 3;
                else if ( nationalTLDs.contains(tld) ) return 3;
                else if ( rareTLDs.contains(tld) ) return 3;
                else return 2; /* unknown domain */
            }
        } // end if clean
    }
    // allow even unclean addresses, and addresses without a TLD, to
    // have a whack at passing RFC 822
    try
    {
        /* see if InternetAddress likes it; it follows RFC 822.
           It will accept names without domains though. */
        InternetAddress.parse(email, true /* strict */);
        // it liked it, no exception happened. Seems very sloppy.
        return 1;
    }
    catch ( AddressException e )
    {
        // it did not like it
        return 0;
    }
}

// allow _ - in name, lead and trailing ones are filtered later, no +.
static Pattern p9 =
Pattern.compile("[a-z0-9\\-_]++@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

// to split into fields
static Pattern splitter = Pattern.compile("[@\\.]");

// to allow - _ dots in name, no +
static Pattern p5 =
Pattern.compile("[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)*@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

// IP style names, no +
static Pattern p4 =
Pattern.compile("[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)*@\\[([0-9]{1,3}\\.){3}[0-9]{1,3}\\]");

// allow dots anywhere, but not at start of domain name, no +
static Pattern p3 =
Pattern.compile("[a-z0-9\\-_\\.]++@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

/**
* build a HashSet from an array of String literals.
*
* @param list array of strings
* @return HashSet you can use to test if a string is in the set.
*/
static HashSet hmaker(String[] list)
{
    HashSet map = new HashSet(Math.max((int) (list.length/.75f) + 1, 16));
    for ( int i=0; i<list.length; i++ )
    {
        map.add(list[i]);
    }
    return map;
}

static final String thisCountry =
Locale.getDefault().getCountry().toLowerCase();

static final HashSet officialTLDs =
hmaker(new String[]
{
"aero",
"biz",
"coop",
"com",
"edu",
"gov",
"info",
"mil",
"museum",
"name",
"net",
"org",
"pro",
});

static final HashSet rareTLDs =
hmaker(new String[]
{
"cam",
"mp3",
"agent",
"art",
"arts",
"asia",
"auction",
"aus",
"bank",
"cam",
"chat",
"church",
"club",
"corp",
"dds",
"design",
"dns2go",
"e",
"email",
"exp",
"fam",
"family",
"faq",
"fed",
"film",
"firm",
"free",
"fun",
"g",
"game",
"games",
"gay",
"ger",
"globe",
"gmbh",
"golf",
"gov",
"help",
"hola",
"i",
"inc",
"int",
"jpn",
"k12",
"kids",
"law",
"learn",
"llb",
"llc",
"llp",
"lnx",
"love",
"ltd",
"mag",
"mail",
"med",
"media",
"mp3",
"netz",
"nic",
"nom",
"npo",
"per",
"pol",
"prices",
"radio",
"rsc",
"school",
"scifi",
"sea",
"service",
"sex",
"shop",
"sky",
"soc",
"space",
"sport",
"tech",
"tour",
"travel",
"usvi",
"video",
"web",
"wine",
"wir",
"wired",
"zine",
"zoo",
});

static final HashSet nationalTLDs =
hmaker(new String[]
{
"ac",
"ad",
"ae",
"af",
"ag",
"ai",
"al",
"am",
"an",
"ao",
"aq",
"ar",
"as",
"at",
"au",
"aw",
"az",
"ba",
"bb",
"bd",
"be",
"bf",
"bg",
"bh",
"bi",
"bj",
"bm",
"bn",
"bo",
"br",
"bs",
"bt",
"bv",
"bw",
"by",
"bz",
"ca",
"cc",
"cd",
"cf",
"cg",
"ch",
"ci",
"ck",
"cl",
"cm",
"cn",
"co",
"cr",
"cu",
"cv",
"cx",
"cy",
"cz",
"de",
"dj",
"dk",
"dm",
"do",
"dz",
"ec",
"ee",
"eg",
"eh",
"er",
"es",
"et",
"fi",
"fj",
"fk",
"fm",
"fo",
"fr",
"fx",
"ga",
"gb",
"gd",
"ge",
"gf",
"gg",
"gh",
"gi",
"gl",
"gm",
"gn",
"gp",
"gq",
"gr",
"gs",
"gt",
"gu",
"gw",
"gy",
"hk",
"hm",
"hn",
"hr",
"ht",
"hu",
"id",
"ie",
"il",
"im",
"in",
"io",
"iq",
"ir",
"is",
"it",
"je",
"jm",
"jo",
"jp",
"ke",
"kg",
"kh",
"ki",
"km",
"kn",
"kp",
"kr",
"kw",
"ky",
"kz",
"la",
"lb",
"lc",
"li",
"lk",
"lr",
"ls",
"lt",
"lu",
"lv",
"ly",
"ma",
"mc",
"md",
"mg",
"mh",
"mk",
"ml",
"mm",
"mn",
"mo",
"mp",
"mq",
"mr",
"ms",
"mt",
"mu",
"mv",
"mw",
"mx",
"my",
"mz",
"na",
"nc",
"ne",
"nf",
"ng",
"ni",
"nl",
"no",
"np",
"nr",
"nu",
"nz",
"om",
"pa",
"pe",
"pf",
"pg",
"ph",
"pk",
"pl",
"pm",
"pn",
"pr",
"ps",
"pt",
"pw",
"py",
"qa",
"re",
"ro",
"ru",
"rw",
"sa",
"sb",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sj",
"sk",
"sl",
"sm",
"sn",
"so",
"sr",
"st",
"sv",
"sy",
"sz",
"tc",
"td",
"tf",
"tg",
"th",
"tj",
"tk",
"tm",
"tn",
"to",
"tp",
"tr",
"tt",
"tv",
"tw",
"tz",
"ua",
"ug",
"uk",
"um",
"us",
"uy",
"uz",
"va",
"vc",
"ve",
"vg",
"vi",
"vn",
"vu",
"wf",
"ws",
"ye",
"yt",
"yu",
"za",
"zm",
"zw",
});

static final HashSet badTLDs =
hmaker(new String[]
{
"invalid",
"nowhere",
"noone",
});


public static void main (String[] args)
{
    System.out.println(howValid("kellizer@.hotmail.com "));
}
} // end class EmailSyntaxValidator
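
To feed an address that passes the syntax check into a domain lookup, the domain part has to be pulled out first. A minimal sketch follows; the class name SenderDomain and the method domainOf are hypothetical and not part of the code above. It assumes a bare address (no display name) and deliberately returns null for IP-literal addresses, since those have no DNS name to look up.

```java
import java.util.Locale;

class SenderDomain {

    /**
     * Return the lower-cased domain part of a bare address such as
     * "user@example.com", or null if there is no usable domain.
     */
    static String domainOf(String email) {
        if (email == null) {
            return null;
        }
        String trimmed = email.trim();
        int at = trimmed.lastIndexOf('@');
        if (at < 1 || at == trimmed.length() - 1) {
            return null; // no @, no local part, or nothing after the @
        }
        String domain = trimmed.substring(at + 1).toLowerCase(Locale.ROOT);
        // Address literals like [209.139.205.2] have no DNS name to look up,
        // and a domain that starts or ends with a dot is malformed.
        if (domain.startsWith("[") || domain.startsWith(".") || domain.endsWith(".")) {
            return null;
        }
        return domain;
    }

    public static void main(String[] args) {
        System.out.println(domainOf("Someone@Example.COM "));   // example.com
        System.out.println(domainOf("kellizer@.hotmail.com ")); // null
        System.out.println(domainOf("roedy"));                   // null
    }
}
```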


--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
 
 
David Segall
 
      11-10-2003
Harald Hein <(E-Mail Removed)> wrote:

>"David Segall" wrote:
>
>> I would like to check incoming mail to determine if the sender's
>> address is reachable without sending an email to the address. Is this
>> possible using JavaMail?

>
>It is not possible with ANY mailing package, due to the limitations of
>the SMTP protocol.
>
>You could check if there is an MX record for the sender domain - but
>this would not guarantee that there is such a user at the domain. You
>could use VRFY to check if a user exists - but you would have to
>connect to the domain's mail server, and many admins have turned this
>feature off because of privacy issues and abuse by spammers for
>address verification.
>
>> In case it has some bearing on the answer the application is a spam
>> filter that treats as spam anything that cannot be replied to.

>
>You are building a spam filter, but have no idea how e-mail works? Bad
>idea.

Thanks for the information and for your caution. I will probably
ignore the caution because I find writing some code the best way of
learning about such a topic.
 
 
GaryM
 
      11-10-2003
David Segall <(E-Mail Removed)> wrote in
news:(E-Mail Removed):

> Thanks for the information and for your caution. I will probably
> ignore the caution because I find writing some code the best way of
> learning about such a topic.
>


See also http://www.paulgraham.com/spam.html for some excellent advice
on spam filtering.
 