Java - java email validator?

 
08-02-2003, 05:34 AM   #1


How do I make sure the email supplied has a valid syntax in Java?


TIA.




Daniel

08-02-2003, 07:38 AM   #2
Roedy Green
Re: java email validator?
On Sat, 2 Aug 2003 00:34:06 -0400, "Daniel" wrote or quoted:

>How do I make sure the email supplied has a valid syntax in Java?


You write a page of REGEX like this:

package com.mindprod.bulk;

import java.util.HashSet;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

/**
* Validate syntax of email addresses.
* Does not probe to see if mailserver exists in DNS or online.
* See MailProber for that.
* See ValidateEmailFile for an example of how to use this class.
*
* @author Roedy Green, Canadian Mind Products
* @version 1.0
* TODO: check validity of & in first part of email address. Appears in practice.
*/
public class EmailSyntaxValidator
{
private static boolean debugging = false;

/**
* Check how likely an email address is to be valid.
* The higher the number returned, the more likely the address is valid.
* This method does not probe the internet in any way to see if
* the corresponding mail server or domain exists.
*
* @param email bare computer email address.
* e.g.
* No "Roedy Green" <> style addresses.
* No local addresses, e.g. roedy.
*
* @return 0 = email address is definitely malformed, e.g. missing @,
* or ends in .invalid
* <br>
* 1 = address does not meet one of the valid patterns below.
* It still might be OK according to some obscure rule in RFC 822;
* Java InternetAddress accepts it as valid.
* <br>
* 2 = unknown top level domain.
* <br>
* 3 = dots at beginning or end, doubled in name.
* <br>
* 4 = address of form xxx@[209.139.205.2] using IP
* <br>
* 5 = address of form Dots _ or - in first part of name
* <br>
* 6 = address of form rare, but known, domain
* <br>
* 7 = address of form or any national suffix.
* <br>
* 8 = address of form this national suffix, e.g. .ca in Canada .de in Germany
* <br>
* 9 = address of form .org .net .edu .gov .biz -- official domains
*/
public static int howValid(String email)
{
if ( email == null )
{
return 0;
}
email = email.trim().toLowerCase();
int dotPlace = email.lastIndexOf('.');
if ( 0 < dotPlace && dotPlace < email.length()-1 )
{
/* have at least x.y */
String tld = email.substring(dotPlace+1);
if ( badTLDs.contains(tld) )
{
/* deliberate invalid address */
return 0;
}
// make sure none of fragments start or end in _ or -
String[] fragments = splitter.split(email);
boolean clean = true;
for ( int i=0; i<fragments.length; i++ )
{
if ( fragments[i].startsWith("_") ||
fragments[i].endsWith("_") ||
fragments[i].startsWith("-") ||
fragments[i].endsWith("-") )
{
clean = false;
break;
}
} // end for
if ( clean )
{
Matcher m9 = p9.matcher(email);
if ( m9.matches() )
{
if ( officialTLDs.contains(tld) ) return 9;
else if ( thisCountry.equals(tld) ) return 8;
else if ( nationalTLDs.contains(tld) ) return 7;
else if ( rareTLDs.contains(tld) ) return 6;
else return 3; /* unknown tld */
}
// allow dots in name
Matcher m5 = p5.matcher(email);
if ( m5.matches() )
{
if ( officialTLDs.contains(tld) ) return 5;
else if ( thisCountry.equals(tld) ) return 5;
else if ( nationalTLDs.contains(tld) ) return 5;
else if ( rareTLDs.contains(tld) ) return 5;
else return 2; /* unknown tld */
}

// IP
Matcher m4 = p4.matcher(email);
if ( m4.matches() ) return 4; /* can't tell TLD */

// allow even lead/trail dots in name, except at start of domain
Matcher m3 = p3.matcher(email);
if ( m3.matches() )
{
if ( officialTLDs.contains(tld) ) return 3;
else if ( thisCountry.equals(tld) ) return 3;
else if ( nationalTLDs.contains(tld) ) return 3;
else if ( rareTLDs.contains(tld) ) return 3;
else return 2; /* unknown domain */
}
} // end if clean
}
// allow even unclean addresses, and addresses without a TLD, to have a whack at passing RFC 822
try
{
/* See if InternetAddress likes it; it follows RFC 822.
It will accept names without domains, though. */
InternetAddress.parse(email, true /* strict */);
// it liked it, no exception happened. Seems very sloppy.
return 1;
}
catch ( AddressException e )
{
// it did not like it
return 0;
}
}

// allow _ - in name, lead and trailing ones are filtered later, no +.
static Pattern p9 =
Pattern.compile("[a-z0-9\\-_]++@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

// to split into fields
static Pattern splitter = Pattern.compile("[@\\.]");

// to allow - _ dots in name, no +
static Pattern p5 =
Pattern.compile("[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)*@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

// IP style names, no +
static Pattern p4 =
Pattern.compile("[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)*@\\[([0-9]{1,3}\\.){3}[0-9]{1,3}\\]");

// allow dots anywhere, but not at start of domain name, no +
static Pattern p3 =
Pattern.compile("[a-z0-9\\-_\\.]++@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

/**
* build a HashSet from a array of String literals.
*
* @param list array of strings
* @return HashSet you can use to test if a string is in the set.
*/
static HashSet hmaker(String[] list)
{
HashSet map = new HashSet(Math.max((int) (list.length/.75f) + 1, 16));
for ( int i=0; i<list.length; i++ )
{
map.add(list[i]);
}
return map;
}

static final String thisCountry =
Locale.getDefault().getCountry().toLowerCase();

static final HashSet officialTLDs =
hmaker(new String[]
{
"aero",
"biz",
"coop",
"com",
"edu",
"gov",
"info",
"mil",
"museum",
"name",
"net",
"org",
"pro",
});

static final HashSet rareTLDs =
hmaker(new String[]
{
"cam",
"mp3",
"agent",
"art",
"arts",
"asia",
"auction",
"aus",
"bank",
"cam",
"chat",
"church",
"club",
"corp",
"dds",
"design",
"dns2go",
"e",
"email",
"exp",
"fam",
"family",
"faq",
"fed",
"film",
"firm",
"free",
"fun",
"g",
"game",
"games",
"gay",
"ger",
"globe",
"gmbh",
"golf",
"gov",
"help",
"hola",
"i",
"inc",
"int",
"jpn",
"k12",
"kids",
"law",
"learn",
"llb",
"llc",
"llp",
"lnx",
"love",
"ltd",
"mag",
"mail",
"med",
"media",
"mp3",
"netz",
"nic",
"nom",
"npo",
"per",
"pol",
"prices",
"radio",
"rsc",
"school",
"scifi",
"sea",
"service",
"sex",
"shop",
"sky",
"soc",
"space",
"sport",
"tech",
"tour",
"travel",
"usvi",
"video",
"web",
"wine",
"wir",
"wired",
"zine",
"zoo",
});

static final HashSet nationalTLDs =
hmaker(new String[]
{
"ac",
"ad",
"ae",
"af",
"ag",
"ai",
"al",
"am",
"an",
"ao",
"aq",
"ar",
"as",
"at",
"au",
"aw",
"az",
"ba",
"bb",
"bd",
"be",
"bf",
"bg",
"bh",
"bi",
"bj",
"bm",
"bn",
"bo",
"br",
"bs",
"bt",
"bv",
"bw",
"by",
"bz",
"ca",
"cc",
"cd",
"cf",
"cg",
"ch",
"ci",
"ck",
"cl",
"cm",
"cn",
"co",
"cr",
"cu",
"cv",
"cx",
"cy",
"cz",
"de",
"dj",
"dk",
"dm",
"do",
"dz",
"ec",
"ee",
"eg",
"eh",
"er",
"es",
"et",
"fi",
"fj",
"fk",
"fm",
"fo",
"fr",
"fx",
"ga",
"gb",
"gd",
"ge",
"gf",
"gg",
"gh",
"gi",
"gl",
"gm",
"gn",
"gp",
"gq",
"gr",
"gs",
"gt",
"gu",
"gw",
"gy",
"hk",
"hm",
"hn",
"hr",
"ht",
"hu",
"id",
"ie",
"il",
"im",
"in",
"io",
"iq",
"ir",
"is",
"it",
"je",
"jm",
"jo",
"jp",
"ke",
"kg",
"kh",
"ki",
"km",
"kn",
"kp",
"kr",
"kw",
"ky",
"kz",
"la",
"lb",
"lc",
"li",
"lk",
"lr",
"ls",
"lt",
"lu",
"lv",
"ly",
"ma",
"mc",
"md",
"mg",
"mh",
"mk",
"ml",
"mm",
"mn",
"mo",
"mp",
"mq",
"mr",
"ms",
"mt",
"mu",
"mv",
"mw",
"mx",
"my",
"mz",
"na",
"nc",
"ne",
"nf",
"ng",
"ni",
"nl",
"no",
"np",
"nr",
"nu",
"nz",
"om",
"pa",
"pe",
"pf",
"pg",
"ph",
"pk",
"pl",
"pm",
"pn",
"pr",
"ps",
"pt",
"pw",
"py",
"qa",
"re",
"ro",
"ru",
"rw",
"sa",
"sb",
"sc",
"sd",
"se",
"sg",
"sh",
"si",
"sj",
"sk",
"sl",
"sm",
"sn",
"so",
"sr",
"st",
"sv",
"sy",
"sz",
"tc",
"td",
"tf",
"tg",
"th",
"tj",
"tk",
"tm",
"tn",
"to",
"tp",
"tr",
"tt",
"tv",
"tw",
"tz",
"ua",
"ug",
"uk",
"um",
"us",
"uy",
"uz",
"va",
"vc",
"ve",
"vg",
"vi",
"vn",
"vu",
"wf",
"ws",
"ye",
"yt",
"yu",
"za",
"zm",
"zw",
});

static final HashSet badTLDs =
hmaker(new String[]
{
"invalid",
"nowhere",
"noone",
});


public static void main (String[] args)
{
System.out.println(howValid("kellizer@.hotmail.com "));
}
} // end class EmailSyntaxValidator
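
For anyone who wants to poke at the fragment check on its own, here is a minimal, self-contained sketch of just that step. The class name FragmentCheck, the method name isClean, and the example.com addresses are made up for illustration; only the splitting pattern and the _ / - tests are taken from howValid above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of EmailSyntaxValidator: isolates the
// "no fragment may start or end in _ or -" step from howValid.
class FragmentCheck {
    // same regex as the splitter field: break the address at every @ and dot
    private static final Pattern SPLITTER = Pattern.compile("[@\\.]");

    static boolean isClean(String email) {
        for (String fragment : SPLITTER.split(email)) {
            if (fragment.startsWith("_") || fragment.endsWith("_")
                    || fragment.startsWith("-") || fragment.endsWith("-")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isClean("someone@example.com"));   // true
        System.out.println(isClean("-someone@example.com"));  // false
        System.out.println(isClean("someone@example_.com"));  // false
    }
}
```

Note that an unclean address is not rejected outright in howValid; it only skips the pattern checks and falls through to the InternetAddress test.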


--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.


Roedy Green

08-02-2003, 08:06 AM   #3
Gordon Beaton
Re: java email validator?
On Sat, 02 Aug 2003 06:38:06 GMT, Roedy Green wrote:
> On Sat, 2 Aug 2003 00:34:06 -0400, "Daniel" wrote or quoted:


>>How do I make sure the email supplied has a valid syntax in Java?

>
> You write a page of REGEX like this:


[...]

> // allow _ - in name, lead and trailing ones are filtered later, no +.

[...]
> // to allow - _ dots in name, no +

[...]
> // IP style names, no +

[...]
> // allow dots anywhere, but not at start of domain name, no +


Why don't you think + is a valid character in the username? RFC 822
allows it, and several of my own email addresses use it.
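
To make the point concrete, here is a small sketch that runs the posted p5 pattern against an address with a + in it, next to a variant that accepts it. P5 is copied from the code above; P5_PLUS is only an assumed variant (it adds + to the character classes before the @ and nothing else), and the class name and address are made up.

```java
import java.util.regex.Pattern;

// Sketch only: compares the posted p5 pattern with a hypothetical variant
// that also allows + in the part before the @.
class PlusInLocalPart {
    // p5 as posted: - _ and dots in name, no +
    static final Pattern P5 = Pattern.compile(
            "[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)*@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

    // assumption: same pattern with + added to the local-part classes only
    static final Pattern P5_PLUS = Pattern.compile(
            "[a-z0-9\\-_+]++(\\.[a-z0-9\\-_+]++)*@[a-z0-9\\-_]++(\\.[a-z0-9\\-_]++)++");

    public static void main(String[] args) {
        String email = "someone+lists@example.com"; // made-up address
        System.out.println(P5.matcher(email).matches());      // false
        System.out.println(P5_PLUS.matcher(email).matches()); // true
    }
}
```

With the patterns as posted, such an address matches none of p9, p5, p4 or p3, so howValid falls through to the InternetAddress check and can return at most 1.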

/gordon

--
[ do not send me private copies of your followups ]
g o r d o n . b e a t o n @ e r i c s s o n . c o m


Gordon Beaton

08-02-2003, 10:59 PM   #4
Roedy Green
Re: java email validator?
On 2 Aug 2003 09:06:06 +0200, Gordon Beaton wrote or quoted:

>Why don't you think + is a valid character in the username? rfc822


I was never able to find a definitive document in English that said
what characters were allowed. 822 is written in geek.

I found conflicting advice.

Another character I am wondering about is &. I have a friend who has
it in her email address, but I don't know how kosher it is.
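
For what it is worth, RFC 822 section 3.3 defines an atom as a run of characters other than specials, space and control characters, and lists the specials as ( ) < > @ , ; : \ " . [ ]. Read that way, both & and + are legal in the part before the @. A rough sketch of that rule follows; the class and method names are invented here, and it covers only single characters of an unquoted atom, not quoted strings or the rest of the grammar.

```java
// Sketch of the RFC 822 "atom" character rule; names are hypothetical.
final class Rfc822Atoms {
    // the "specials" listed in RFC 822 section 3.3
    private static final String SPECIALS = "()<>@,;:\\\".[]";

    static boolean isAtomChar(char c) {
        // atoms exclude controls (0-31 and 127), space (32) and non-ASCII
        if (c <= 32 || c >= 127) {
            return false;
        }
        return SPECIALS.indexOf(c) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isAtomChar('&')); // true
        System.out.println(isAtomChar('+')); // true
        System.out.println(isAtomChar('@')); // false
    }
}
```

The dot comes out as false here because it is a special: it separates atoms rather than being part of one. Whether a given mail server actually accepts & is a separate question from what the syntax allows.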


Roedy Green