Validating an e-mail address

Question: How do I verify if a string of characters is a valid e-mail address?

First, you can only be sure about the second half of the e-mail address (the domain), as (in order to protect the anonymity of their users) many e-mail servers don’t give immediate responses when checked to see if the first part of the e-mail address is valid (although some will send a bounce notification at a later date, once an e-mail has been attempted).

Second, you can only TRULY verify if the domain address is accurate if the testing application has internet access.

So, without making a DNS call (or before), you can’t be absolutely sure that the user or the domain actually exists, and can never be sure if the user (and therefore the email) is an actual e-mail address.

But you can check to see if the format is valid using a regular expression. And this is where things get REALLY tricky, as how restrictive you want to be in your filtering depends on you, and while there is a defined standard, simply adhering to that standard may exclude e-mail addresses that are in use.

It seems that the most common regular expression that is suggest on the web is the following:

^[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+.[A-Za-z]{2,4}$

This will cover ALMOST all e-mail address you will run into, and will exclude obnoxious e-mails like badpirate@gmail.com.nospam

However, it will exclude a few e-mail addresses that are in use (but are probably being excluded and having trouble in lots of places:

  • kevin@yesthistldexists.museum – Yes .museum is a valid Top Level Domain
  • kevin@kevin@logichigh.com – Yes the current e-mail RFC doesn’t allow this type of e-mail address HOWEVER older RFC’s did, and so there may be some folks still using this format (and not being able to use this format in lots of other places)
  • ??@??web.jp – International characters in domains and user names are already being normalized to ascii friendly code by browsers and e-mail clients, so they are being used regularly, however if you are checking before that normalization occurs, these sorts of e-mail addresses will get tossed

Therefore, I’ve also written a super lax e-mail format checker that will catch all scenarios. This reg ex would probably be best used if you plan on checking to see if the domain exists after checking to see if the format loosely matches SOME format that COULD be in use🙂

^.+@.+.[A-Za-z]{2}[A-Za-z]*$

Finally, if you’d like to implement this in cocoa code:

BOOL NSStringIsValidEmail(NSString *checkString)
{
	NString *stricterFilterString = @"[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}"; 
	NSString *laxString = @".+@.+.[A-Za-z-]{2}[A-Za-z]*";
	NSString *emailRegex = stricterFilter ? stricterFilterString : laxString;
	NSPredicate *emailTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", emailRegex];
	return [emailTest evaluateWithObject:checkString];
}

14 thoughts on “Validating an e-mail address”

  1. Hey, followed this from StackOverflow… Gave you a + 1, but it turns out this doesn’t work. There are typo’s in it which make it not compile… Would you mind fixing and re-posting it? (line 3, NSString not NString) (line 4, +\. not +.) i don’t speak regex, so i assume the later was the correct change, as the needed to be escaped. 🙂 thanks, -eric

    1. Websites are hard. I used the very open validation because there are so many possibilities (and they add more every day) of valid website domains. If website validation is important to you, I would suggest attempting a DNS lookup to validate sites (though I understand that there are some people who like using their IP address as their domain, yuck).

  2. Hey, the lax string is really bad written because it missed “-” in domain,
    Correct should be:
    NSString *laxString = @”.+@.+.[A-Za-z-]{2}[A-Za-z]*”;

    Please correct, my app is public with this bug unfortunatelly

  3. @Bartosz @BadPirate: Correct me if I’m wrong, but why should there be a dash in the first two characters in the TLD? The domain is handled by the very lax .+ Everything after that concerns the TLD.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s