Parsing Email Addresses with Regular Expressions

A lenient and a strict pattern, along with examples

Summary

Email validation is a common task in an ASP.NET page where users need to enter their email addresses. Most of the time [email protected] is an accepted email address, but you might like to do better than that.

The RegularExpressionValidator in .NET 1.1 offers a lenient Regex pattern for validating an email address. If you don't need the strict pattern, use the lenient one; it will stand the test of time better.

Here are the regular expression patterns:

Email Regex from the .NET 1.1 Regular Expression Validator
string patternLenient = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

string patternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+" 
   + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@" 
   + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" 
   + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+" 
   + @"[a-zA-Z]{2,}))$";

Use the following method to test the regular expressions. Copy the method into the code-behind of an ASPX page with a Label control on it (lblOutput). Don't forget to add the "using" directives to your file: "using System.Text.RegularExpressions" for Regex and "using System.Collections" for ArrayList.

Test Email Regular Expressions
public void TestEmailRegex()
{
   string patternLenient = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";
   Regex reLenient = new Regex(patternLenient);
   string patternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+" 
      + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@" 
      + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" 
      + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+" 
      + @"[a-zA-Z]{2,}))$";
   Regex reStrict = new Regex(patternStrict);

   ArrayList samples = new ArrayList();
   samples.Add("joe");
   samples.Add("joe@home");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("joe-bob[at]home.com");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("joe<>[email protected]");
   samples.Add("joe&[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("[email protected]");
   samples.Add("o'[email protected]");

   string output = "<table border=1>";
   output += "<tr><td><b>Email</b></td><td><b>Pattern</b>"
      + "</td><td><b>Valid Email?</b></td></tr>";
   bool toggle = true;
   foreach (string sample in samples)
   {
      // Alternate the row background for each sample
      string bgcol = toggle ? "gainsboro" : "white";
      toggle = !toggle;

      // Encode the sample so characters such as < and > display literally
      string rowStart = "<tr bgcolor=" + bgcol + "><td>"
         + Server.HtmlEncode(sample) + "</td>";

      // IsMatch succeeds if the pattern matches anywhere in the sample
      output += rowStart + "<td>Lenient</td><td>"
         + (reLenient.IsMatch(sample) ? "Is Valid" : "Is NOT Valid")
         + "</td></tr>";

      output += rowStart + "<td>Strict</td><td>"
         + (reStrict.IsMatch(sample) ? "Is Valid" : "Is NOT Valid")
         + "</td></tr>";
   }
   output += "</table>";

   lblOutput.Text = output;

}

Below is the output of the test method. Most of the time the lenient and strict patterns agree, but you'll see some cases, like "[email protected]", that pass the lenient test and fail the strict test. Determining which characters can be used in an email address is almost more art than science. Most ASCII characters are allowed, except space, <, >, [, ], " and a few others, but in practice many mail servers and email applications impose additional restrictions of their own.
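
One reason the lenient pattern accepts odd-looking input is that it has no ^ and $ anchors, so Regex.IsMatch succeeds as soon as any substring of the input matches. The RegularExpressionValidator control, as far as I know, only treats the input as valid when the match covers the whole value, so the test method is more forgiving than the validator itself. Here is a minimal sketch of the difference. The class name, the method names and the sample addresses are hypothetical and are not part of the original test.

```csharp
using System.Text.RegularExpressions;

public class LenientAnchoringDemo
{
   // Lenient pattern from the article, unchanged (no anchors)
   public const string PatternLenient =
      @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

   // True if ANY substring of the input matches the lenient pattern
   public static bool MatchesAnywhere(string input)
   {
      return Regex.IsMatch(input, PatternLenient);
   }

   // True only if the WHOLE input matches the lenient pattern
   public static bool MatchesWhole(string input)
   {
      return Regex.IsMatch(input, "^(" + PatternLenient + ")$");
   }
}
```

For the hypothetical input "joe<>bob@example.com", MatchesAnywhere returns true because the substring "bob@example.com" matches, while MatchesWhole returns false. A plain address such as "joe.bob@example.com" passes both.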

We know that the lenient pattern will often accept addresses that are NOT valid. However, I think it may also reject some that ARE valid, for example [email protected].
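
One kind of address that the lenient pattern rejects and the strict pattern accepts is an address whose domain is a bracketed IP address. The lenient pattern requires a word character immediately after the @, while the strict pattern has an explicit branch for the [n.n.n.n] form. The sketch below illustrates this with a hypothetical address; the class and method names are assumptions made for the example.

```csharp
using System.Text.RegularExpressions;

public class IpLiteralDemo
{
   // Both patterns are copied from the article without changes
   public const string PatternLenient =
      @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

   public const string PatternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+"
      + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@"
      + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
      + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+"
      + @"[a-zA-Z]{2,}))$";

   public static bool IsLenientMatch(string input)
   {
      return Regex.IsMatch(input, PatternLenient);
   }

   public static bool IsStrictMatch(string input)
   {
      return Regex.IsMatch(input, PatternStrict);
   }
}
```

With the hypothetical address "joe@[192.168.1.10]", IsLenientMatch returns false and IsStrictMatch returns true.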

In fact, an @ symbol is not even required for a serviceable email address if you're sticking to your local intranet.

So, really, when you use a regular expression to validate an email address, you are trying to ensure that you don't get flaky, bizarre addresses which, while technically allowed, may come from malicious sources. After all, if you're a legitimate user, you're going to make sure your email address is standard and compatible with most systems.

I recently had trouble in a system with a customer having a single quote in their email address. Something like o'[email protected]. It's technically correct, but many systems won't allow it.
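
If the systems behind your page cannot handle characters such as the single quote, one option is to layer your own restrictions on top of the strict pattern. The following is only a sketch under that assumption: the class name, the IsAcceptable method and the list of disallowed characters are hypothetical, and which characters to refuse is a policy decision for your own application.

```csharp
using System.Text.RegularExpressions;

public class SiteEmailPolicy
{
   // Strict pattern copied from the article without changes
   public const string PatternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+"
      + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@"
      + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
      + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+"
      + @"[a-zA-Z]{2,}))$";

   // Hypothetical list of characters that downstream systems reject
   private static readonly char[] Disallowed = new char[] { '\'', '&' };

   // Accept only addresses that pass the strict pattern AND
   // contain none of the site-specific disallowed characters
   public static bool IsAcceptable(string email)
   {
      if (email == null)
         return false;
      if (!Regex.IsMatch(email, PatternStrict))
         return false;
      return email.IndexOfAny(Disallowed) < 0;
   }
}
```

With this sketch, the hypothetical address "o'brien@example.com" passes the strict pattern on its own but is refused by IsAcceptable, while "obrien@example.com" is accepted.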

Output: Email Regex Samples

Email                    Pattern   Valid Email?
joe                      Lenient   Is NOT Valid
joe                      Strict    Is NOT Valid
joe@home                 Lenient   Is NOT Valid
joe@home                 Strict    Is NOT Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is NOT Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
joe-bob[at]home.com      Lenient   Is NOT Valid
joe-bob[at]home.com      Strict    Is NOT Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is NOT Valid
[email protected]        Strict    Is NOT Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is NOT Valid
joe<>[email protected]   Lenient   Is Valid
joe<>[email protected]   Strict    Is NOT Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is NOT Valid
[email protected]        Strict    Is Valid
[email protected]        Lenient   Is Valid
[email protected]        Strict    Is Valid
o'[email protected]      Lenient   Is Valid
o'[email protected]      Strict    Is Valid
 
