Our spunky little Honda Fit is now 13 years old, and although it has served us well, we have begun to outgrow it.
I started researching minivans, and one particular online-only auto seller seemed to have an attractive inventory.
So I decided to create an account with them and set up some filters and alerts.
But creating an account was harder than it appeared.
I entered my email address on their sign-up page. The input field immediately turned red, and the page informed me that my email address was invalid.
I encounter this problem occasionally, because I use subaddressing in my email address.
A normal email address looks like this: username@domain.com.
When you use subaddressing, you tack on a little extra bit between your username and the @ sign: username+subaddress@domain.com.
When the site emails you, the subaddress gets ignored for the purpose of routing the message to you, but it still shows up in the "To" field of the email. That can be tremendously useful. It allows you to set up filters to always allow or always deny messages that include a specific subaddress, and it can sometimes be used to determine whether a site has had a data breach or sold your contact information to spammers.
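To make the filtering idea concrete, here is a minimal sketch of pulling the subaddress out of a "To" address. The function name splitSubaddress and the example tag "carsite" are hypothetical, and the sketch assumes "+" is the separator, as in the example above.

```javascript
// Hypothetical helper: split an address into its base address and subaddress.
// Assumes "+" is the subaddress separator, as in the examples above.
function splitSubaddress(address) {
  const at = address.lastIndexOf("@");
  if (at < 1) {
    throw new Error("not an email address: " + address);
  }
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);
  const plus = local.indexOf("+");
  if (plus === -1) {
    // No subaddress present; the address is already the base address.
    return { base: address, subaddress: null };
  }
  return {
    base: local.slice(0, plus) + "@" + domain,
    subaddress: local.slice(plus + 1),
  };
}

// A filter rule might then key off the subaddress found in the "To" field.
const parsed = splitSubaddress("username+carsite@domain.com");
console.log(parsed.base);       // username@domain.com
console.log(parsed.subaddress); // carsite
```

If a message tagged "carsite" ever arrives from some unrelated sender, that is a hint about where the sender got the address.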
Although subaddressing is widely supported by email service providers, including Google's Gmail, it continues to trip up sites that perform email address validation using regular expressions, also known by the abbreviation regex.
Regex allows site developers to specify what a valid email address looks like. It can determine whether an email address contains invalid characters, or whether it's missing an @ sign.
The problem is, valid email addresses come in many forms, and many regular expressions used to validate email addresses are not able to recognize the full range of valid formats.
In the case of the auto seller's form validation, I was able to inspect their JavaScript code. The portion of the email validation regex that deals with the username (the part before the @ sign) looks like this:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*
The key thing about this regex snippet is that the only characters that are considered valid are alphanumeric characters, hyphens, underscores, and periods.
If a plus sign is present, as when using subaddressing, it treats the email address as invalid, even though it is perfectly valid and has been for years.
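The rejection is easy to reproduce. The sketch below tests the username portion on its own; the trailing "$" is my addition (an assumption, since the site's full regex continues on to the @ sign and domain), and the variable name usernamePattern is hypothetical.

```javascript
// The username portion quoted above, with a trailing "$" added (an assumption)
// so that the whole username must match when tested in isolation.
const usernamePattern = /^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*$/;

console.log(usernamePattern.test("username"));            // true
console.log(usernamePattern.test("first.last"));          // true
console.log(usernamePattern.test("username+subaddress")); // false
```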
While it's frustrating that many sites continue to reject email addresses that use subaddressing, I have a Plan B.
I instead use a secondary email address without subaddressing, which is associated with newlywed.science, a domain I use to promote my book Experiments for Newlyweds.
I entered that email address into the sign-up form, but again I was informed that it was invalid.
I took another look at the email validation regex and discovered an even stupider issue.
Here is the portion of their email validation regex that deals with the domain (the part after the @ sign):
[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$
The relevant part of this snippet is at the very end. It says that the top-level domain (TLD) must be between two and four letters long.
The most well-known TLDs, such as .com, .org, .net, .edu, and .gov, tend to have three letters. Country-specific TLDs, such as .uk or .de, tend to have two letters.
But starting around the turn of the century, more and more new TLDs came into use. For instance, .museum became active in June 2001, and .travel began accepting registrations in January 2006.
So TLDs that exceed four letters in length have been around for quite a while.
Yet anyone with an email address associated with such a TLD is locked out of creating an account with this auto seller because of this regex.
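The same kind of check shows the TLD limit in action. Here the leading "^" is my addition (an assumption, so the domain portion can be tested by itself), and the names domainPattern and relaxedDomainPattern are hypothetical. The relaxed variant is only one possible loosening, raising the upper bound to 63, the maximum length of a DNS label; it is not presented as what the seller should necessarily ship.

```javascript
// The domain portion quoted above, with a leading "^" added (an assumption)
// so that the whole domain must match when tested in isolation.
const domainPattern = /^[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$/;

console.log(domainPattern.test("domain.com"));       // true
console.log(domainPattern.test("example.co.uk"));    // true
console.log(domainPattern.test("newlywed.science")); // false
console.log(domainPattern.test("example.museum"));   // false

// Hypothetical relaxed variant: same structure, but the TLD may be 2 to 63 letters.
// It is still letters-only, so it is a sketch rather than a complete fix.
const relaxedDomainPattern = /^[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,63})$/;

console.log(relaxedDomainPattern.test("newlywed.science")); // true
console.log(relaxedDomainPattern.test("example.museum"));   // true
```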
It's frustrating that major websites continue to rely on such shoddy email address validation, and this time I decided to make my frustration known.
I searched around for a contact email address on the site, but the best I could find was a Live Chat feature.
So, at 2 p.m. Eastern, I opened a Live Chat session and encouraged them to fix their terrible regex.
A robo-responder informed me that unfortunately, my message could not be handled, because Live Chat is only available between 8 a.m. and 8:45 p.m. Eastern.
Shaun Gallagher is the author of three popular science books and one silly statistics book.
He's also a software engineering manager and lives in northern Delaware with his wife and children.
Visit his portfolio site for more about his books and his programming projects.
The views expressed on this blog are his own and do not necessarily represent the views of his publishers or employer.