http://qs321.pair.com?node_id=327951


in reply to regex to validate e-mail addresses and phone numbers

If you insist upon writing your own regex, you're going to want to pay more attention to character classes, and you want to remember that not all e-mail addresses are in the format:

user@domain.com

Many e-mail addresses will contain additional dots:

user@mail.server.domain.info

I would change the first regexp to:

/^\w[\w\.\-]*\w\@\w[\w\.\-]*\w(\.\w{2,4})$/

I left the parenthesized group in place, since you're apparently trying to capture the top-level domain (.edu, .com, etc.) in $1, but I took off the + at the end, since it's definitely in your way: a + after a capture group repeats the whole group, and $1 then holds only the last repetition. I also kept the requirement that the user and host parts each begin and end with a \w character, with any number of \w characters, dots, or dashes in between. As written, each of those parts needs at least two characters, so a near-minimal matching string would look like:

me@me.com

But this would also match:

my.big-name.sucks-big-time@mail.server-farm.long-domain.coop
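
A quick way to check those claims is to run the pattern over a few sample addresses. This is only a sketch: the sub name check_address and the sample list are made up for illustration, and the pattern is the one given above, unchanged.

```perl
use strict;
use warnings;

# The pattern from above. The dot and dash do not strictly need escaping
# inside a character class, but it does no harm.
my $email_re = qr/^\w[\w\.\-]*\w\@\w[\w\.\-]*\w(\.\w{2,4})$/;

# Hypothetical helper: returns the captured top-level domain (with its
# leading dot) on a match, or undef when the address does not match.
sub check_address {
    my ($addr) = @_;
    return $addr =~ $email_re ? $1 : undef;
}

for my $addr (
    'user@domain.com',
    'user@mail.server.domain.info',
    'me@me.com',
    'my.big-name.sucks-big-time@mail.server-farm.long-domain.coop',
    'a@b.com',                  # one-character user and host: rejected
    'user@mail..server.com',    # consecutive dots in the host: still accepted
) {
    my $tld = check_address($addr);
    printf "%-62s %s\n", $addr, defined $tld ? "ok, TLD $tld" : 'no match';
}
```

Single quotes around the addresses keep Perl from trying to interpolate the @ as an array.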

Read up on character classes. They are your friends. Anyway, the biggest obvious remaining problem (in my opinion) with this regexp is that it still allows multiple consecutive dots or dashes in the middle of the user or host part. This may matter less in the user field (though strictly an unquoted user part can't contain consecutive dots either), but consecutive dots are not allowed in the host field. It might be simpler to write a whole 'nother regexp that looks for consecutive dots or dashes and reject the address based on that.
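
That second check could be sketched along these lines. The sub name has_doubled_punctuation is hypothetical, and the rule it applies (reject any two dots or dashes in a row, anywhere in the address) is the blunt version described above, not a full RFC check.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the address contains two dots
# and/or dashes in a row (.., --, .- or -.), and 0 otherwise.
sub has_doubled_punctuation {
    my ($addr) = @_;
    return $addr =~ /[.\-]{2}/ ? 1 : 0;
}

for my $addr (
    'user@mail.server.com',
    'user@mail..server.com',
    'big--name@domain.com',
) {
    printf "%s: %s\n", $addr,
        has_doubled_punctuation($addr) ? 'reject' : 'keep';
}
```

Be aware that this blunt rule also rejects host names with a legitimate double dash, such as internationalized names beginning with xn--, so you may want to apply the dash part only to the user field.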