I recently needed to verify that a user’s input was a valid US Social Security number.  Initially, I thought it was a simple matter of checking for 3 digits, a dash, 2 digits, another dash, and finally 4 digits.  However, as I started reading online, I realized that each block of numbers has a significance (which I had assumed) and, more importantly for my task, that each has a specific numeric range.

If you want to read the details of the format, see the links at the bottom of the post.  The gist is that as of June 25, 2011, a valid Social Security number must meet the following guidelines:

  • It must begin with 3 digits in the range 001 to 899. The cap was previously 772, but was recently extended to 899.  They cannot be 666 (go figure) and, as the range implies, they cannot be 000.   These numbers were originally associated with an assignment territory, but that is no longer the case for new numbers.  If you’re wondering about 900-999, those are never issued as Social Security numbers; the range is used for Individual Taxpayer Identification Numbers (ITINs), and a small block within it, 987-65-4320 through 987-65-4329, is set aside for use in advertisements.
  • The second set of numbers must be 2 digits in the range 01 to 99. As with the first set, these can’t be 00. Historically, the government issued these group numbers in a particular odd/even order rather than consecutively.  See the links for more information.
  • The last set of numbers is a set of 4 digits, in the range of 0001 to 9999, and also can’t be 0000.  These numbers are sequentially assigned to individual people as their unique id.

So, armed with those guidelines, knowing that there are dashes between the sets, and after seeing how a few other people did this incorrectly, I came up with the following function to validate my data:

function isValidSSN(value) 
{ 
 /* validates a US Social Security Number (SSN):
 + 3 digits from 001 to 899 (can't be 666)
 + 2 digits from 01 to 99 (based on some US govt odd/even formula)
 + 4 digits from 0001 to 9999 (assigned sequentially to individual people)
 + last section verifies that none of the number groups are all zeros
 */
 var re = '^([0-8]\d{2})([ \-]?)(\d{2})\2(\d{4})$'; 

 if (ArrayLen(ReMatch(re,value)) == 0) { return false; } 

 //remove the dashes & spaces to check for zero sequences
 var temp = Replace(Replace(value,'-','','all' ),' ','','all' );

 if(Left(temp,3) == "000" || Left(temp,3) == "666") { return false; } 
 if(Mid(temp,4,2) == "00") { return false; } 
 if(Right(temp,4) == "0000") { return false; } 

 return true; 
}

Let’s break down the Regular Expression first. The first section handles the first 3 digit number set:

^([0-8]\d{2})

The ^ says this value must begin with the pattern that follows, inside the parentheses, ( ).

Then [0-8]\d{2} tells the regex engine to match one digit from 0 through 8 (the [0-8] denotes the range), followed by any 2 other digits, 0 through 9 (the \d{2}). The \d means any digit, and the curly-braced 2 says to look for that 2 times in a row.  So this piece matches 001, 235, 575, or 899. It also matches 000 and 666, which is why the function rules those out separately after the pattern check.
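To see what this first piece accepts on its own, here is a minimal sketch that tests it against a few bare 3-digit strings. It assumes the code runs inside a cfscript block; the names areaPattern and samples are made up for the illustration, and a $ is added to the pattern only so that it can be tried on a 3-character value.

```cfscript
// Hypothetical helper names; only the [0-8]\d{2} piece comes from the function above.
areaPattern = '^[0-8]\d{2}$';
samples = ['001', '899', '900', '000', '666'];

for (i = 1; i <= ArrayLen(samples); i++) {
    // REFind returns the position of the match, or 0 when nothing matches
    matched = REFind(areaPattern, samples[i]) > 0;
    WriteOutput(samples[i] & ': ' & (matched ? 'match' : 'no match') & '<br>');
}
```

Here 001 and 899 match and 900 does not. Note that 000 and 666 match as well, so the pattern alone is not enough to reject them.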

The next section of the pattern simply says “followed by a space, a dash, or nothing”.  The square brackets list the allowed characters, a literal space and a dash (escaped as \-), and the ? after them means zero or one occurrence, which covers the “nothing” case:

([ \-]?)

The next section is also simple, stating that the next piece should be any 2 digits.  Again, \d means any digit 0-9:

(\d{2})

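The next part, the \2, is a backreference: it tells the regex engine to match again whatever text the second parenthesized group captured. In this case, the same space, dash, or nothing that was found between the first and second number sets must appear again between the second and third number sets, so mixed separators are rejected.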
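The effect of the backreference is easiest to see with inputs that differ only in their separators. The following is a small sketch, again assuming a cfscript block; samples is a made-up name, and the pattern is the one from the function above.

```cfscript
re = '^([0-8]\d{2})([ \-]?)(\d{2})\2(\d{4})$';
// Same digits each time; only the separators change
samples = ['123-45-6789', '123 45 6789', '123456789', '123-45 6789'];

for (i = 1; i <= ArrayLen(samples); i++) {
    // \2 must repeat exactly what ([ \-]?) captured: a dash, a space, or nothing
    matched = REFind(re, samples[i]) > 0;
    WriteOutput(samples[i] & ': ' & (matched ? 'match' : 'no match') & '<br>');
}
```

The first three values use one separator style consistently and match. The last one mixes a dash with a space, so the \2 cannot repeat the captured dash and the match fails.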

And now for the last section, we simply need any 4 digits. The only new character here is the $, which tells the engine that the expression inside the parentheses must come at the end of the value:

(\d{4})$

Now that we have broken down the pattern, the rest of the function is just simple string functions.  First we run this pattern against our input value.  The ReMatch() function returns an array of the strings that matched. So we run that, and check to see if the returned array has any length. If not, we immediately return false and the function ends:

if (ArrayLen(ReMatch(re,value)) == 0) { return false; }
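As a quick illustration of that check, here is a sketch of what ReMatch() hands back for one value that fits the pattern and one that does not. The variable names good and bad are only for the example.

```cfscript
re = '^([0-8]\d{2})([ \-]?)(\d{2})\2(\d{4})$';

// The whole value fits the pattern, so the array holds that one matched string
good = REMatch(re, '123-45-6789');

// Only 3 digits in the last set, so nothing matches and the array is empty
bad = REMatch(re, '123-45-678');

WriteOutput('good: ' & ArrayLen(good) & ', bad: ' & ArrayLen(bad));
```

Because the pattern is anchored with ^ and $, the array can never hold more than one element, so comparing ArrayLen() to 0 works as a simple yes/no test.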

In the last part, we need to check each number set to be sure that none of them is all zeros and that the first set is not all sixes. First we strip out any dashes and spaces, so we are working with a 9-digit number no matter what the user entered.  Then, with simple string functions, we check the first three digits, the middle 2 digits, and finally the last four digits.  If any of these is all zeros (“000”, “00”, or “0000”), or the first set is “666”, we return false and end the function.
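To make the slicing concrete, here is a short sketch that applies the same string functions to a sample value. The names area, groupNum, and serial are invented for the illustration and do not appear in the function itself.

```cfscript
value = '123 45 6789';

// Strip dashes and spaces so only the 9 digits remain: "123456789"
temp = Replace(Replace(value, '-', '', 'all'), ' ', '', 'all');

area     = Left(temp, 3);    // first three digits: "123"
groupNum = Mid(temp, 4, 2);  // two digits starting at position 4: "45"
serial   = Right(temp, 4);   // last four digits: "6789"

WriteOutput(area & ' / ' & groupNum & ' / ' & serial);
```

Since the regular expression has already guaranteed exactly 9 digits, the fixed positions passed to Left(), Mid(), and Right() always line up with the three number sets.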

If all of these steps pass, that is, if the value matches the allowed pattern and none of the sets is all zeros or, for the first set, all sixes, then we simply return true, letting the caller know that the value they entered is indeed a valid US Social Security number based on the regulations as of June 25, 2011.
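Putting it all together, a quick way to exercise the function is to run it over a handful of values that each trip a different rule. This is only a sketch; it assumes isValidSSN() from above is defined on the same page, and the array name tests is made up.

```cfscript
tests = [
    '123-45-6789',  // well-formed
    '000-12-3456',  // first set is all zeros
    '666-12-3456',  // first set is 666
    '900-12-3456',  // first set is above 899
    '123-00-4567',  // second set is all zeros
    '123-45-0000',  // last set is all zeros
    '123-45 6789'   // mixed separators
];

for (i = 1; i <= ArrayLen(tests); i++) {
    WriteOutput(tests[i] & ' => ' & (isValidSSN(tests[i]) ? 'valid' : 'invalid') & '<br>');
}
```

Only the first value should come back as valid. The 900 and mixed-separator cases are stopped by the regular expression, and the rest are caught by the string checks that follow it.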

Sources:

Wikipedia: “Social Security Number”

Social Security Online: History

ColdFusion 9: Regular Expression Syntax

CFQuickDocs: ReMatch()
