Fonz.js: a better phone validation library

This blog post turned from something exploratory into an npm library. If you’d like to jump to the code on GitHub, click here. Otherwise, keep reading to see how the sausage was made.


While watching the third season of Bosch, I started wondering how many phone numbers there can possibly be. In TV shows or movies, we often see numbers in the (555) XXX-XXXX or 555-XXXX formats, so presumably those are reserved. But the question lingers: how many assignable numbers are there (as in, what’s the upper bound on people or businesses that can have phone numbers)? Interestingly, the Internet provides no clear answer.

The easy answer is, of course:

$$10^{10} = 10,000,000,000$$

We see this naive heuristic reflected in this Yahoo answer, and this Math.SE post. But it can’t possibly be that easy. And it’s not. If we forget about country codes, a North American (± a few countries) phone number is broken up into three parts:
$$\begin{array}{ccc}
\underset{\text{Area code}}{\underbrace{ABC}} & \underset{\text{Exchange code}}{\underbrace{XYZ}} & \underset{\text{Station code}}{\underbrace{NNNN}}\end{array}$$

The Area Code

An area code \(ABC\) obeys the following rules:
$$\begin{array}{c}
\begin{array}{cc}
A=\{2\cdots9\} & AB\neq37\\
B=\{0\cdots8\} & AB\neq96\\
C=\{0\cdots9\} & BC\neq11\\
& \vdots
\end{array}\end{array}$$

As you can imagine, there are many more exceptions. Luckily, we can download a CSV from the NANPA (North American Numbering Plan Administrator) and filter it by NON-RESERVED and ASSIGNED. Essentially, these are the area codes that regular folks and businesses can get. All in all, according to NANPA, we are left with 417 area codes.
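To make the rules above concrete, here is a minimal sketch of a structural area code check. It covers only the handful of rules listed in this post, not the full NANPA list, and the function name `looksLikeAreaCode` is made up for illustration.

```javascript
// Structural check only: encodes just the rules listed above.
// A real validator would look the code up in the NANPA list instead.
function looksLikeAreaCode(code) {
  // A is 2-9, B is 0-8, C is 0-9
  if (!/^[2-9][0-8][0-9]$/.test(code)) return false;

  const ab = code.slice(0, 2);
  const bc = code.slice(1);

  // AB may not be 37 or 96
  if (ab === "37" || ab === "96") return false;

  // BC may not be 11 (this rules out 211, 311, ..., 911)
  if (bc === "11") return false;

  return true;
}

console.log(looksLikeAreaCode("310")); // true
console.log(looksLikeAreaCode("911")); // false
console.log(looksLikeAreaCode("370")); // false
```

Passing this check does not mean an area code is assigned. It only means the code is not ruled out by the rules shown here.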

The Exchange Code

Compared to the area code, the exchange code \(XYZ\) follows a (slightly) simpler ruleset:

$$\begin{array}{c}
\begin{array}{cc}
X=\{2\cdots9\} & YZ\neq11\\
Y=\{0\cdots9\}\\
Z=\{0\cdots9\}\\
\\
\end{array}\end{array}$$
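As a quick sanity check, the exchange code rules fit in one line of JavaScript, and brute-forcing all three-digit strings shows how many codes survive. The function names here are hypothetical and only encode the two rules above.

```javascript
// X is 2-9, Y and Z are any digit, and YZ may not be 11.
function isValidExchangeCode(code) {
  return /^[2-9][0-9]{2}$/.test(code) && code.slice(1) !== "11";
}

// Count every three-digit string that passes the check.
function countExchangeCodes() {
  let count = 0;
  for (let i = 0; i < 1000; i++) {
    if (isValidExchangeCode(String(i).padStart(3, "0"))) count++;
  }
  return count;
}

console.log(isValidExchangeCode("555")); // true
console.log(isValidExchangeCode("411")); // false
console.log(countExchangeCodes());       // 792, i.e. 800 - 8
```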

The Station Code

And finally, we have \(NNNN\) where \(N\) can be any digit, but, yet again, there are some exceptions:

$$\begin{array}{c}
\text{If }XYZ=555\begin{cases}
NNNN\neq\{0100\cdots0199\} & \text{Fictitious use}\\
NNNN\neq1212 & \text{Directory assistance}\\
NNNN\neq4334 & \text{National use}
\end{cases}\end{array}$$
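The station code exceptions only kick in for the 555 exchange, which a small sketch makes obvious. The name `isAssignableStation` is an assumption for illustration, and the check covers only the three exceptions listed above.

```javascript
// Any four digits are fine unless the exchange is 555,
// in which case the exceptions listed above apply.
function isAssignableStation(exchange, station) {
  if (!/^[0-9]{4}$/.test(station)) return false;
  if (exchange !== "555") return true;

  const n = Number(station);
  if (n >= 100 && n <= 199) return false; // 0100-0199: fictitious use
  if (station === "1212") return false;   // directory assistance
  if (station === "4334") return false;   // national use
  return true;
}

console.log(isAssignableStation("555", "0150")); // false
console.log(isAssignableStation("555", "0200")); // true
console.log(isAssignableStation("234", "0150")); // true
```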

Most sources give the count of possible numbers (\(XYZ\text{-}NNNN\)) within any valid area code as:

$$8\times10^{6}-8\times10^{4}-100=7,919,900$$

But Wikipedia and everyone else are wrong. If we take into account the last two special cases of \(555\text{-}NNNN\), we are left with:

$$8\times10^{6}-8\times10^{4}-100-2=7,919,898$$

Finally, multiplying by the number of valid assignable area codes, we get:
$$7,919,898\times 417 = 3,302,597,466$$

That’s over three billion assignable phone numbers!
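If you want to double-check the arithmetic, it fits in a few lines of JavaScript. The function name `countAssignable` is made up for this sketch. The 417 comes from the NANPA CSV mentioned earlier, so it is passed in as a parameter rather than hardcoded.

```javascript
// Reproduces the count above: exchange codes minus the N11 codes,
// times 10^4 station codes, minus the reserved 555 numbers.
function countAssignable(areaCodes) {
  const exchangeCodes = 8 * 10 * 10; // X is 2-9, Y and Z are 0-9
  const n11Codes = 8;                // 211, 311, ..., 911
  const stationCodes = 10 ** 4;      // 0000-9999
  const reserved555 = 100 + 2;       // 555-0100..0199, 555-1212, 555-4334

  const perAreaCode = (exchangeCodes - n11Codes) * stationCodes - reserved555;
  return { perAreaCode, total: perAreaCode * areaCodes };
}

const result = countAssignable(417);
console.log(result.perAreaCode); // 7919898
console.log(result.total);       // 3302597466
```

The total stays well below `Number.MAX_SAFE_INTEGER`, so plain numbers are fine here and there is no need for `BigInt`.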

It was supposed to end here

And that was supposed to be my post. That’s it!

Huge number, some neat math, and hey, I even showed that Wikipedia was wrong. After I wrote this blog post, I started searching around to see exactly how phone validation libraries validate phone numbers. Obviously, phone validation is a pretty big deal. It may sting when a cute girl gives you a fake number after a night out, but it really sucks if you miss out on a lead when running a business. So who does it right? Granted, validating North American phone numbers is a bit complicated, but it’s nothing an intermediate programmer couldn’t write in a couple of days.

Everyone pointed to what seems to be the de facto standard for phone number validation: Google’s libphonenumber. Surely, they must do it right. I even found an online tool where you can test phone numbers via libphonenumber. But, to my surprise, it gets many wrong.

For example, the following phone numbers should not validate:

  • 310-911-1234
  • 770-555-0150
  • 949-411-0110
  • 770-555-1212

— but they do!
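To see why these four should fail, here is a rough sketch that strings the rules from this post together. It is not the fonz.js API: the name `isPlausibleNanpNumber` is hypothetical, and the area code part is a structural check only, where a real validator would consult the NANPA list.

```javascript
// Rough sketch combining the rules from this post.
// Assumes a 10-digit number with no country code.
function isPlausibleNanpNumber(input) {
  const digits = input.replace(/\D/g, ""); // drop dashes, spaces, parentheses
  if (digits.length !== 10) return false;

  const area = digits.slice(0, 3);
  const exchange = digits.slice(3, 6);
  const station = digits.slice(6);

  // Area code: A is 2-9, B is 0-8, AB is not 37 or 96, BC is not 11
  if (!/^[2-9][0-8][0-9]$/.test(area)) return false;
  if (area.startsWith("37") || area.startsWith("96")) return false;
  if (area.endsWith("11")) return false;

  // Exchange code: X is 2-9, YZ is not 11
  if (!/^[2-9]/.test(exchange) || exchange.endsWith("11")) return false;

  // Station code: exceptions apply only to the 555 exchange
  if (exchange === "555") {
    const n = Number(station);
    if ((n >= 100 && n <= 199) || n === 1212 || n === 4334) return false;
  }
  return true;
}

for (const number of ["310-911-1234", "770-555-0150", "949-411-0110", "770-555-1212"]) {
  console.log(number, isPlausibleNanpNumber(number)); // false for all four
}
```

The first and third numbers fail on the exchange code (911 and 411), and the other two fail on the 555 station code exceptions.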

Because it sounds kind of like “phone”

So I thought “Eyyy I bet I could write something up real quick…”

Fonz

And fonz.js was born: an accurate phone validation library for Node.js that tries its best to abide by NANPA standards. International support is more complicated (and Google expertly supports it — although I’m not sure of its accuracy), but it’s a start!