We apply Benford’s Law here at the Oregon Audits Division as part of our fraud investigations.
For those who haven’t heard of it yet, Benford’s Law describes a pattern that occurs naturally in certain data sets. Just as the bell curve predicts how values are distributed around a mean, Benford’s Law predicts how often each digit appears as the first digit of a number. You can use it to flag potentially fraudulent transactions by looking for digits that occur more or less often than predicted.
Benford’s Law predicts that the digit 1 will occur as the first digit more often than any other digit. In fact, 1 is about 6.5 times more likely to lead than 9 (30.1% vs. 4.6%). The law can also be applied to the first two digits and in other ways, but we won’t get into that now.
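For readers who like to see the numbers, here is a minimal Python sketch of the standard first-digit formula, log10(1 + 1/d). The function name `benford_expected` is made up for this illustration.

```python
import math

def benford_expected(digit):
    """Expected share of numbers whose first digit is `digit` (1 through 9)."""
    return math.log10(1 + 1 / digit)

# Print the expected share for each possible first digit.
for d in range(1, 10):
    print(f"{d}: {benford_expected(d):.1%}")
```

The first and last lines printed are the 30.1% and 4.6% quoted above, and the nine shares add up to 100%.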
So what data sets conform to Benford’s? Some, like the drainage of rivers, have nothing to do with auditing, but plenty of financial transactions conform too. First, you want a data set with a large number of records, ideally over 1,000. This is one of the cases where a sample of 30 is far too small.
Second, you want data that is not limited. ATM transactions, for example, are limited because there are minimum and maximum withdrawals, and they generally require increments of $20. Assigned values like invoice numbers also count as limited. Every digit (1 through 9) should be possible as a first digit.
The data should also ideally cross multiple orders of magnitude (e.g. 1 to 10, 10 to 100, 100 to 1,000).
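The size and range criteria above can be roughed out in a few lines of Python. This is only a sketch: the function name, the `min_orders=2` cutoff, and the returned field names are my own assumptions, and the check cannot tell you whether values are capped or assigned, which still takes judgment.

```python
import math

def quick_suitability_check(amounts, min_records=1000, min_orders=2):
    """Rough screen: enough records, and values spanning several orders of magnitude.

    min_records follows the "over 1,000" guideline; min_orders is an assumed cutoff.
    """
    positive = [abs(a) for a in amounts if a != 0]
    if not positive:
        return {"records": 0, "orders_of_magnitude": 0.0, "looks_suitable": False}
    # Orders of magnitude between the smallest and largest absolute value.
    orders = math.log10(max(positive) / min(positive))
    return {
        "records": len(positive),
        "orders_of_magnitude": orders,
        "looks_suitable": len(positive) >= min_records and orders >= min_orders,
    }

# ATM-style data: few distinct values, narrow range, so it fails the screen.
atm_like = [20 * i for i in range(1, 21)]
print(quick_suitability_check(atm_like))
```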
Here’s a list of data that should generally conform:
- Home addresses
- Bank account balances
- Census data
- Accounting-related data such as Accounts Receivable
- Transaction level data
Now that I know what data to use, how can I analyze it? With Excel, of course!
1 – Load Data in Excel
2 – Calculate first digit
3 – Run Benford’s using COUNTIF
4 – Graph
The following uses real-world data that helped to convict several fraudsters in Oregon.
Screenshot of Steps 1 & 2
Using the LEFT function, you can extract the first digit of a number.
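If you would rather script this step, the same idea can be sketched in Python. The helper name `first_digit` is hypothetical. Unlike a plain one-character LEFT, this version skips a minus sign and any leading zeros, so -42.50 yields 4 and 0.05 yields 5.

```python
def first_digit(value):
    """Return the first non-zero digit of a number, or None if it has none (e.g. 0)."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

# A few made-up amounts, for illustration only.
for amount in [100, -42.50, 0.05, 0]:
    print(amount, "->", first_digit(amount))
```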
Screenshot of Step 3
Using the COUNTIF function, you can count how many times each first digit appears in your data. You will also need to convert each count to a percentage of the total. The log formula on the right is Benford’s Law in numerical form.
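The count, percentage, and log columns can also be sketched in Python. The names `leading_digit` and `benford_table` are made up for this example, and the sample amounts are invented; they are not the case data.

```python
import math
from collections import Counter

def leading_digit(value):
    """First non-zero digit of a number, or None if it has none."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def benford_table(amounts):
    """Return one row per digit: (digit, count, actual share, expected share)."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    counts = Counter(digits)
    total = len(digits)
    rows = []
    for d in range(1, 10):
        actual = counts[d] / total if total else 0.0
        expected = math.log10(1 + 1 / d)  # Benford's Law for the first digit
        rows.append((d, counts[d], actual, expected))
    return rows

# Invented amounts, for illustration only.
sample = [100, 100, 100, 250, 37.5, 912, 18, 100, 4.2, 66]
for digit, count, actual, expected in benford_table(sample):
    print(f"{digit}: count={count} actual={actual:.1%} expected={expected:.1%}")
```

Ten records is far too few for a real test; the sample is only there to show the shape of the output.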
Screenshot of Step 4
Looking at the graph, you can see that the digit 1 is overrepresented. The next step is to drill down on the records that do not match Benford’s. A closer examination of the records with a first digit of 1 turns up a large number of $100 transactions. Most, if not all, of those $100 transactions were fraudulent. By using Benford’s you can quickly identify suspicious patterns to help detect fraud.
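One way to do that drill-down outside of Excel is to list the most frequent amounts that start with the suspicious digit. This is a sketch only: `most_common_amounts` and `leading_digit` are hypothetical names, and the sample amounts are invented rather than taken from the case.

```python
from collections import Counter

def leading_digit(value):
    """First non-zero digit of a number, or None if it has none."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def most_common_amounts(amounts, digit, top=5):
    """Most frequent amounts whose first digit is `digit`, as (amount, count) pairs."""
    matching = [a for a in amounts if leading_digit(a) == digit]
    return Counter(matching).most_common(top)

# Invented amounts, for illustration only.
sample = [100, 100, 100, 250, 37.5, 912, 18, 100, 4.2, 66]
print(most_common_amounts(sample, digit=1))
```

A single amount dominating the list, as the repeated 100 does here, is the kind of pattern worth pulling supporting documents for.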
Benford’s will lead to false positives, so do not assume that an outlier has to be fraud. Next time: how to do Benford’s in ACL and why you should use the 2-digit Benford’s test.