Testing Benford’s Law
- July 20, 2011
There is an odd numeric phenomenon in large datasets. If leading digits were uniformly distributed, each digit (1 through 9) would appear first about 11% of the time. Benford’s Law predicts a logarithmic distribution instead: digit d leads with probability log10(1 + 1/d), so 1 comes first about 30% of the time and 9 less than 5% of the time. It doesn’t apply to all large datasets, but it holds often enough that it is used to detect possible fraud in accounting, socio-economic data and elections. In the United States, evidence based on Benford’s Law is legally admissible in criminal cases at the federal, state, and local levels.
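To make the comparison concrete, here is a minimal Python sketch of how one might check a dataset against Benford’s Law. The function names and the sample data (the first 1,000 powers of 2, a sequence commonly used to illustrate the law) are illustrative assumptions and are not taken from the Testing Benford’s Law site. The sketch also assumes the values are non-zero integers, such as friend counts.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit d: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """First decimal digit of a non-zero integer (sign is ignored)."""
    return int(str(abs(n))[0])


def leading_digit_shares(values):
    """Observed share of each leading digit; zeros are skipped."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Hypothetical sample data: the powers of 2 from 2**0 to 2**999.
sample = [2 ** k for k in range(1000)]

expected = benford_expected()
observed = leading_digit_shares(sample)

# Print the two distributions side by side for a visual comparison.
for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

A real test would swap in the leading digits of an actual dataset and would typically add a goodness-of-fit measure (for example a chi-squared statistic) rather than comparing the shares by eye.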
Testing Benford’s Law is a simple experiment to see how many large, publicly accessible datasets satisfy Benford’s Law. The site’s creators, Jason Long and Bryce Thornton, have tested a variety of such datasets, including our Twitter Census: Twitter User by Friend Count dataset, and found some striking results.

