Identifying Bot Commenters on Reddit using Benford’s Law

How can you identify non-human actors on social media? There are many ways to do this, and each method has its strengths and weaknesses. In this post, I discuss how to use Benford’s Law to identify non-human actors in user interaction logs. Application of Benford’s Law Benford’s Law is an observation that a collection of numbers that measure naturally occurring events of items tend to have a logarithm frequency distribution for the first digit of these numbers. The are several characteristics of a naturally occurring set of numbers that Benford’s Law takes advantage of: The order of magnitude of the number in the set varies uniformly The numbers vary with multiplicative fluctuations The distribution of numbers is scale invariant The exact distribution of first digits that Benford’s Law predicts is: This results in a distribution that looks like this: For a collection of numbers, if the frequency of the numbers’ first digits does not align well with the distribution shown Read More …