Blocking Form and Comment Spam with a Honeypot
Spam form submissions have become more frequent over the past several years. Bots scour the web looking for contact forms on websites, and submit bogus form fills that are annoying and potentially harmful.
For several years, we have been using the HTTPBL database at http://www.projecthoneypot.org/ to block known comment spammers on our clients' websites. This has been very effective at minimizing spam form submissions since CAPTCHA challenges stopped working well. As a point of reference, a CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. You will be familiar with one if you have ever had to "enter the characters shown" when submitting a web form.
Recently we have improved our ability to block form spam by instituting the HTTPBL block at the server level, instead of on individual websites. When a known suspect IP visits our servers, it is denied access and served a 403 Forbidden page.
To get a feel for how often this occurs, look at this screen capture from our server logs. As you can see in the highlighted area, the server identified 4 HTTPBL matches in just 4 minutes:
You'll notice that the server logs show other blocks as well. These are all instances of an IP trying to do something malicious on our servers, and such attempts occur thousands of times per day.
To understand why the HTTPBL block occurred, simply visit the Project Honeypot website and enter one of the IPs listed (https://www.projecthoneypot.org/ip_126.96.36.199). You will see that the website returns data about suspicious IPs. We use the Threat Rating, a metric that describes how dangerous an IP is based on its observed suspicious activity, and block any IP with a Threat Rating greater than 30.
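As a rough illustration of the threshold check described above, here is a minimal Python sketch. It is not our server configuration. It assumes the DNS-based http:BL interface that Project Honeypot documents, in which a lookup of the form key.reversed-ip.dnsbl.httpbl.org returns an answer shaped like 127.days.threat.type. The function names and the sample access key are hypothetical. A real lookup needs your own access key from Project Honeypot.

```python
import socket

HTTPBL_ZONE = "dnsbl.httpbl.org"
THREAT_THRESHOLD = 30  # from the article: block when Threat Rating > 30


def build_query(access_key: str, ip: str) -> str:
    """Build the DNS name to look up: key, reversed IPv4 octets, zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address")
    return ".".join([access_key, *reversed(octets), HTTPBL_ZONE])


def parse_response(answer: str) -> dict:
    """Split an answer such as '127.3.45.4' into its documented fields."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not a valid http:BL answer")
    return {
        "days_since_last_activity": days,
        "threat": threat,
        "visitor_type": visitor_type,  # bit flags; 0 means search engine
    }


def should_block(answer: str, threshold: int = THREAT_THRESHOLD) -> bool:
    """Apply the article's rule: block when the threat score exceeds the threshold."""
    fields = parse_response(answer)
    if fields["visitor_type"] == 0:
        # Assumption: for search engines the third octet is not a threat score.
        return False
    return fields["threat"] > threshold


def lookup(access_key: str, ip: str):
    """Query the service; returns the answer string, or None if nothing resolved."""
    try:
        return socket.gethostbyname(build_query(access_key, ip))
    except socket.gaierror:
        return None  # not listed, or the lookup failed
```

With this sketch, an answer of 127.3.45.4 would be blocked because 45 is greater than 30, while 127.3.30.4 would be allowed because the rule is strictly "greater than".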
The HTTPBL approach to blocking form spammers works well to keep thousands of bogus form submissions from occurring on our clients' websites. Occasionally new IPs that are not yet identified will slip through, but these issues are generally infrequent and quickly resolve themselves as the bad IPs get listed at Project Honeypot.