Spam-Proof Your Web Forms

Whether you run comment forms or contact forms, spam bots are crawling your website looking for forms to submit their links to. With comment forms, the result is tedious moderation; with contact forms, it is a stream of nonsense emails designed to spam search engines. In most cases these bots are “dumb” and simply submit any form they find on the internet, hoping something sticks. Because of this, there are several traps we can add to catch most of these bots and automatically filter out their junk.

One easy trap we can add is a form element hidden by CSS. Humans won’t be able to see the element, but bots will find it in your form HTML and will submit something to it. If that field comes back filled in, you can tell a bot from a normal user and ignore the request.

<input type="text" name="email2" style="visibility: hidden;">
<?php if (!empty($_POST['email2'])) { return false; } ?>

Another check you can place is based on a “fingerprint” I detected from one seemingly popular spam bot. This particular bot would use your own domain in its email addresses. If it doesn’t make sense for someone within your organization to submit a form (such as a contact form) using your domain, you can ignore those posts.

<?php if (stripos($_POST['email'], 'YOURDOMAIN.COM') !== false) { return false; } ?>

Another trap we can use catches bots trying to hack your email forms. These bots will post mail headers to your email forms in an attempt to use your server to send out email spam. The way to catch these spam bots is to ensure that fields that should only be one line, particularly the email and subject fields, are in fact one line. This hack works by adding line breaks into mail headers, and your form should be using one-line text inputs for those fields. Any multi-line data automatically gives away this attack.

<?php if (preg_match('/[\r\n]/', $_POST['email'])) { return false; } ?>
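The same one-line check is worth running on every field that ends up in a mail header, not just the email address. Here is a rough sketch of how that might look. The function name hasHeaderInjection and the field names in the usage line are made up for illustration, and it treats a carriage return as a line break as well as a newline.

```php
<?php
// Hypothetical helper (not from the original article): returns true if any
// of the named single-line fields contains a carriage return or line feed,
// or is not a plain string at all.
function hasHeaderInjection(array $post, array $fields) {
    foreach ($fields as $field) {
        // A missing field is treated as an empty string, which is harmless here.
        $value = isset($post[$field]) ? $post[$field] : '';
        if (!is_string($value) || preg_match('/[\r\n]/', $value)) {
            return true;
        }
    }
    return false;
}
```

In a submit script this could be called as hasHeaderInjection($_POST, array('email', 'subject')), rejecting the request when it returns true.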

“Dumb” bots will also generally just crawl your pages to find forms, then submit to the form’s target page using their own methods. More specifically, they won’t actually be loading and submitting the page your form is on; they simply send their own request to make it seem like they did. This is useful to know for the next two traps.

The first trap is to use JavaScript to insert elements into your form that the bots won’t know about. This is essentially the opposite of the first technique, where we wanted bots to know about a field. In this case you can use JavaScript to output a real form field or hidden field that will be submitted when someone submits your form page, but will be missing when a bot posts directly without loading your form. The code below would go somewhere in between your <form></form> tags.

<script language="Javascript">document.write('<input type="text" name="Email"/>');</script>

The bot would never see this field, but you would expect to find it in any submission that came from your form page.
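On the receiving end, the test is simply whether the script-generated field arrived at all. A minimal sketch, assuming the field is named Email as in the snippet above; the helper name hasScriptField is invented for this example.

```php
<?php
// Hypothetical helper (not from the original article): returns true when the
// JavaScript-inserted field is present in the submitted data and not blank.
function hasScriptField(array $post, $field = 'Email') {
    return isset($post[$field])
        && is_string($post[$field])
        && trim($post[$field]) !== '';
}
```

In the submit script you might write if (!hasScriptField($_POST)) { return false; }. Keep in mind that a real visitor with JavaScript turned off will fail this test too, so it is a trade-off.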

The next technique that takes advantage of bots not actually submitting your form page is to set a session or cookie variable on the form page and verify it in the submit code.

<?php
$strCode = md5($_SERVER['REMOTE_ADDR'] . date('YmdHis'));
session_start();
$_SESSION['code'] = $strCode;
?>

<input type="hidden" name="code" value="<?php echo $strCode; ?>"/>

Then on your form submit page you would check to see if the code field matched the session, like so:

<?php session_start(); if (empty($_SESSION['code']) || $_SESSION['code'] != $_POST['code']) { return false; } ?>

This method can be expanded upon so that the actual code isn’t passed in the form, only pieces of it along with other identifying characteristics like the User Agent. Your PHP code would then re-assemble the pieces to see if they match the session. Another idea would be to set an expiration on the code, so that it is only valid for a certain period of time.
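One possible way to combine both ideas is sketched below. The full code stays in the session, only the issue time travels in the form, and the submit script rebuilds the code from the posted time plus the visitor's IP and User Agent. The function names, the 30 minute lifetime, and the choice of which piece to put in the form are all assumptions made for illustration, not a definitive implementation.

```php
<?php
// Hypothetical helpers (not from the original article).

// Build the code from the visitor's IP, User Agent, and the issue time.
function buildCode($ip, $userAgent, $issuedAt) {
    return md5($ip . '|' . $userAgent . '|' . $issuedAt);
}

// Form page: store the full code in the session and return the issue time,
// which is the only piece that goes into a hidden form field.
function issueCode(array &$session, $ip, $userAgent, $now) {
    $session['code'] = buildCode($ip, $userAgent, $now);
    return $now;
}

// Submit page: rebuild the code from the posted time and the current
// request's IP and User Agent, then compare it to the session copy.
// Codes older than $maxAge seconds (assumed default: 1800) are rejected.
function verifyCode(array $session, $postedTime, $ip, $userAgent, $now, $maxAge = 1800) {
    if (empty($session['code']) || !is_string($postedTime) || !ctype_digit($postedTime)) {
        return false;
    }
    $issuedAt = (int) $postedTime;
    if ($issuedAt > $now || $now - $issuedAt > $maxAge) {
        return false;
    }
    return $session['code'] === buildCode($ip, $userAgent, $issuedAt);
}
```

On the form page you would call issueCode($_SESSION, $_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_USER_AGENT'], time()) after session_start() and echo the result into a hidden field. On the submit page, pass that posted value to verifyCode() along with the same server values and time().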

The following is an example of a page that employs several of the above techniques.

<?php
$strCode = md5($_SERVER['REMOTE_ADDR'] . date('YmdHis'));
session_start();
$_SESSION['code'] = $strCode;
?>

<html>

<head><title>Contact</title></head>

<body>

<form method="post" action="submitcontact.php">

<input type="hidden" name="code" value="<?php echo $strCode; ?>"/>

Name: <input type="text" name="Name"/><br/>
Email: <script language="Javascript">document.write('<input type="text" name="Email"/>');</script><br/>

<input type="text" name="Email2" style="visibility: hidden;"/>

Message: <textarea name="Message" cols="40" rows="3"></textarea>

<input type="submit" value="Send"/>

</form>

</body>

</html>

And now the code that handles and validates your form:

<?php

// The session must be started before $_SESSION['code'] can be read
session_start();
Contact();

function Contact() {
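 $strName = isset($_POST['Name']) ? $_POST['Name'] : '';
 $strEmail = isset($_POST['Email']) ? $_POST['Email'] : '';
 $strEmail2 = isset($_POST['Email2']) ? $_POST['Email2'] : '';
 $strCode = isset($_POST['code']) ? $_POST['code'] : '';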
 
 // Catch Hidden Input
 if (!empty($strEmail2)) {
  return false;
 }
 
 // No Javascript Input?
 if (strlen($strEmail) < 1) {
  return false;
 }
 
 // Email Should be One Line
 if (preg_match('/[\r\n]/', $strEmail)) {
  return false;
 }

 // Email Shouldn't Be From Our Domain
 if (stripos($strEmail, 'infinetsoftware.com') !== false) {
  return false;
 }

 // Code Doesn't Match?
 if (empty($_SESSION['code']) || $_SESSION['code'] != $strCode) {
  return false;
 }
 
 // Passed All Tests, Carry on here...
}

?>

Other methods you can employ include CAPTCHAs like reCAPTCHA or tests like the WordPress plugin Did You Pass Math? These methods are all traps to catch spam bots that crawl the internet and submit any form they find. If someone were determined to spam your particular site, some of these methods would easily be defeated, but this little bit of effort will deter almost all of the spam you will encounter.
