Three things I did to prevent Drupal registration spam

It started slowly. Every few days I’d have a suspicious-looking username pop up in my user list. I thought little of it at first — it’s free to register at Darkadia (my game cataloging service), and, beyond what’s intended, there’s nothing untoward you can do once you’ve logged in. So not a big worry.

But over time the frequency of these spam registrations increased, until I was getting five or more per day. Finally I got curious enough to find out exactly how many of my site’s users were genuine. A simple database query showed me how many registered users had never returned after that initial registration and login. I was shocked: by my estimate, fully a third of my entire user base consisted of spam registrations.
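For the curious, the query looked something like this. It's a sketch from memory: it assumes a Drupal 6 database named "drupal", and relies on the fact that in the users table the created and access columns hold Unix timestamps, with access never advancing far past created for an account that registered once and vanished.

```shell
# Count users who never came back after registering.
# "drupal" is a placeholder database name -- substitute your own.
# "access" is the last-access timestamp; for a one-visit spam account
# it never advances much beyond "created".
mysql -u root -p drupal <<'SQL'
SELECT COUNT(*) AS suspected_spam
FROM users
WHERE uid > 0
  AND access <= created + 3600;
SQL
```

Compare that count against a plain `SELECT COUNT(*) FROM users WHERE uid > 0` to get the proportion.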

Something had to be done.

I suppose your first thought at this point might be to simply add a CAPTCHA. But there are few things I hate more on the web than trying to figure out what illegible numbers and letters have been smudged onto those CAPTCHA images. I was not going to inflict that pain on real people in order to thwart a registration bot.

Nope, I was going to need something a little more subtle than that. My attack on spambots would be threefold:

1. Block the offending domains

I noticed that a lot of the spam registrations were making use of email addresses from the domains big-post.com and mail-4server.com. If those had been Gmail or Hotmail addresses I might have left it at that, but as I did not have a single valid user from either of the offending domains, I felt it’d save me a lot of hassle to simply ban those domains outright. This ability is built right into Drupal. I went to the Access Rules section in the admin panel (admin/user/rules), and added a DENY rule for each of the domains. The rule configuration takes a wildcard, so I could simply enter %@big-post.com and %@mail-4server.com.
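Drupal's % wildcard behaves like a shell glob, so conceptually the rule boils down to a pattern match on the email address. Here's a rough sketch of the logic (not Drupal's actual code):

```shell
#!/bin/sh
# Sketch of the access rule: %@big-post.com is effectively the
# glob pattern *@big-post.com applied to the registration email.
is_banned() {
  case "$1" in
    *@big-post.com|*@mail-4server.com) return 0 ;;
    *) return 1 ;;
  esac
}

for addr in bot@big-post.com human@gmail.com; do
  if is_banned "$addr"; then
    echo "DENY  $addr"
  else
    echo "ALLOW $addr"
  fi
done
```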

Update

Thanks to @typhonius for pointing out in the comments that the access rules functionality has been removed from the Drupal 7 core. It is now provided by the User restrictions module.

2. Filter out the bots without annoying my users

For this I turned to a great little module called Spamicide, which lays an effective honeypot trap for the spambots. It adds an input field to your registration form that is hidden from your users, but visible to spambots. When the field is filled in, the registration attempt is rejected. Spamicide supports any form on a Drupal website and, with some measures built in to avoid detection, it works a treat — in the three months since I turned it on, it’s blocked over 900 registration attempts.
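The principle is easy to sketch: the form gains an extra field that CSS hides from humans, while a bot dutifully fills in every field it finds, so any submission with a value in the trap field can be rejected out of hand. In rough terms (the details are illustrative; Spamicide generates and rotates its own field names):

```shell
#!/bin/sh
# Honeypot sketch: the trap field is hidden from humans with CSS, so a
# real user submits it empty; a bot stuffs every field it sees.
check_submission() {
  trap_value="$1"
  if [ -n "$trap_value" ]; then
    echo "rejected"
  else
    echo "accepted"
  fi
}

check_submission ""                     # human: hidden field left empty
check_submission "http://spam.example"  # bot: filled in the hidden field
```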

3. Require email verification

Implementing the first two measures dramatically reduced the number of spam registrations I was receiving, but it didn’t quite get rid of all of them. And now that I had the bit between my teeth, I couldn’t just let it go. Requiring my visitors to verify their email addresses was nevertheless a tough decision, because it was going to be an inconvenience. Ultimately though, I just wasn’t comfortable with the idea of having spambots with valid accounts on my site — I could well implement some change in the future that would give spammers free rein to ply their trade.

Drupal supports email verification out of the box, so if you’re happily using that you can skip the rest of this section. But what if you want users to choose their password during registration? You can’t have it both ways with Drupal running in its standard configuration — you either have email verification enabled, or you disable that to let new users choose their password. But the problem with letting them choose their password is that anyone can register using any bogus email address and now they’re a fully authenticated user of your system and, depending on how your permissions are configured, they can start spamming you to death.

Enter LoginToboggan. This module modifies Drupal’s login system in a number of ways, and you can check the project page for a full rundown. What we’re interested in, though, is that it allows you to require email verification AND let users choose their own passwords. It does this by assigning the new user a role with a minimum set of permissions, and automatically upgrading the user once the email address is verified. You can even set it to delete unverified accounts after a set interval.

Conclusion

Completing step 3 was the final nail in the spammers’ coffin, at least as far as Darkadia is concerned. I’m convinced the above measures will stop all spambots, and all but the most determined of spammers. And most importantly, I didn’t have to penalise my own users just to make life difficult for the bad guys.

Alternatives

Mollom is a popular option for preventing spam on your Drupal site. It works by employing a learning algorithm to determine whether a registration attempt or posted comment is genuine.

Update

The Honeypot module provides the same form protection as Spamicide, but it also adds a timestamp-based deterrent that ensures a small amount of time has passed between page load and form submission. It is in use on drupal.org, so is clearly a solid option. (Thanks to @servercheck for pointing this out in the comments.)
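The timing check is simple in principle: stamp the form when it's rendered and reject submissions that come back faster than any human could type. Roughly like this (the five-second threshold is my own choice for illustration, not necessarily the module's default):

```shell
#!/bin/sh
# Time-gate sketch: reject submissions that arrive implausibly soon
# after the form was served.
MIN_SECONDS=5

check_timing() {
  rendered_at="$1"    # epoch seconds when the form was served
  submitted_at="$2"   # epoch seconds when the submission came back
  if [ $((submitted_at - rendered_at)) -lt "$MIN_SECONDS" ]; then
    echo "rejected: too fast"
  else
    echo "accepted"
  fi
}

check_timing 1000 1001   # a bot submitting one second after page load
check_timing 1000 1030   # a human taking thirty seconds
```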

Recreating my perfect web development environment using a Parallels VM and CentOS

As MacUpdate has just launched an excellent bundle deal that includes the virtualization software Parallels Desktop for Mac, I decided the time was ripe to replace my ageing copy of VMware Fusion and its creaky installation of Ubuntu Server. Almost all of my web development work is done on my MacBook Pro, and my previous setup of running Ubuntu in a VM as my local web server has worked very well. As my external hosting is done on a VPS running CentOS, I thought it'd be a good idea to start running the same OS locally as well. Parallels, meanwhile, has been garnering good reviews, and I was particularly drawn to its improved 3D performance over VMware Fusion (who doesn't love a bit of gaming now and then), as well as its improved power management features, which should lead to longer battery life.

I had my VMware Fusion/Ubuntu server setup configured to my particular tastes, and if the new setup was going to work out, Parallels would have to be able to replicate all of that functionality. In particular it would have to:

  1. Let me configure the guest OS (CentOS Linux) with a fixed IP address accessible from the host computer (Mac OS), and
  2. Let me keep my development files outside of the VM, i.e. on the host computer, so that I can edit the files in Mac OS, yet have them served by the VM.

I'm happy to report that everything went swimmingly, and I was back up and running with my development work within a day. There was a fair bit of trial and error along the way, and I've recorded the steps here for your enjoyment. Hope it helps!

Prepare Parallels networking

The first thing we need to do is tweak Parallels' networking settings: we'll leave a gap in the DHCP range that we can use to assign a static IP address to our VM.

  1. Open the Parallels preferences, and switch to the Advanced tab.
  2. Click Change settings next to Network.
  3. Under Shared, ensure IPv4 DHCP is enabled.
  4. Configure the following values:
    • Start Address: 10.211.55.20
    • End Address: 10.211.55.254
    • Subnet mask: 255.255.255.0

Install CentOS

Next we download and install CentOS. I've chosen the CentOS 6 minimal distribution as I want to keep things compact and install only the components I need; I've forgone a GUI, for example. The installation itself is pretty straightforward, but pay particular attention to the network configuration step.

  1. Download a CentOS 6 64-bit minimal distribution. The filename should be something like CentOS-6.4-x86_64-minimal.iso, but that will depend on the exact version you're downloading.
  2. Open Parallels and select the downloaded ISO. The installation will run.
  3. Select Install or upgrade an existing system.
  4. Live dangerously and skip the media test. The CentOS installer will now launch.
  5. Click Next, then select your language and keyboard preferences.
  6. Choose Basic Storage Devices.
  7. Choose Discard any data when the installer asks how to configure the filesystem.
  8. Specify the host name. I've named mine "firefly". You can use anything you like, but remember it, as we'll be using it again later.
  9. Click Configure Network:
    1. Select System eth0 and click Edit.
    2. Tick Connect automatically and Available to all users.
    3. Under IPV4 settings select Manual from the Method drop down.
    4. Under Addresses, click Add and enter the following details:
      • Address: 10.211.55.2
      • Netmask: 255.255.255.0
      • Gateway: 10.211.55.20 (the first IP in the DHCP range)
    5. Under DNS servers, enter 10.211.55.20 again.
    6. Click Apply then close the Network dialogue.
  10. Click Next, then select your timezone.
  11. Specify your root password. Click Next.
  12. Choose Use all space, then Next.
  13. Choose Write changes to disk, then wait for the file system to be configured.
  14. The installation will now start. When finished, choose Reboot.

Check and prepare CentOS installation

  1. Login as root with the password you chose during the installation.
  2. Check the network settings are correct by entering:
        ifconfig
        

    You should see a listing for eth0 with a confirmation of the IP address you specified during the installation (10.211.55.2).

  3. You should also get a response when you try to ping an outside address:

        ping www.google.com
        
  4. Upgrade all the installed packages:
    yum -y update
  5. Install the CentOS setuptool and packages for managing firewall, network, authentication and system service settings:
    yum -y install setuptool system-config-securitylevel-tui authconfig system-config-network-tui ntsysv
  6. Run setuptool:
    setup
    1. Now choose Firewall configuration and then Run tool.
    2. The firewall should be enabled.
    3. Choose Customize then tick the boxes for SSH and WWW to allow access from the host machine.
    4. Click Forward, Close, OK, then Yes.
  7. I experienced some problems configuring the Apache document root, as well as mounting shared folders in the virtual machine, and found that disabling SELinux (Security-Enhanced Linux) solved all of that. To disable it, open /etc/selinux/config for editing and change the SELINUX line to:
        SELINUX=disabled
        

    You'll now need to reboot the VM:

    reboot
  8. It's bad practice to be logged in as root all of the time, so let's create a normal user account (replace my name with your own):
    adduser rob

    Set a password for the new account:

    passwd rob

    Just remain logged in as root for the duration of this tutorial.

  9. In the future when you're logged in as the new user, you can perform administrative tasks that require root access by prefixing commands with sudo. We need to tell the system that the new user is allowed to use sudo:

    Edit the sudo configuration file:

    vi /etc/sudoers

    Add the following line to the bottom of the file (again, substituting your own username for mine):

    rob    ALL=(ALL)   ALL
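Incidentally, if you prefer one-liners to vi, the two file edits above boil down to a sed and an append. The sketch below works on scratch copies in /tmp so you can try it safely; on the real VM you'd target /etc/selinux/config, and for sudoers you should really use visudo rather than editing the file directly:

```shell
#!/bin/sh
# Make scratch copies to demonstrate on (substitute the real files on the VM).
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > /tmp/selinux-config
printf 'root\tALL=(ALL)\tALL\n' > /tmp/sudoers-copy

# 1. Disable SELinux by rewriting the SELINUX= line in place (GNU sed).
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /tmp/selinux-config

# 2. Grant the new user sudo rights by appending the rule.
printf 'rob\tALL=(ALL)\tALL\n' >> /tmp/sudoers-copy

cat /tmp/selinux-config   # the SELINUX line now reads "disabled"
```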

Install Parallels Tools

The installation of Parallels Tools is required for certain important features to work, notably sharing folders between the guest and host.

  1. With the virtual machine selected, choose Virtual Machine from the Parallels menu, and click Install Parallels Tools…
  2. This is supposed to mount the Parallels Tools ISO, but as that didn't work for me, I had to mount it manually:
    mount -o exec /dev/cdrom /media
  3. Once mounted you can run the Parallels Tools installer:
    /media/install
  4. Follow the onscreen prompts to complete the installation, ensuring that you answer yes to the prompt to download missing components.
  5. Reboot the VM when finished:
    reboot

Install and configure Apache

  1. Install Apache:
    yum -y install httpd
  2. Configure Apache to start up at boot time:
    chkconfig --level 23 httpd on
  3. Start Apache:
    service httpd start
  4. If you now enter the IP address of your guest (10.211.55.2, as we configured earlier) in a browser on the host computer, you should see the default Apache page.
  5. Let's make the Apache server respond to a friendly name instead of an IP address. We'll use the name you specified during the CentOS installation.

    Set it as the server's host name (I used "firefly"):

    hostname firefly

    Now edit Apache's configuration file:

    vi /etc/httpd/conf/httpd.conf

    Do a search for the line containing ServerName, and change it to:

    ServerName firefly:80

    Now add the host name to the server's hosts file. Edit the hosts file:

    vi /etc/hosts

    Add the following line to the bottom:

    10.211.55.2 firefly

    Finally, to access the server by this friendly name from the host computer, we need to add the same line to the host computer's hosts file (that's /etc/hosts on Mac OS too):

    10.211.55.2 firefly


Set Apache's document root to a folder on the host computer

I prefer keeping my actual development files outside the virtual machine. If the VM ever gets corrupted, I know my important files will still be safe and sound. This also helps with backups, as your backup program won't have to back up the entire virtual machine every time a file changes. Nowadays Parallels has good Time Machine integration, so it'll only back up incrementally (not so for VMware Fusion at the time of writing), but I just feel much safer not having my files wrapped up in a container.

  1. With the virtual machine selected, choose Virtual Machine from the Parallels menu, and click Configure.
  2. Under the Options tab, select Sharing.
  3. Click Custom Folders.
  4. Add the folder that contains all your web files, with "Read & Write" permissions.
  5. You might have to reboot the VM at this point to pick up the new shared folder(s).
  6. Parallels mounts shared folders in /media/psf, so you should now be able to see your shared folders there.
  7. By default, Apache's document root (the directory from which it serves files) is set to /var/www/html. We'll replace that folder with a symbolic link to the folder we just shared. First we delete the html folder:
        cd /var/www
        rm -rf html/
        

    Next we create the symlink. Replace Sites with the name of your own shared folder:

    ln -s /media/psf/Sites html

Install PHP

  1. Add the package:
    yum -y install php
  2. Now restart Apache:
    service httpd restart

Install and configure MySQL

  1. Add the packages:
    yum -y install mysql mysql-server
  2. Start the MySQL daemon:
    service mysqld start
  3. Configure the MySQL daemon to start up when the server boots:
    chkconfig --levels 235 mysqld on
  4. Set the root password (substitute <PASSWORD> with your own):
    mysqladmin -u root password <PASSWORD>
  5. Optionally follow my guide for creating databases and additional users.
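For completeness, creating a project database with its own restricted user boils down to a few statements in the mysql client. This is a sketch with placeholder names, using the MySQL 5.x GRANT syntax:

```shell
# Create a database and a user that can only touch that database.
# "mydb", "webuser" and "secret" are placeholders -- substitute your own.
mysql -u root -p <<'SQL'
CREATE DATABASE mydb CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON mydb.* TO 'webuser'@'localhost' IDENTIFIED BY 'secret';
FLUSH PRIVILEGES;
SQL
```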

Optionally install and configure phpMyAdmin

  1. phpMyAdmin is not available in the default repositories, so we need to enable the EPEL (Extra Packages for Enterprise Linux) repository. Download the epel-release RPM package for your CentOS version to your VM.
  2. From the directory where the RPM is saved, enter:
        rpm -ivh <FILENAME>
        
        
  3. Now install phpMyAdmin:
        yum -y install phpmyadmin
        
  4. Edit the phpMyAdmin configuration file to allow access from the host computer:
        vi /etc/httpd/conf.d/phpMyAdmin.conf
        

    Locate all instances of Allow from 127.0.0.1 and change them to:

        Allow from All
        
  5. Restart Apache:
    service httpd restart
  6. You should now be able to access phpMyAdmin from the host computer at http://firefly/phpmyadmin (replacing "firefly" with whatever you've named yours).

Recursively list the most recently changed files on the command line

Here's a command line tip, courtesy of snippets.dzone.com, that recursively lists the most recently changed files starting in the current directory. Handy if your site ever gets hacked and you're not sure exactly where the perpetrator has managed to inject their code:

find . -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort
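A couple of handy variations on the same command (note that the -printf format is specific to GNU find, which is what the CentOS VM ships with; the BSD find on Mac OS doesn't support it):

```shell
# Only files changed in the last 7 days, oldest first:
find . -type f -mtime -7 -printf '%TY-%Tm-%Td %TT %p\n' | sort

# The ten most recently changed files, newest first:
find . -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort -r | head -n 10
```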

Mac OS X: Automatically connect to a network drive when your computer starts up or wakes from sleep

If you, like me, have dipped your toes into the world of network-attached storage (NAS), you've probably been enjoying the benefits of having your music and photos stored in a central location accessible from all your household Macs. Equally likely, you've run into the annoyance of starting an application such as iTunes, iPhoto or Picasa that relies on the network drive being available, only to discover that it, well, isn't.

Whereas it's fairly trivial to instruct your Mac to connect to a network drive on start up, there is no support in OS X for reconnecting to that drive when the computer wakes from sleep. Fear not, I have a solution for you that will ensure your network drive is always available. We'll be using Automator, shell scripts and the excellent SleepWatcher utility to make it work, so the tutorial assumes you have at least a passing familiarity with the command line.

At the time of writing, I'm running OS X Mountain Lion 10.8.2, but I've had this working on 10.6 and 10.7 as well. If you notice any differences or have any tips, please share them in the comments below.

Step 1: Create an Automator app that connects to your network drive

  1. Launch Automator.
  2. Choose Application from the new document dialogue.
  3. In the search field enter pause.
  4. Drag the pause action from the results to the right pane.
  5. Enter 10 seconds to slightly delay the connection attempt to ensure your network connection (wi-fi or otherwise) is ready.
  6. Do a search for server in the search field.
  7. Now drag Get Specified Servers and Connect to Servers (in that order) to the right pane.
  8. In the Get Specified Servers dialogue click Add... and select or enter the address to your network drive. Click OK.
  9. Disconnect from your network drive (eject it), and click Run in Automator to test it. Your network drive should connect after a 10 second delay.
  10. Now save your application. I created a folder called Scripts in my home directory and saved it in there as ConnectToNetworkDrive.

Step 2: Add the Automator script to your startup items to reconnect to the network drive when you restart

  1. In System Preferences select Users & Groups then your user account.
  2. Select the Login Items tab then drag your newly created Automator app onto the pane.

Step 3: Create a shell script that launches the Automator app

  1. Create a file on the command line or open your text editor and paste the following in there:

     #!/bin/sh
     echo 'open ~/Scripts/ConnectToNetworkDrive.app/' | /bin/sh &

  2. If you've chosen a different name for your Automator app, you'll have to update the script to reflect that.
  3. Save this to your Scripts folder as ConnectOnWake.sh or something equally descriptive.
  4. We'll have to make that script executable, so launch Terminal.app and navigate to your Scripts folder. Assuming you've created it in your home directory and named the script as I have, you would do this:

     chmod u+x ~/Scripts/ConnectOnWake.sh

Step 4: Install SleepWatcher

  1. Download SleepWatcher from http://www.bernhard-baehr.de/.
  2. Follow the SleepWatcher installation instructions in the enclosed readme.rtf. If you've never installed SleepWatcher before, make sure you follow the instructions under the heading Installation for new SleepWatcher users.

Step 5: Use SleepWatcher to call our shell script when the computer wakes from sleep

  1. We'll need to create a SleepWatcher configuration file (.plist) that specifies our shell script. We'll use one of SleepWatcher's example files as a starting point. They're located in the config directory in the SleepWatcher download, so go there.
  2. Make a copy of de.bernhard-baehr.sleepwatcher-20compatibility.plist. I simply called mine WakeNetworkDrive.plist.
  3. Edit the new file (either on the command line or with Xcode if you have it installed), so that it matches the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>wakestora</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/sbin/sleepwatcher</string>
    <string>-V</string>
    <string>-w ~/Scripts/ConnectOnWake.sh</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
  4. Now copy the plist file to either /Library/LaunchDaemons (to activate it for all users) or ~/Library/LaunchAgents (to activate it only for the current user). I chose the latter option.
  5. Start the SleepWatcher daemon so that it runs automatically the next time you turn on your computer:

     sudo launchctl load ~/Library/LaunchAgents/WakeNetworkDrive.plist

  6. If you get an error message from the above command, change the ownership of your plist file and then try again:

     sudo chown root:wheel ~/Library/LaunchAgents/WakeNetworkDrive.plist

  7. Now test! Restart your computer. Your network drive should automatically connect after 10 seconds. Eject the network drive, then put your computer to sleep (close the lid, for example), wait a few seconds, then wake the computer. Your network drive should connect within 10 seconds (you might see an animated gear icon in the menu bar while this is happening).

If you have any tips or suggestions, please share them in the comments.

Prevent flickering of CSS transitions and transforms in Webkit

I've run into the problem of flickering CSS transitions and transforms in Webkit browsers (Chrome and Safari) a few times, especially while developing Darkadia, which makes liberal use of these properties. Turns out the solution is quite simple and you can check it out here on Stack Overflow. I'll reproduce the fix here for posterity.

Add the following property to the element your transition is applied to:

#element {
  -webkit-backface-visibility: hidden;
}

That sorted out most of the flickering on Darkadia, but I was still experiencing flickering on a completely unrelated element. Adding the following property to the body tag sorted that out too:

body {
  -webkit-transform: translate3d(0, 0, 0);
}

Omnibar for Safari

I've recently been experimenting with using Safari as my primary browser after mounting disappointment with Chrome's performance on my Mac (random tab crashes and odd rendering issues had me scratching my head). Safari really is very quick and provides a more seamless and polished "Apple" experience, if that's your thing.

What I absolutely love about Chrome though is its Omnibar, the unified search and URL bar that figures out whether you've entered a URL or search term and performs the appropriate action. So I was delighted to find Olivier Poitrey's Safari Omnibar, a free SIMBL plugin that adds all the Omnibar goodness to Safari. It's early days as I've only just started using it, but it's standing up well in my testing and I haven't encountered any problems yet.

Now the only thing I'm missing is how the Delicious plugin in Chrome integrates its search results into the Omnibar, but this will do nicely for now.

Use jQuery to get the HTML of a container, including the container itself

Problem: You have the following markup, and you'd like to use jQuery to retrieve the contents of #container, but also include #container's markup in the returned HTML:

<div id="container">
  <div class="foo">bar</div>
</div>

There are a number of ways to tackle the problem, but the most elegant solution I've come across is one posted as an answer to a question on Stack Overflow. The following snippet selects the container and wraps it in a <div> tag. It then immediately selects the wrapping tag with parent(), before assigning its contents to x.

var x = $('#container').wrap('<div/>').parent().html();

If you like, you can remove the wrapping <div> tag:

$('#container').unwrap();

Handling image uploads with custom validation using the Drupal 6 Forms API

I've recently had to go through the exercise again of building a bespoke form in Drupal 6 that accepts a file upload, validates it and saves it to the server. Not doing this nearly often enough to have it committed to memory means numerous trips to Google and frequent dives into the Drupal Forms API are the order of the day. I've distilled my code into a working example that I hope will serve as a future reference.

The sample code below assumes that you're comfortable creating a Drupal module, and have created a form using the forms API.

First we define a form with a single upload field and a submit button:

function module_image_form() {
  
  // Set the form's enctype to allow it to handle file uploads
  $form['#attributes']['enctype'] = "multipart/form-data";
    
  $form['logo'] = array(
    '#type' => 'file',
    '#title' => t('Logo'),
  );
  
  $form['submit'] = array(
    '#type' => 'submit',
    '#value' => t('Save'),
  );
  
  return $form;
  
}

Next we define the form's validate function which, after ensuring that the uploaded image meets our validation requirements, will handle the actual upload. Drupal provides built-in validation to check for a valid image, but we won't learn very much from using that. Instead, we specify our own validation function to prevent GIFs from being uploaded. We'll use Drupal's own mechanism for validating the image dimensions though.

function module_image_form_validate($form, &$form_state) {

  // Set this to the name of the form's file field
  $field = 'logo';

  // The directory the image will be saved to (file_directory_path() 
  // returns the path to your default files directory).
  $directory = file_directory_path() . '/images';

  // Drupal will attempt to resize the image if it is larger than 
  // the following maximum dimensions (width x height)
  $max_dimensions = '800x600';
  
  // We don't care about the minimum dimensions
  $min_dimensions = 0;

  // file_check_directory() ensures the destination directory is valid and 
  // will attempt to create it if it doesn't already exist.
  if (file_check_directory($directory, FILE_CREATE_DIRECTORY, $field)) {

    // Specify the validators. The first is Drupal's built-in function for validating
    // the image's dimensions. The second is our custom validator to exclude GIFs.
    $validators = array(
      'file_validate_image_resolution' => array($max_dimensions, $min_dimensions),
      'module_validate_image_type' => array(),
    );

    // Move the file to its final location if the validators pass, else
    // return to the form and display the errors.
    if ($file = file_save_upload($field, $validators, $directory)) {

      // Set the file's status to permanent, which will prevent Drupal's file 
      // garbage collection from deleting it.
      file_set_status($file, FILE_STATUS_PERMANENT);

      // We add our final file object to the form's storage array, so that it gets passed
      // through to the form's submit handler, where we can act on it.
      $form_state['storage']['file'] = $file;

    }
  }
}

This is our custom validator, which simply checks to see that the uploaded image does not have a MIME type of image/gif:

function module_validate_image_type($file) {
  $errors = array();
  $info = image_get_info($file->filepath);
  if ($info['mime_type'] == 'image/gif') {
    $errors[] = t('Only JPEG and PNG images are allowed.');
  }
  return $errors;
}

Finally, we implement the form's submit handler. The successfully uploaded image will be passed to it in the $form_state variable. I'll leave the submit handler blank, because what happens next depends entirely on what you're trying to achieve.

function module_image_form_submit($form, &$form_state) {
  // print_r($form_state['storage']['file']) to view the uploaded file's details
}