WordPress spammers finally get smart and simulate a real human

Today I saw the first attempt by malware to post a spam comment on my WordPress blog that was not a blatant attack. It was a perfect emulation of a human registering an account on my blog in order to post a comment (note that I don’t require registration to post a comment). The emulation included plausible delays between the relevant HTTP requests and spanned four minutes and eleven seconds. In fact, the attack looks as though it was conducted by a real human whose activity was proxied through a malware-infected server.

The attack came from 198.15.148.67, which is owned by serveryou.com, a virtual private server provider (i.e., a hosting service). Akismet quarantined the comment as spam. When I reviewed the comment, it was obviously an attempt to drive traffic to another site via SEO manipulation and was completely unrelated to the article it was attached to.

So far I’ve seen only two things in the request sequence that might be used to detect this type of attack:

The requests specified this HTTP header:

Accept-Language: zh-CN,zh;q=0.8

My blog is entirely in English. This header strongly suggests that the author of the malware, or whoever operates the client, is a native Chinese speaker.

The request sequence included several GET requests for /wp-admin/* paths, culminating in a POST to /wp-admin/admin-ajax.php. The POST data was:

interval=60&_nonce=4ba711a06f&action=heartbeat&screen_id=profile&has_focus=true

Prior to this attack, the only action=heartbeat POSTs to admin-ajax.php in my logs were legitimate requests I made myself while working on my blog.
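A check for this second signal might look like the following Python sketch. It assumes the POST body is available to the monitoring code, which is not the case with a standard Apache access log, and the function name is_heartbeat_post is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

AJAX_PATH = "/wp-admin/admin-ajax.php"


def is_heartbeat_post(method: str, request_uri: str, body: str) -> bool:
    """Return True for a POST to admin-ajax.php whose body has action=heartbeat.

    Assumption: `body` is the raw application/x-www-form-urlencoded POST
    data, captured by some means outside the normal access log.
    """
    if method.upper() != "POST":
        return False
    # Ignore any query string when comparing the path.
    if urlsplit(request_uri).path != AJAX_PATH:
        return False
    fields = parse_qs(body)
    return "heartbeat" in fields.get("action", [])


if __name__ == "__main__":
    body = ("interval=60&_nonce=4ba711a06f&action=heartbeat"
            "&screen_id=profile&has_focus=true")
    print(is_heartbeat_post("POST", AJAX_PATH, body))
```

On its own this only identifies the request type. Deciding whether the source is me or someone else still needs a second signal, such as the source address.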

This is the rule I’ve added to flag this attack:

# Beginning 2015-07-26 sophisticated malware that spams WordPress
# comments by doing a very good impersonation of a real user hit my
# site. One obvious mistake the spammer made was in specifying only
# Chinese as an acceptable language.
#
# Note that the Baidu crawler specifies "zh-cn" for this header so
# it's important that the log monitoring program include heuristics to
# avoid blocking acceptable crawlers just because they trip this rule.
#
# Obviously this is a trivially bypassed rule but, again, it's meant
# to quickly reject requests from malware written by idiots.
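# Apply only to the login and AJAX endpoints.
RewriteCond %{REQUEST_URI} =/wp-login.php [NC,OR]
RewriteCond %{REQUEST_URI} =/wp-admin/admin-ajax.php [NC]
# Ignore requests that send no Accept-Language header at all.
RewriteCond %{HTTP:Accept-Language} !=""
# Flag headers that never mention English. Language tags are
# case-insensitive, hence NC.
RewriteCond %{HTTP:Accept-Language} !en [NC]
RewriteRule ^ /blocked.php [END,E=error-notes:no-supported-language]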
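The rewrite conditions test only whether the header contains the substring "en". A log monitoring program could apply a somewhat stricter version of the same test by actually parsing the header. The following Python sketch is one way to do that. The function names are hypothetical, and it assumes the header value has been logged, for example with %{Accept-Language}i in a custom LogFormat.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language-range, q) pairs, highest preference first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        q = 1.0  # q defaults to 1 when not given
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((lang.strip().lower(), q))
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def accepts_english(header: str) -> bool:
    """True if the header is empty or lists English (or *) with q > 0."""
    if not header.strip():
        return True  # like the rule above, a missing header is not flagged
    return any(
        q > 0 and (lang == "*" or lang == "en" or lang.startswith("en-"))
        for lang, q in parse_accept_language(header)
    )


if __name__ == "__main__":
    print(accepts_english("zh-CN,zh;q=0.8"))
    print(accepts_english("en-US,en;q=0.9"))
```

Unlike the rewrite rule, this sketch treats a wildcard (*) as accepting English and does not count "en" appearing inside an unrelated language tag. Whether that difference matters in practice is a judgment call.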

My Apache log monitoring program will then blacklist the source when it sees the 400 status returned by /blocked.php. I’ll probably also add a heuristic to my log monitoring program that detects admin-ajax.php and wp-login.php POST requests from “cloud” sources. This, obviously, will only be as good as my ability to identify cloud, virtual private server, and hosting providers. Ideas from readers of this article are welcome.
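To make the idea concrete, here is a minimal Python sketch of both heuristics. It is not my actual monitoring program. It assumes Apache's common or combined log format, treats any 400 response as a trigger, and uses placeholder networks from the reserved documentation ranges in place of a real list of hosting providers. All names in it are hypothetical.

```python
import ipaddress
import re

# Matches the start of a common/combined log line up to the status code.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

# Placeholder networks (reserved documentation ranges), NOT real providers.
HOSTING_NETWORKS = [
    ipaddress.ip_network(n) for n in ("203.0.113.0/24", "198.51.100.0/24")
]

WATCHED_PATHS = {"/wp-login.php", "/wp-admin/admin-ajax.php"}


def is_hosting_ip(ip: str) -> bool:
    """True if ip falls inside one of the known hosting networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # e.g. a hostname when HostnameLookups is on
    return any(addr in net for net in HOSTING_NETWORKS)


def scan(lines):
    """Return the set of source addresses that should be blacklisted."""
    blacklist = set()
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue
        ip, method = m["ip"], m["method"]
        path = m["path"].split("?", 1)[0]
        status = int(m["status"])
        if status == 400:
            # Assumed to be the status produced by /blocked.php.
            blacklist.add(ip)
        elif method == "POST" and path in WATCHED_PATHS and is_hosting_ip(ip):
            blacklist.add(ip)
    return blacklist
```

The hard part is unchanged: the list of networks has to come from somewhere, and the sketch is only as good as that list.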