How to Minimize Spam and Plagiarism in WordPress

Feb 25, 2021

Spam is a ubiquitous part of the internet experience. You can go just about anywhere on the world wide web and encounter junk content and phishing links from bots, trolls, and spammers. It’s impossible to completely eliminate spam, but there are strategies you can use to minimize it on your website.

When we say “content,” in this case we’re referring to any content that you publish on your website, either directly or through a third-party application. This can take the form of blog posts, articles, videos, product reviews, and other types of visual and textual information.

The “spam” we’ll refer to in this article is specifically comment spam — that is, junk comments, left on your website content by bots and sometimes human trolls or spammers, often containing irrelevant or malicious links.

Why is Spam so Bad?

A lot of people think of spam as just an annoyance. There is some spam that’s relatively benign — like the occasional nonsensical comment from an AI, or an off-colour remark from a troll trying to stir up an argument. But a lot of spam poses a genuine security risk to your website, so it’s important to know what to watch out for.

Some of the most dangerous spam may look relatively harmless at a glance. Comment spammers will often post short, generic messages (“great blog post!”) as a cover for including links. These links serve different purposes. Some exist only to boost another site’s search rankings. Others go directly to malicious phishing sites, where your users may be tricked into giving away personal information or targeted by malware.

Do You Really Need Comments Enabled?

Before you even think about strategies for reducing spam comments, consider whether it’s actually serving you to have comments enabled on your website content at all.

We usually advise against it, except in specific cases like product pages where you want customers to be able to leave reviews. These days, website comment sections have largely fallen out of favour when it comes to creating an engaging discussion around an article, post, or video. Most companies find more success sharing content on social media. Social media platforms are structured in a way that’s generally much more conducive to interacting with your customers or encouraging customers to spread the word about your products or services.

If you do want to keep comments enabled on your website, the following strategies can help you mitigate spam and maintain safety and security for your users.

Firewalls

The first line of defense for your website should be a firewall. A well-configured firewall will usually do a pretty good job of filtering out the worst offenders, like spammers, bots, and automated content scrapers.

A strong firewall can be updated and reconfigured over time to keep up with new sources of malicious traffic. Settings like IP address and location filtering can nearly eliminate spam from known repeat offenders.
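As a rough illustration of what IP address filtering does, here is a minimal Python sketch. It is not how any particular firewall product is implemented. The `BLOCKED_NETWORKS` list and the `is_blocked` function are hypothetical names, and the addresses are reserved documentation ranges used only as placeholders.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# standing in for addresses you have actually seen spam from.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # a whole range
    ipaddress.ip_network("198.51.100.7/32"),  # a single address
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Assumption: an unparseable address is rejected rather than let through.
        return True
    return any(address in network for network in BLOCKED_NETWORKS)
```

A real firewall applies this kind of check before a request ever reaches WordPress, which is why it is cheaper than filtering inside the site itself.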

Spam Filtering

Human spammers and sophisticated bots might still make it through your firewall, so the next line of defense may be a spam filter. There are many different spam-filtering plugins available through WordPress — make sure you evaluate the level of support and popularity of a plugin before you install it. WordZite uses Akismet, which is a very popular and effective spam filtering plugin.

Spam filters like Akismet work by analyzing comments found across websites where the plugin is installed, using that data to create evolving definitions of what constitutes spam. These plugins then cross-reference every comment that comes through your submission forms against those definitions to determine if the comment is legitimate.

Moderation and Comment Restrictions

Spam filters and firewalls are always evolving to stay one step ahead, but bots and spammers can be pretty smart — smart enough to occasionally get past even the best spam filters. To really minimize the spam that appears in your comments sections, you may need a human touch.

Applying moderation settings to your comments sections and forms gives you the ability to review responses and submissions before they’re published and visible on your website.

Enabling comment moderation lets you manually delete any comments that you deem suspicious or inappropriate before they go public. Of course, if you tend to get a lot of interactions on posts, moderating all those comments can quickly become a part-time job. If you plan to do this manually, make sure you have someone who is prepared to take on the task.

If you don’t have the time to manually moderate all your comments, you can also implement a blacklist for certain words and links. A quick Google search will find you plenty of examples of pre-made blacklists that you can use as a starting point. It’s always a good idea to tweak the list to suit your own website’s topics and content, lest you end up in a situation where your overzealous blacklist settings start censoring common industry-specific terms.
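The idea behind a word and link blacklist can be sketched in a few lines of Python. This is an illustration only, not WordPress’s own logic: the term list, the link limit, and the function name are all hypothetical. It matches whole words on purpose, because a list that matches inside words would flag “specialist” for containing a common spam term, which is exactly the overzealous behaviour described above.

```python
import re

# Hypothetical starter list; tune it to your own site's topics.
DISALLOWED_TERMS = {"cialis", "payday loan", "free followers"}
MAX_LINKS = 2  # hold comments that contain more links than this

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def should_hold_for_review(comment: str) -> bool:
    """Return True if the comment should wait for a human moderator."""
    text = comment.lower()
    for term in DISALLOWED_TERMS:
        # \b word boundaries stop a term from matching inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            return True
    # Several links in one comment is a common spam signal.
    return len(LINK_PATTERN.findall(comment)) > MAX_LINKS
```

Holding a comment for review rather than deleting it outright keeps a person in the loop for false positives.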

Another strategy to supplement direct moderation is to restrict commenting to registered users. User registration, along with a bot-filtering mechanism like reCAPTCHA (the program responsible for the “I’m not a robot” check box that you’ll often find at the end of online submission forms), can significantly cut down on the amount of spam you receive. reCAPTCHA will take care of all but the most sophisticated programs, and human spammers are likely to move on to an easier target rather than go through the trouble of creating an account just to comment.

Email Address Obfuscation

Bots sometimes “harvest” email addresses from your website code, and use those email addresses under false pretenses to broadcast further spam. Having your email address, or the address of a commenter, harvested and used by a spammer can hurt your site’s rankings and reputation.

Email address obfuscation is another strategy employed by web security experts to prevent this kind of harvesting and protect the addresses of you and your users. With email address obfuscation, a CDN or firewall like Cloudflare will scramble any email addresses that appear in the code of your website. If a bot tries to scrape the email address, it will come up with a bunch of unusable nonsense. For humans, the email address still appears and behaves normally on the front end of the website.
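To show the general principle, here is a minimal Python sketch of one simple obfuscation technique: writing each character of the address as an HTML character entity. This is a swapped-in illustration, not the scheme Cloudflare or any other CDN actually uses, and the function name `obfuscate_email` is hypothetical.

```python
import html


def obfuscate_email(address: str) -> str:
    """Encode every character of the address as a decimal HTML entity."""
    return "".join(f"&#{ord(char)};" for char in address)


encoded = obfuscate_email("info@example.com")

# The page source no longer contains a literal "@" for a naive scraper to find.
assert "@" not in encoded

# A browser decodes the entities for display; html.unescape does the same here.
assert html.unescape(encoded) == "info@example.com"
```

Entity encoding only defeats scrapers that search the raw page source for plain addresses. A bot that decodes entities will still read the address, which is one reason to prefer a maintained service over a hand-rolled approach.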

Email address obfuscation not only helps you cut down on spam, but it can also reduce the spread of further spam, and help you stay off blacklists.

Hotlink Protection

Hotlinking is a form of content scraping that lets spammers and hackers “steal” images from your website. A spammer can embed a link to an image hosted on your website within their own website’s code. On the front end, the image will appear just as it does on your site. The problem is that the image is still hosted on your server.

Finding the perfect images for your website can be costly and time-consuming. You may have to hire a photographer or pay for quality stock photos. On top of that, serving high-quality images takes a significant amount of bandwidth and server resources. With hotlinking, a hacker is using your image without your knowledge, and to add insult to injury, they’re using your server resources to do it. Hotlinking can slow down your website, or even get your website blacklisted. In extreme cases, hackers can use mass hotlinking to overload a server and cause multiple websites to crash.

Luckily, there are WordPress plugins designed specifically to prevent images from being hotlinked. Cloudflare and other CDNs also offer strong hotlinking protections. A few simple back-end configurations can significantly cut down on this practice, protecting your images and your resources.
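Hotlink protection commonly works by checking the Referer header that browsers send with an image request. The Python sketch below shows that check in isolation. It assumes a hypothetical site at example.com, and both `ALLOWED_HOSTS` and `allow_image_request` are illustrative names rather than part of any plugin or CDN.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def allow_image_request(referer):
    """Decide whether to serve an image, given the Referer header value."""
    if not referer:
        # Assumption: direct requests and privacy tools that omit the header
        # are allowed, so legitimate visitors are not blocked.
        return True
    # Compare the exact hostname so look-alike domains do not slip through.
    host = urlparse(referer).hostname
    return host in ALLOWED_HOSTS
```

Because the Referer header can be omitted or forged, this kind of check discourages casual hotlinking rather than making it impossible.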

Content Scraping Protection

Similar to hotlinking, content scraping is stealing your content (whether it’s plagiarizing sections of an article or entire pages) and posting it onto another site. Aside from intellectual property theft, content scraping is bad because it can often be a blow to your website’s reputation on search engines, and at worst it can get your site blacklisted when your website contains much of the same content as a known phishing website.

Content scraping is a challenge to minimize, especially when you’re just dealing with text. We usually disable text highlighting and right-click saving, but there’s no way we can stop someone from manually copying text that appears on a website. The best way to combat the detrimental effects of content scraping is vigilant security monitoring, along with some well-written cease and desist letters from a lawyer familiar with digital rights management.

Blacklist Checking

While it can be helpful to use blacklists to assist with comment moderation, ending up on a blacklist yourself is something you want to avoid.

A number of web security companies maintain lists of IP addresses and domains that are associated with spam, phishing, and other malicious internet activities. Google’s Transparency Report indicates whether a site might be blacklisted or otherwise flagged as unsafe. Unfortunately, innocent websites can sometimes end up on blacklists due to:

  • Spam comments from blacklisted IP addresses
  • Spam comments containing links to blacklisted domains
  • Scraped content reposted to a blacklisted domain
  • Images hotlinked on a blacklisted domain

If your website gets blacklisted, browsers will recognize it as a security risk, and users trying to reach your site will be presented with a security warning. This is often enough to drive savvy internet users away from your website — which is the last thing you want.

We regularly check the sites that we’re monitoring to see if they appear on any content blacklists. We have occasionally had to contact a security monitoring company to get a site removed from a blacklist. Luckily, once your site is removed from a blacklist, you can get back to business as usual pretty quickly.

Summary

It’s impossible to eliminate 100 percent of spam, but employing a combination of the above strategies on your website will go a long way toward minimizing it. Minimizing spam will also keep your website in Google’s good books and help protect your users from phishing and malware attacks.

Of course, an ounce of prevention is worth a pound of cure — unless you have a business model where website comments are useful or necessary, consider disabling comments and moving conversations about your content over to social media. Maintaining an active and engaging social media presence is often a better way to connect directly with your customers.