WikiSpam is a wikiwide problem. It won't be solved but wikiwide. (see http://c2.com/cgi/wiki?WikiSpam)
I'll be using this page to brainstorm tactics and techniques for combating wiki spam. As you may or may not have noticed, all forms of editing for non-logged-in users have been disabled here on Aneuch Wiki (including the use of discussion pages), because the site was being targeted by various spam bots.
Once I figure out the spam problem, I can re-enable (limited) forms of editing. ~AaronGraves
I'm looking at the possibility of using TextCha for combating wiki spam. ~AaronGraves
In addition to the TextCha, I'd like to implement a banned content feature, similar to something that Oddmuse uses.
Have a look at http://nedbatchelder.com/text/stopbots.html; it has some interesting tips and ideas that I will likely implement in Aneuch 0.30.
07/24/2013 After reading the above linked page and a few comments, I think I've come up with a very good solution. I'll use a honeypot of sorts on all forms in the hope of catching spam bots. This is step one. We'll see how effective that is, and go from there.
Proposed Solutions
Here are some solutions I'm proposing for Aneuch. Items in bold are definite; items in bold italic I'm still on the fence about.
- Honeypot
All forms on the site (except the search form) should have a "honeypot" - that is to say, a separate form that is invisible to regular users but visible to any spam bot that parses the page. If data is submitted in this form (or a field that already contains data is modified), the edit will be rejected as spam.
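The honeypot check itself can be very small. Here is a minimal sketch in Python (Aneuch's own code is not shown on this page, so this is only an illustration); the field name `email_confirm` and the function name are made up for the example.

```python
# Hypothetical field name: anything plausible-looking that real users never see.
HONEYPOT_FIELD = "email_confirm"

def is_honeypot_triggered(form, expected=""):
    """Return True if the hidden field differs from the value we rendered.

    `form` is a dict of submitted field names to values. `expected` is the
    value the hidden field was pre-filled with (empty by default).
    """
    return form.get(HONEYPOT_FIELD, "") != expected
```

A submission with `email_confirm` filled in would be rejected, while one that leaves it empty (as a browser hiding the field would) passes through.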
- Non-visible fields
All legitimate forms should have one or two non-visible fields, which may or may not contain data. If those fields are changed, the edit will be rejected as spam.
- Link counting
Count all of the links in a page before edit, and count the number of links after the edit. If more than a certain number have been added, reject the edit as spam. (Note that this should probably only apply to external links)
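A rough sketch of the link-counting idea, in Python. The pattern only catches bare http/https URLs, and the threshold of 3 is an arbitrary assumption, not a value chosen for Aneuch.

```python
import re

# Simplified notion of an "external link": any bare http(s) URL.
EXTERNAL_LINK = re.compile(r"https?://\S+", re.IGNORECASE)
MAX_NEW_LINKS = 3  # assumed threshold, would need tuning

def count_external_links(text):
    """Count bare external URLs in the page text."""
    return len(EXTERNAL_LINK.findall(text))

def adds_too_many_links(old_text, new_text, limit=MAX_NEW_LINKS):
    """Return True if the edit adds more than `limit` external links."""
    added = count_external_links(new_text) - count_external_links(old_text)
    return added > limit
```

Comparing counts rather than looking at the new text alone means pages that already carry many legitimate links can still be edited.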
- Timestamping
Introduce an encrypted time stamp into the form, and if too little or too much time has passed between loading the form and submitting it, reject the edit as spam.
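One way this could look, sketched in Python. Note that this version signs the time stamp with an HMAC instead of encrypting it; that is a swapped-in technique which keeps bots from forging the value but does not hide it. The secret key and both time limits are assumptions for the example.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MIN_SECONDS = 5            # assumed: anything faster looks automated
MAX_SECONDS = 3600         # assumed: anything older is treated as stale

def _sign(ts):
    return hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()

def make_token(now=None):
    """Build the hidden form value: '<unix time>:<signature>'."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}:{_sign(ts)}"

def token_is_acceptable(token, now=None):
    """Check the signature, then check the elapsed time is within bounds."""
    ts, sep, sig = token.partition(":")
    if not sep or not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return False
    elapsed = (time.time() if now is None else now) - int(ts)
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

The token from `make_token()` would go into a hidden field when the form is rendered, and `token_is_acceptable()` would run on submission.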
- Content filtering
Implement a content filtering system, whereby a wiki admin can load regular expressions into a filter; if any of them matches the text of an edit, the edit is rejected as spam. MoinMoin has a good BadContent file.
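A minimal sketch of such a filter in Python, assuming one regular expression per line with blank lines and `#` comments skipped. That file layout is an assumption for the example, not a description of the MoinMoin or Oddmuse formats.

```python
import re

def compile_rules(lines):
    """Compile one case-insensitive regex per non-blank, non-comment line."""
    rules = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(re.compile(line, re.IGNORECASE))
    return rules

def first_match(rules, text):
    """Return the pattern of the first rule matching `text`, or None."""
    for rule in rules:
        if rule.search(text):
            return rule.pattern
    return None
```

Returning the matching pattern, not just a yes/no answer, makes it easier for an admin to see which rule rejected an edit.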
- TextCha
Implement a text version of the popular captcha, where a user must answer a simple challenge question; if the answer is incorrect, the edit is rejected as spam.
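The checking side of a text challenge might be sketched like this in Python. The questions, the accepted answers, and the function name are all invented for the example.

```python
# Hypothetical question bank: question text mapped to accepted answers.
QUESTIONS = {
    "What color is the sky on a clear day?": {"blue"},
    "How many legs does a cat have?": {"4", "four"},
}

def answer_is_correct(question, answer):
    """Compare the answer, ignoring case and surrounding whitespace."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted
```

An unknown question always fails, so a bot cannot pass by submitting a question of its own.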
- Banning
IP-based banning is already implemented in Aneuch, but it works on a single IP only; you can't ban ranges, etc. In the next version of Aneuch, this should be changed to a regex format instead, so that ranges and so forth can be set. (See http://www.oddmuse.org/cgi-bin/oddmuse/Banning for more info.) Also, in the future a new variable will be introduced to control how much of the site is visible to those affected by an IP ban. As of now, if you're banned you're given a 403 error when attempting to view the site. Perhaps we can implement a feature whereby a ban applies only to editing by default, but is configurable so that you can also ban people from viewing altogether.
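Regex-based banning with a configurable view/edit distinction could be sketched as follows, in Python. The patterns use documentation-only address ranges, and the `ban_blocks_viewing` option is a hypothetical name, not an existing Aneuch variable.

```python
import re

# Hypothetical ban list: one regex per entry, matched against the client IP.
BANNED_PATTERNS = [
    r"^203\.0\.113\.",                 # a whole /24
    r"^198\.51\.100\.(1[0-9]|2[0-9])$",  # hosts .10 through .29
]

def is_banned(ip, patterns=BANNED_PATTERNS):
    """Return True if any ban pattern matches the IP address."""
    return any(re.search(p, ip) for p in patterns)

def http_status(ip, is_edit, ban_blocks_viewing=False):
    """Return 403 for banned editors, and for banned viewers if configured."""
    if is_banned(ip) and (is_edit or ban_blocks_viewing):
        return 403
    return 200
```

With the default setting a banned address can still read pages but gets a 403 on any edit; setting `ban_blocks_viewing=True` reproduces the current behavior of blocking everything.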
- rel=nofollow on external links
This is already implemented. All external links in an Aneuch wiki are automatically tagged with the "rel=nofollow" attribute. This means that search engines should not follow the link, thereby defeating the purpose of link spam.
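For illustration only, here is roughly what tagging an external link looks like when rendering, sketched in Python. This is not Aneuch's actual rendering code, and the function name is made up.

```python
from html import escape

def external_link(url, label=None):
    """Render an external link as an HTML anchor carrying rel="nofollow"."""
    text = escape(label if label is not None else url)
    return f'<a href="{escape(url, quote=True)}" rel="nofollow">{text}</a>'
```

Escaping the URL and label matters here as well, since both come from user-edited page text.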