[cgi-wiki-dev] Important development targets; spam defenses

Tim Sweetman ti at lemonia.org
Thu Apr 20 00:23:53 BST 2006



Earle Martin wrote:

>What would you nominate?
>I'll start the ball rolling: I think we need a banned-content plugin.
>Regular expressions (e.g. "cheap-viagra-levitra.com") could be added to a
> This list would be checked against edit text before the node is
>written. If a match is found, an "Edit Denied" error is shown. Optionally:
>the IP address is also added to a banned IPs list that are not permitted to
>edit. It would be fairly trivial to set up something that gets the big list
>of http://www.communitywiki.org/cw/BannedContent on a nightly basis.
>The need for this is clear, IMHO, given that OpenGuides sites are being
>subjected to a barrage of spam at the moment which comes from a range of
>different IPs every time - simple IP blocking will not defend us against

Banning IP addresses may turn out to be a wild goose chase: blocking or
monitoring them is the obvious defence, so spammers will probably get
around it by all the obvious means (botnets, open proxies, etc.).
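
For concreteness, the content check proposed in the quoted text might look
roughly like the sketch below. It is in Python for brevity rather than
CGI::Wiki's own Perl, and the names BANNED_PATTERNS, compile_patterns,
find_banned and check_edit are hypothetical, not part of any existing plugin
API. The nightly fetch of the BannedContent page is assumed to happen
elsewhere and is not shown.

```python
import re

# Hypothetical pattern list. In practice it might be refreshed nightly
# from a shared page such as the CommunityWiki BannedContent list.
BANNED_PATTERNS = [
    r"cheap-viagra-levitra\.com",
]


def compile_patterns(patterns):
    """Compile each pattern, skipping any that are not valid regexes."""
    compiled = []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern, re.IGNORECASE))
        except re.error:
            # A malformed entry in a shared list should not break editing.
            continue
    return compiled


def find_banned(edit_text, compiled):
    """Return the first banned pattern that matches the text, or None."""
    for rx in compiled:
        if rx.search(edit_text):
            return rx.pattern
    return None


def check_edit(edit_text, compiled):
    """Return (allowed, message) for a proposed edit, before it is written."""
    if find_banned(edit_text, compiled) is not None:
        return False, "Edit Denied"
    return True, "OK"
```

Skipping malformed patterns is a design choice made here because the list
would come from a page that anyone can edit; a stricter setup might log them
instead.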

>Other suggestions?

Use the wiki itself as the "ham" corpus, maintain a "spam" corpus, and use
Bayesian analysis of keywords and other metrics to discriminate between
spam and non-spam. (It's the only way to be sure.)
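
A minimal sketch of that idea, assuming a plain word-count naive Bayes
classifier with add-one smoothing. This is only one of several ways to do
"Bayesian analysis", and it uses keywords only, not the "other metrics". The
NaiveBayes class, the tokenize helper and the tiny training texts are all
made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Two-class (ham/spam) word-count classifier with add-one smoothing."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.docs = {"ham": 0, "spam": 0}

    def train(self, label, text):
        """Add one document to the corpus for the given label."""
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def log_score(self, label, tokens):
        """Log prior plus summed log likelihoods of the tokens under label."""
        vocab = set(self.counts["ham"]) | set(self.counts["spam"])
        total = sum(self.counts[label].values())
        n_docs = self.docs["ham"] + self.docs["spam"]
        score = math.log((self.docs[label] + 1) / (n_docs + 2))
        # The extra +1 in the denominator reserves mass for unseen words.
        denom = total + len(vocab) + 1
        for token in tokens:
            score += math.log((self.counts[label][token] + 1) / denom)
        return score

    def is_spam(self, text):
        """True if the spam score exceeds the ham score."""
        tokens = tokenize(text)
        return self.log_score("spam", tokens) > self.log_score("ham", tokens)
```

Usage would be along these lines: train "ham" on existing node text, train
"spam" on reverted edits, then call is_spam() on each incoming edit before
it is written.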



