Hoof Hearted (talk | contribs) (→What happens if spam slips through the automated systems: improve section) |
Hoof Hearted (talk | contribs) m (fix double redirects, grammar) |
||
| Line 1: | Line 1: | ||
{{TOCright}} | {{TOCright}} | ||
For some thoughts about spambot hunting see [[WikiProject:Junking bots]] <small>...didn't know where to link it --[[Wolf Peuker|Wolf]] | <small>[[User talk:Peu|talk]]</small> 07:05, 13 October 2007 (EDT)</small> | For some thoughts about [[spambot]] hunting see [[WikiProject:Junking bots]] <small>...didn't know where to link it --[[Wolf Peuker|Wolf]] | <small>[[User talk:Peu|talk]]</small> 07:05, 13 October 2007 (EDT)</small> | ||
==Proposed | ==Proposed spam control policy== | ||
Comments and | Comments and corrections welcome. | ||
There are three levels of spam control now in play, and this policy will address how each of them should be used. | There are three levels of {{tag|spam}} control now in play, and this policy will act as {{tag|guidelines}} to address how each of them should be used. | ||
===Level 1 - LocalSettings.php=== | ===Level 1 - LocalSettings.php=== | ||
There is a regex filter in the LocalSettings.php file that is under the control of the [[WikiIndex:Bureaucrats|site bureaucrats]]. This level blocks specific words, phrases, and HTML fragments that are commonly used by link | There is a regex filter in the LocalSettings.php file that is under the control of the [[WikiIndex:Bureaucrats|site bureaucrats]]. This level blocks specific words, phrases, and HTML fragments that are commonly used by link [[spammer]]s and [[vandal]]s. It contains common curse words, sex acts and symbols, body parts, most of the major drug names, and HTML fragments that are used to hide and/or mask link spam and graffiti. | ||
These are the common denominators of 90% of spam and graffiti. The regex will match any of these items and block the save of a page that has any one of these items anywhere on it. | These are the common denominators of 90% of spam and graffiti. The regex will match any of these items and block the save of a page that has any one of these items anywhere on it. | ||
It is NOT necessary to block anything containing these words anywhere else. | It is NOT necessary to block anything containing these words anywhere else. | ||
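As a rough illustration of how a Level 1 style filter behaves, here is a minimal Python sketch. It is not the actual LocalSettings.php configuration: the entries in <code>BLOCKED_FRAGMENTS</code> and the function name <code>save_allowed</code> are hypothetical stand-ins, since the real list is not reproduced on this page.

```python
import re

# Hypothetical stand-ins for the real Level 1 list (assumption, not the site's list).
BLOCKED_FRAGMENTS = ["badword", "display:none", "overflow:auto"]

# One alternation of all fragments; re.escape keeps each entry literal.
LEVEL1_REGEX = re.compile(
    "|".join(re.escape(fragment) for fragment in BLOCKED_FRAGMENTS),
    re.IGNORECASE,
)


def save_allowed(page_text):
    """Return False if any blocked fragment appears anywhere in the page text."""
    return LEVEL1_REGEX.search(page_text) is None


print(save_allowed("A perfectly normal wiki page."))
print(save_allowed('<div style="display:none">hidden links</div>'))
```

Because a single match anywhere on the page blocks the save, a word only needs to be listed once at this level.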
| Line 18: | Line 18: | ||
If you find a particular link is not being caught by Levels 1 or 2 you can: | If you find a particular link is not being caught by Levels 1 or 2 you can: | ||
#If it is link spam or graffiti that contains a word that should be in the level 1 list, submit it to the [[:Category:Active administrators of this wiki|site administrators]] via a new message on | #If it is link spam or graffiti that contains a word that should be in the level 1 list, submit it to the [[:Category:Active administrators of this wiki|site administrators]] via a new message on their [[talk page]]; | ||
#If it is a link that you think should be banned at all | #If it is a link that you think should be banned at all [[wiki]]s, submit it to '''[[MetaWiki:Talk:Spam blacklist|Wikimedia Meta-Wiki]]'''; | ||
#If it is a link that you think should be banned from just WikiIndex, go to Level 3 | #If it is a link that you think should be banned from just [[WikiIndex]], go to Level 3. | ||
===Level 3 - Local | ===Level 3 - Local blacklist=== | ||
We maintain a local blacklist at '''[[My spam blacklist]]'''. This is a protected page that [[:Category:Active administrators of this wiki|Sysops]] can use to block offending link spam not caught by Level 1 and Level 2. There should be very few entries here, and NONE that contain the following: | We maintain a local blacklist at '''[[My spam blacklist]]'''. This is a protected page that [[:Category:Active administrators of this wiki|Sysops]] can use to block offending link spam not caught by Level 1 and Level 2. There should be very few entries here, and NONE that contain the following: | ||
*Periods "." - | *Periods "." - periods (also known as the 'full stop') have a special meaning in regex syntax, and can cause the list to malfunction; | ||
*TLDs "com, org, net" - | *TLDs "com, org, net" - these appear in so many URLs that they provide no value to the blocking mechanism; | ||
*"http://www." - | *"http://www." - the regex only checks valid URLs, so this is not necessary. | ||
An example: | An example: | ||
If you want to block linking to http://www.mybadwordsite.com you should only enter | If you want to block linking to http://www.mybadwordsite.com you should only enter 'mybadwordsite' | ||
If Level 1 or Level 2 already contains the | If Level 1 or Level 2 already contains the 'bad word', then the link is already blocked and no entry is necessary; you would also be unable to save the list. | ||
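The rules above (no periods, no TLDs, no "http://www.") can be sketched as a small Python helper that suggests what to enter for a given URL. This is only an illustration: <code>suggest_blacklist_entry</code> and <code>COMMON_TLDS</code> are hypothetical names, and keeping only the last remaining host label is an assumption about which part of the hostname is distinctive.

```python
from urllib.parse import urlparse

# TLDs named in the policy above; a real list would be longer (assumption).
COMMON_TLDS = {"com", "org", "net"}


def suggest_blacklist_entry(url):
    """Suggest a period-free blacklist entry for a URL (hypothetical helper)."""
    host = urlparse(url).hostname or ""
    labels = [label for label in host.split(".") if label]
    # Drop a leading "www": the blacklist regex matches any subdomain anyway.
    if labels and labels[0] == "www":
        labels = labels[1:]
    # Drop a trailing common TLD: it adds nothing to the match.
    if len(labels) > 1 and labels[-1] in COMMON_TLDS:
        labels = labels[:-1]
    # Keep only the last remaining label so the entry contains no periods.
    return labels[-1] if labels else ""


print(suggest_blacklist_entry("http://www.mybadwordsite.com"))
```

For the example URL this gives <code>mybadwordsite</code>, matching the entry recommended above.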
===Level 3.5 - CAPTCHA=== | ===Level 3.5 - CAPTCHA=== | ||
Now implemented. By default, CAPTCHAs are triggered on the following events: | Now implemented. By default, CAPTCHAs are triggered on the following events: | ||
*New user registration | *New user registration; | ||
*Anonymous edits that contain new external links | *Anonymous (or [[IP editor]]) edits that contain new external links; | ||
*Brute-force password cracking | *Brute-force password cracking. | ||
===Level 4 - Login to | ===Level 4 - Login to edit=== | ||
Not implemented. | Not implemented. | ||
There are two options at this level: | There are two options at this level: | ||
#Require a login to edit, and '''request''' an | #Require a login to edit, and '''request''' an e-mail confirmation; | ||
#Require a login to edit, and '''require''' an | #Require a login to edit, and '''require''' an e-mail confirmation before editing is allowed. | ||
==Guidance for Spam Fighters== | ==Guidance for Spam Fighters== | ||
[[Ward Cunningham]] gave me this advice for spam fighting and keeping your sanity, "do the absolute minimum required to block each attack and the spammer will grow tired and leave" (I'm paraphrasing). This is so true, because you can drive yourself crazy trying to think of a way to defeat all attacks in advance of their actually happening! | [[Ward Cunningham]] gave me this advice for {{tag|SpamFighting|spam fighting}} and keeping your sanity, "do the absolute minimum required to block each attack and the spammer will grow tired and leave" (I'm paraphrasing). This is so true, because you can drive yourself crazy trying to think of a way to defeat all attacks in advance of their actually happening! | ||
==Spam Blacklist Regex== | ==Spam Blacklist Regex== | ||
| Line 54: | Line 54: | ||
In simple terms: | In simple terms: | ||
*Everything from a "#" character to the end of the line is a comment | *Everything from a "#" character to the end of the line is a comment; | ||
*Every non-blank line is a regex fragment which will '''only match inside URLs''' | *Every non-blank line is a regex fragment which will '''only match inside URLs'''. | ||
Internally, a regex is formed which looks like this: | Internally, a regex is formed which looks like this: | ||
| Line 61: | Line 61: | ||
!http://[a-z0-9\-.]*(line 1|line 2|line 3|....)!Si | !http://[a-z0-9\-.]*(line 1|line 2|line 3|....)!Si | ||
</pre> | </pre> | ||
A few notes about this format. It | A few notes about this format. It is not necessary to add www to the start of hostnames; the regex is designed to match any subdomain. Do not add patterns to your file which may run off the end of the URL, e.g. anything containing ".*". Unlike in some similar systems, the line-end metacharacter "$" will not assert the end of the hostname; it will assert the end of the page. | ||
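To make the format concrete, here is a minimal Python sketch of how such a combined regex could be assembled from the blacklist lines. It is an approximation, not the extension's actual code: the function name <code>build_blacklist_regex</code> and the sample entries are hypothetical, and the PHP-specific "S" modifier shown in the pattern above has no Python equivalent, so only the case-insensitive "i" flag is carried over.

```python
import re


def build_blacklist_regex(blacklist_text):
    """Combine blacklist lines into one pattern, roughly as described above."""
    fragments = []
    for line in blacklist_text.splitlines():
        # Everything from "#" to the end of the line is a comment.
        fragment = line.split("#", 1)[0].strip()
        # Blank lines (or comment-only lines) contribute nothing.
        if fragment:
            fragments.append(fragment)
    if not fragments:
        return None
    # The hostname prefix lets each fragment match under any subdomain.
    pattern = r"http://[a-z0-9\-.]*(" + "|".join(fragments) + ")"
    return re.compile(pattern, re.IGNORECASE)


# Hypothetical blacklist content for illustration only.
sample = """
# example entries
mybadwordsite   # matches www.mybadwordsite.com and other subdomains
cheap-pills
"""

regex = build_blacklist_regex(sample)
print(bool(regex.search("see http://www.mybadwordsite.com/page")))
print(bool(regex.search("see http://example.org/")))
```

Because the fragments are only tried after the <code>http://</code> prefix, the same word in ordinary page text, outside a URL, does not match.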
==What happens if spam slips through the automated systems== | ==What happens if spam slips through the automated systems== | ||
Please delete any spam that slips through the automated systems, and add the <code>{{template|spammer}}</code> tag on the | Please delete any [[spam]] that slips through the automated systems, and add the <code>{{template|spammer}}</code> tag on the [[spammer]]'s [[user page]]. If a new page has been created containing only spam, edit the page to delete the spam, and flag the page for deletion by adding the <code>{{template|delete}}</code> tag, ideally by using <tt><nowiki>{{delete|spam}}</nowiki></tt>. If you are a [[:Category:Active administrators of this wiki|Sysop]], please block the spammer in accordance with the [[WikiIndex:Blocking and banning policy]]. | ||
[[Category:Guidelines]] | [[Category:Guidelines]] | ||