
Key: MDL-17144
Type: Sub-task
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Dongsheng Cai
Reporter: Martin Dougiamas
Votes: 5
Watchers: 12
Moodle
MDL-17107

Implement spamcleaner.php tool to clean up damaged profiles

Created: 06/Nov/08 03:30 PM   Updated: 17/Feb/09 08:14 PM
Component/s: Administration
Affects Version/s: 1.9
Fix Version/s: 1.8.9, 1.9.5

File Attachments: 1. File spamcleaner.php (11 kB)
2. Text File spamcleaner_sesskey.txt (3 kB)

URL: http://docs.moodle.org/en/Reducing_spam_in_Moodle
Participants: A. T. Wyatt, Amr Hourani, Dongsheng Cai, Eloy Lafuente (stronk7), Helen Foster, Jeff Sherk, Martin Dougiamas, Martin Huntley, Ralf Hilgenstock and Tim Hunt
Security Level: None
QA Assignee: Tim Hunt
Resolved date: 12/Feb/09
Affected Branches: MOODLE_19_STABLE
Fixed Branches: MOODLE_18_STABLE, MOODLE_19_STABLE


Description
Implement a simple drop-in script to help delete profiles that already have spam in them.

Comments
Martin Dougiamas added a comment - 06/Nov/08 03:38 PM - edited
This is now available:

http://cvs.moodle.org/contrib/tools/spamcleaner/spamcleaner.php?view=co

Your improvements are very welcome! Please improve the script if you have write access already, or post ideas and patches here in the tracker.

The main thing I think we could add next would be a built-in list of phrases to search for, or perhaps a way to process all the texts with an external spam-checking engine.


Dongsheng Cai added a comment - 06/Nov/08 05:10 PM
Akismet could help; a PHP library is available too.


Eloy Lafuente (stronk7) added a comment - 06/Nov/08 06:09 PM
Add sesskey to protect it a bit.

(Just wondering whether we could start storing a table of hashes of content detected as spam, so that we could provide a web service based on it. Nothing "intelligent" for now, just a hash check.)

In the future this could evolve into something more complex (our own heuristic checker, a gateway to "ask" other spam detectors...). But the web service should remain the same.

Ciao
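
Editor's sketch of the hash-check idea above. This is only an illustration under stated assumptions, not code from spamcleaner.php: the class name SpamHashTable, the normalisation rule (lower-casing and collapsing whitespace) and the choice of SHA-256 are all hypothetical, and the web service layer is left out entirely.

```python
import hashlib
import re


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivially edited copies hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def spam_hash(text: str) -> str:
    """Hash of the normalised text (SHA-256 is an assumption, any stable hash works)."""
    return hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()


class SpamHashTable:
    """Hypothetical in-memory stand-in for the proposed table of spam hashes."""

    def __init__(self):
        self._hashes = set()

    def report(self, text: str) -> None:
        """Record a text that an admin has confirmed as spam."""
        self._hashes.add(spam_hash(text))

    def is_known_spam(self, text: str) -> bool:
        """Pure hash lookup: no heuristics, exact match after normalisation."""
        return spam_hash(text) in self._hashes
```

A lookup like this only catches copies that are identical after normalisation, which matches the "nothing intelligent for now" scope of the suggestion.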


Martin Dougiamas added a comment - 06/Nov/08 10:17 PM
Attached is a patch to add sesskey, though some YUI magic seems to be interfering when I try it. Dongsheng or Eloy, can you please look into it and update CVS?

A further feature idea would be to examine config settings at the start of this script and offer to fix forceloginforprofiles etc for the admin.


A. T. Wyatt added a comment - 07/Nov/08 02:11 PM
I do not exactly know how to attach related tracker issues, but here are some things that bear on this discussion:

Admin approval of email account creation/course enroll:
http://tracker.moodle.org/browse/MDL-9624

Completely remove user profiles (I can't see this one):
http://tracker.moodle.org/browse/MDL-17065

Preventing spam in user profiles:
http://tracker.moodle.org/browse/MDL-17107
(and of course this one http://tracker.moodle.org/browse/MDL-17144)


Helen Foster added a comment - 07/Nov/08 04:43 PM
Thanks atw.

(For future reference, you can add related issues by using the 'Link' link on the left of the page.)


Amr Hourani added a comment - 08/Nov/08 08:17 AM
Hi,
I changed the above script:

1- Removed rs_fetch_next_record($users) because it causes this error: Call to a member function FetchRow() on a non-object in C:\xampp\htdocs\vle\lib\dmllib.php on line 827

2- Added image search, because some spammers add images such as ads and p0rn0

3- Improved the SQL queries, joining all conditions into one query rather than running a query in a loop

4- Removed all redundant queries and moved them into one function, to avoid future mess for anyone who works on this script

5- Show one record per user who matches more than one keyword. The previous script showed more than one record for any user with more than one keyword in their profile.

hope this helps..

please upload it to CVS.

Cheers
Amr!
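
Editor's sketch of points 3 and 5 above: building one query with all keyword conditions OR-ed together, which also yields a single row per matching user. It uses Python's sqlite3 purely for illustration. The table name demo_user and the function find_spam_users are hypothetical and do not reflect Moodle's schema or the actual script.

```python
import sqlite3


def find_spam_users(conn, keywords):
    """Return one (id, username) row per user whose description has any keyword."""
    if not keywords:
        return []
    # One "description LIKE ?" per keyword, OR-ed together in a single query
    # instead of running one query per keyword in a loop.
    where = " OR ".join("description LIKE ?" for _ in keywords)
    sql = f"SELECT id, username FROM demo_user WHERE {where} ORDER BY id"
    # Keywords are bound as parameters; note that % and _ inside a keyword
    # would still act as LIKE wildcards in this sketch.
    params = [f"%{kw}%" for kw in keywords]
    return conn.execute(sql, params).fetchall()
```

Because the query selects from the user table once, a profile that contains several keywords still appears only once in the result.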


Dongsheng Cai added a comment - 10/Nov/08 11:40 AM
Thanks for your improvement, Amr.
I modified your code to use get_recordset_sql, which will help when dealing with large record sets.

Martin Dougiamas added a comment - 10/Nov/08 03:32 PM
I reviewed the code today and made quite a lot of cleanups. Please upgrade if you are using this script!

Amr Hourani added a comment - 10/Nov/08 08:01 PM
The new script doesn't include image search.

Dongsheng Cai added a comment - 10/Nov/08 09:26 PM
Hi Amr, image searching is included in "Autodetect common spam patterns".

Amr Hourani added a comment - 10/Nov/08 09:34 PM
Thanks Dongsheng, I noticed it.

Ralf Hilgenstock added a comment - 10/Nov/08 10:24 PM
Error message in 1.9.2 2080711
require_js: yui_json - file not found.

and

The "<img" entry means that all emoticons in profiles are flagged as spam.


Ralf Hilgenstock added a comment - 10/Nov/08 10:36 PM
Searching for 'cialis' also matches the word 'specialist' in a profile.
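
Editor's sketch of one way to avoid this false positive: match the keyword only as a whole word rather than as a substring. This is an illustration in Python, not what spamcleaner.php does; the function name contains_keyword is hypothetical, and the approach assumes keywords that start and end with letters or digits.

```python
import re


def contains_keyword(text: str, keyword: str) -> bool:
    """True if keyword occurs in text as a whole word, ignoring case."""
    # \b is a word boundary, so "cialis" inside "specialist" does not match.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

The trade-off is that whole-word matching misses keywords glued to other text, so it reduces false positives at the cost of some missed spam.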

Jeff Sherk added a comment - 10/Nov/08 11:55 PM
The Bad Behavior script can prevent spam accounts from ever even being created in the first place... this is an extremely effective script!!

See
http://tracker.moodle.org/browse/MDL-17162

and
http://www.bad-behavior.ioerror.us/


Martin Dougiamas added a comment - 11/Nov/08 11:07 AM
Thanks Jeff, that's a very interesting development for spambot detection, I'll add it to the main bug MDL-17107.

We still need a good clean up script, because of old sites already affected as well as those affected by human spammers.

TODO:

  • Use format_text when printing the user->description!
  • We need to add blog searching (posts table) and show the same user info (just replace the description with the blog content).
  • Can someone come up with a better image search that avoids smilies? (Will require regular expressions I think which will slow down the search a lot).
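
Editor's sketch for the third TODO item: a regular-expression image search that skips smilies. It is an assumption-laden illustration in Python, not the fix that was committed. In particular, the marker "/pix/s/" is assumed to identify emoticon images, and the names IMG_SRC, SMILEY_PATH_MARKER and non_smiley_images are hypothetical. A cheap substring test runs first so the regular expression is applied only to texts that contain an image tag at all.

```python
import re

# Captures the src value of each <img> tag, with or without quotes.
IMG_SRC = re.compile(r'''<img\b[^>]*?\bsrc\s*=\s*["']?([^"'\s>]+)''', re.IGNORECASE)

# Assumption: emoticon images are served from a path containing this marker.
SMILEY_PATH_MARKER = "/pix/s/"


def non_smiley_images(html: str) -> list:
    """Return the src of every image that does not look like an emoticon."""
    # Cheap pre-filter: skip the regular expression when there is no <img at all.
    if "<img" not in html.lower():
        return []
    return [
        src
        for src in IMG_SRC.findall(html)
        if SMILEY_PATH_MARKER not in src.lower()
    ]
```

A profile would then be flagged only when this function returns a non-empty list, so a description whose only images are emoticons passes.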

Dongsheng Cai added a comment - 11/Nov/08 02:10 PM
TODOs are fixed, please review.

Ralf Hilgenstock added a comment - 11/Nov/08 09:34 PM
Hi Dongsheng

Version 1.12 still finds smilies for me.


Dongsheng Cai added a comment - 05/Dec/08 10:13 AM
Hi Ralf, what's your search keyword? Is it "<img"?
I tested it, and it found no smilies for me.

Tim Hunt added a comment - 22/Jan/09 09:44 AM
Putting back 1.8.x fix version that got lost.

Eloy Lafuente (stronk7) added a comment - 31/Jan/09 07:57 AM
Hi, can this be considered resolved?

Martin Huntley added a comment - 12/Feb/09 12:17 AM
I'm using Moodle 1.9.2+ (Build: 20080723).

When I go to run spamcleaner.php, I get error:

require_js: yui_json - file not found.


Martin Dougiamas added a comment - 12/Feb/09 10:16 AM - edited
Martin, are you updating via CVS? If so, make sure you use the create new directories option. On command-line CVS that is -d, for example:

cvs -q update -dP

You might need to upgrade to Moodle 1.9.4 anyway.


Tim Hunt added a comment - 17/Feb/09 02:27 PM
Closing. Spam-cleaner is implemented and works nicely.

Helen Foster added a comment - 17/Feb/09 08:14 PM
Dongsheng, thanks for fixing, and thanks for everyone's comments.

We now have a documentation page: http://docs.moodle.org/en/Spam_cleaner