Moodle

Implement spamcleaner.php tool to clean up damaged profiles

Details

  • Type: Sub-task
  • Status: Closed
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 1.9
  • Fix Version/s: 1.8.9, 1.9.5
  • Component/s: Administration
  • Labels:
    None

Description

Implement a simple drop in script to help delete profiles that already have spam in them.

  1. spam.php
    17/Dec/09 4:46 AM
    0.2 kB
    Amr Hourani
  2. spamcleaner_sesskey.txt
    06/Nov/08 10:17 PM
    3 kB
    Martin Dougiamas
  3. spamcleaner.php
    17/Dec/09 4:46 AM
    13 kB
    Amr Hourani
  4. spamcleaner.php
    08/Nov/08 8:17 AM
    11 kB
    Amr Hourani

Activity

Martin Dougiamas added a comment - edited

This is now available:

http://cvs.moodle.org/contrib/tools/spamcleaner/spamcleaner.php?view=co

Your improvements are very welcome! Please improve the script if you have write access already, or post ideas and patches here in the tracker.

The main thing I think we could add next would be a built-in list of phrases to search for, or perhaps a way to process all the texts with an external spam-checking engine.

Dongsheng Cai added a comment -

Akismet could help, a PHP lib is available too.

Eloy Lafuente (stronk7) added a comment -

Add sesskey to protect it a bit.

(Just wondering whether we could start storing, somewhere, a table of hashes of content detected as spam, in order to provide a web service based on it. Nothing "intelligent" for now, just a hash check.)

In the future this could evolve into something more complex (our own heuristic checker, a gateway to "ask" other spam detectors...). But the web service should remain the same.
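
The hash-check idea could start as small as the sketch below. It is an illustration in Python rather than Moodle code: the normalisation rule, the `SpamHashStore` class and its method names are all assumptions, and a real service would keep the hashes in a database table behind the web service instead of in memory.

```python
import hashlib
import re


def normalise(text):
    """Lower-case and collapse whitespace so trivial edits hash identically."""
    return re.sub(r"\s+", " ", text).strip().lower()


def spam_hash(text):
    """Return the SHA-256 hex digest of the normalised text."""
    return hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()


class SpamHashStore:
    """Hypothetical in-memory stand-in for a table of known spam hashes."""

    def __init__(self):
        self._hashes = set()

    def report(self, text):
        """Record text that an admin has confirmed as spam."""
        self._hashes.add(spam_hash(text))

    def is_known_spam(self, text):
        """Pure hash lookup: no heuristics, exact match after normalisation."""
        return spam_hash(text) in self._hashes


store = SpamHashStore()
store.report("Buy  CHEAP pills\nnow")
print(store.is_known_spam("buy cheap pills now"))  # True
print(store.is_known_spam("I teach maths"))        # False
```

Because callers only ever ask "is this text known spam?", the lookup behind `is_known_spam` could later be swapped for a heuristic checker or an external detector without changing the interface.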

Ciao

Martin Dougiamas added a comment -

Attached is a patch to add sesskey, though some YUI magic seems to be interfering when I try it. Dongsheng or Eloy, can you please look into it and update CVS?

A further feature idea would be to examine config settings at the start of this script and offer to fix forceloginforprofiles etc for the admin.

A. T. Wyatt added a comment -

I do not exactly know how to attach related tracker issues, but here are some things that bear on this discussion:

Admin approval of email account creation/course enroll:
http://tracker.moodle.org/browse/MDL-9624

Completely remove user profiles (I can't see this one):
http://tracker.moodle.org/browse/MDL-17065

Preventing spam in user profiles:
http://tracker.moodle.org/browse/MDL-17107
(and of course this one http://tracker.moodle.org/browse/MDL-17144)

Helen Foster added a comment -

Thanks atw.

(For future reference, you can add related issues by using the 'Link' link on the left of the page.)

Amr Hourani added a comment -

Hi,
I changed the above script:

1- Removed the rs_fetch_next_record($users) call because it caused this error: Call to a member function FetchRow() on a non-object in C:\xampp\htdocs\vle\lib\dmllib.php on line 827

2- Added image search, because some spammers add images such as ads and p0rn0

3- Improved the SQL queries by joining all conditions into one query rather than running a query in a loop

4- Removed all redundant queries and moved them into one function, to avoid future mess for anyone who wants to work on this script

5- A user who matches more than one keyword is now shown only once; the previous script showed a separate record for every keyword found in that user's profile.
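
Points 3 and 5 can be pictured with a sketch like the following. It uses Python with an in-memory SQLite table rather than Moodle's database layer, and the reduced `mdl_user` layout and the `find_suspects` helper are assumptions made for illustration. All keyword conditions are OR-ed into a single query, so each matching user comes back once however many keywords match.

```python
import sqlite3


def find_suspects(conn, keywords):
    """Return (id, description) rows matching any keyword, one row per user."""
    if not keywords:
        return []
    # One LIKE condition per keyword, joined with OR in a single query.
    where = " OR ".join("description LIKE ?" for _ in keywords)
    # Note: % and _ inside a keyword would act as LIKE wildcards here.
    params = ["%" + kw + "%" for kw in keywords]
    sql = "SELECT id, description FROM mdl_user WHERE " + where + " ORDER BY id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mdl_user (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO mdl_user (id, description) VALUES (?, ?)",
    [
        (1, "I teach physics."),
        (2, "cheap pills and casino bonus"),  # matches two keywords
        (3, "best casino online"),
    ],
)

suspects = find_suspects(conn, ["pills", "casino"])
print([row[0] for row in suspects])  # [2, 3]
```

User 2 matches both keywords but appears only once, because the row is selected by a single WHERE clause instead of once per keyword.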

Hope this helps.

Please upload it to CVS.

Cheers
Amr!

Dongsheng Cai added a comment -

Thanks for your improvement, Amr.
I modified your code to use get_recordset_sql, which will help when dealing with large record sets.

Martin Dougiamas added a comment -

I reviewed the code today and made quite a lot of cleanups. Please upgrade if you are using this script!

Amr Hourani added a comment -

The new script doesn't include image search.

Dongsheng Cai added a comment -

Hi, Amr, image searching is included in "Autodetect common spam patterns"

Amr Hourani added a comment -

Thanks Dongsheng, I noticed it.

Ralf Hilgenstock added a comment -

Error message in 1.9.2 2080711
require_js: yui_json - file not found.

and

The "<img" entry causes all emoticons in profiles to be flagged as spam.

Ralf Hilgenstock added a comment -

The keyword 'cialis' also matches the word 'specialist' in a profile.
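
One way to avoid this kind of false positive is to match keywords only as whole words. The sketch below is in Python, not the spamcleaner's own code, and `build_keyword_pattern` is a hypothetical helper; a regular expression generally costs more than a plain substring search, so this trades some speed for accuracy.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword as a whole word."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # re.escape keeps characters such as '.' or '+' in a keyword literal.
    alternatives = "|".join(re.escape(kw) for kw in keywords)
    # \b asserts a word boundary on both sides of the keyword.
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


pattern = build_keyword_pattern(["cialis", "viagra"])

print(bool(pattern.search("I am a specialist in physics")))  # False
print(bool(pattern.search("Buy Cialis online")))             # True
```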

Jeff Sherk added a comment -

The Bad Behavior script can prevent spam accounts from ever even being created in the first place... this is an extremely effective script!!

See
http://tracker.moodle.org/browse/MDL-17162

and
http://www.bad-behavior.ioerror.us/

Martin Dougiamas added a comment -

Thanks Jeff, that's a very interesting development for spambot detection, I'll add it to the main bug MDL-17107.

We still need a good clean up script, because of old sites already affected as well as those affected by human spammers.

TODO:

  • Use format_text when printing the user->description!
  • We need to add blog searching (posts table) and show the same user info (just replace the description with the blog content).
  • Can someone come up with a better image search that avoids smilies? (Will require regular expressions I think which will slow down the search a lot).
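
For the last item, a regular-expression filter might look roughly like this sketch. It is written in Python for illustration and is not the approach the script ended up using; in particular, the assumption that emoticon images are served from a path containing `/pix/s/` would need checking against the site's actual emoticon markup, and `has_non_smiley_image` is a hypothetical name.

```python
import re

# Assumption: emoticon images are served from a path containing "/pix/s/".
SMILEY_PATH = "/pix/s/"

# Matches an <img ...> tag and captures the value of its src attribute.
IMG_SRC = re.compile(r"""<img\b[^>]*?\bsrc\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)


def has_non_smiley_image(html):
    """True if the text embeds at least one image that is not an emoticon."""
    return any(SMILEY_PATH not in src for src in IMG_SRC.findall(html))


smiley = 'Hi <img alt="smile" src="http://example.org/pix/s/smiley.gif" />'
advert = 'Visit <img src="http://spam.example/ad.jpg">'

print(has_non_smiley_image(smiley))           # False
print(has_non_smiley_image(advert))           # True
print(has_non_smiley_image(smiley + advert))  # True
```

To limit the slowdown, a check like this could run only on profiles that a cheap "<img" substring search has already picked out.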
Dongsheng Cai added a comment -

TODOs are fixed, please review.

Ralf Hilgenstock added a comment -

Hi Dongsheng

Version 1.12 still finds smilies for me.

Dongsheng Cai added a comment -

Hi Ralf, what is your search keyword? Is it "<img"?
I tested it, and no smilies were matched for me.

Tim Hunt added a comment -

Putting back 1.8.x fix version that got lost.

Eloy Lafuente (stronk7) added a comment -

Hi, can this be considered resolved?

Martin Huntley added a comment -

I'm using Moodle 1.9.2+ (Build: 20080723).

When I go to run spamcleaner.php, I get error:

require_js: yui_json - file not found.

Martin Dougiamas added a comment - edited

Martin, are you updating via CVS? If so, make sure you use the create new directories option. On command-line CVS that is -d, for example:

cvs -q update -dP

You might need to upgrade to Moodle 1.9.4 anyway.

Tim Hunt added a comment -

Closing. Spam-cleaner is implemented and works nicely.

Helen Foster added a comment -

Dongsheng, thanks for fixing, and thanks for everyone's comments.

We now have a documentation page: http://docs.moodle.org/en/Spam_cleaner

Amr Hourani added a comment -

Keywords separated for non-English auto terms.

Martin Dougiamas added a comment - edited

Thanks Amr, but would it be possible to post your changes as a patch rather than a whole file? Makes it easier to see what you changed.

Ideally it should be in a new bug too, since this one is closed.

Amr Hourani added a comment -

The file diff and the new lang file are attached here.

