Moodle

auth/ldap sync temp table misconfigured - Slow or Failed Syncronization

Details

Description

In the creation of the temporary table in auth/ldap/auth.php's sync_users() function, there are two issues:
1) the width of the username column is set to 64, whereas mdl_user.username is 100 characters.
C.f. auth/ldap/auth.php version 1.47, lines 538, 543, 549, 555

2) there is no reference to the mnethostid in the temporary table.
This second issue becomes a significant problem when sync_users() performs the User Additions section. Because mdl_user.username is only indexed in mdl_user_mneuse_uix, the select is EXTREMELY slow and resource intensive. MySQL would lock up for several minutes on our site. I added a nonunique index to mdl_user.username to avoid the issue. Alternately, the local mnethostid could be utilized in the temptable and SQL.
C.f. auth/ldap/auth.php version 1.47, line 789.

Issue Links

Activity

Hide
Petr Škoda (skodak) added a comment -

adding more ppl as watcher, if nobody steps up I will at least fix the field length, that mnethostid would need more thinking and testing I am afraid

Show
Petr Škoda (skodak) added a comment - adding more ppl as watcher, if nobody steps up I will at least fix the field length, that mnethostid would need more thinking and testing I am afraid
Hide
Iñaki Arenaza added a comment -

I don't think adding a non-unique index to mdl_user.username is worth it. There are only a very few places in the code (7 in HEAD as of today) where we query mdl_user with only the username and no additional indexed field. By the way, I think those places should be fixed, as nearly all of them could just add the mnethostid field to the query with the value of $CFG->mnet_localhost_id. You can find those places with the following command:

rgrep "get_record *( *[\"']user[\"'] *, *[\"']username[\"']" *

In this particular case, adding the local mnethostid to the temporary table and the SQL queries is the most sensible thing.

Saludos. Iñaki.

Show
Iñaki Arenaza added a comment - I don't think adding a non-unique index to mdl_user.username is worth it. There are only a very few places in the code (7 in HEAD as of today) where we query mdl_user with only the username and no additional indexed field. By the way, I think those places should be fixed, as nearly all of them could just add the mnethostid field to the query with the value of $CFG->mnet_localhost_id. You can find those places with the following command: rgrep "get_record *( *[\"']user[\"'] *, *[\"']username[\"']" * In this particular case, adding the local mnethostid to the temporary table and the SQL queries is the most sensible thing. Saludos. Iñaki.
Hide
Martin Dougiamas added a comment -

Sorry we're out of time for 1.9 -> bumping this to 1.9.1

Show
Martin Dougiamas added a comment - Sorry we're out of time for 1.9 -> bumping this to 1.9.1
Hide
Bruce Webster added a comment -

Running auth_ldap_sync_users.php with over 24000 users gets stuck on this one query for 37 Minutes!!! O users^2 rows. See below. I reduced this to about 1 or 2 seconds by adding an index to username in mdl_user.

  1. Query_time: 2213 Lock_time: 0 Rows_sent: 23 Rows_examined: 290899001
    use moodle_19_stable;
    SELECT e.username, e.username FROM mdl_extuser e LEFT JOIN mdl_user u ON e.username = u.username WHERE u.id IS NULL;
Show
Bruce Webster added a comment - Running auth_ldap_sync_users.php with over 24000 users gets stuck on this one query for 37 Minutes!!! O users^2 rows. See below. I reduced this to about 1 or 2 seconds by adding an index to username in mdl_user.
  1. Query_time: 2213 Lock_time: 0 Rows_sent: 23 Rows_examined: 290899001 use moodle_19_stable; SELECT e.username, e.username FROM mdl_extuser e LEFT JOIN mdl_user u ON e.username = u.username WHERE u.id IS NULL;
Hide
Suud Ridzuan added a comment -

Using 1.9.4+ with LDAP users of 2236 teachers and 24206 students. Takes more than 4 secs for each user using auth_ldap_sync_users.php. In the end, it took more than 24 hours.

Slapd always stays at 100 core utilization but MySQL is almost 0.

I hope to have a fix on this as it is now a great hurdle to setup Moodle instances in my Institution.

Show
Suud Ridzuan added a comment - Using 1.9.4+ with LDAP users of 2236 teachers and 24206 students. Takes more than 4 secs for each user using auth_ldap_sync_users.php. In the end, it took more than 24 hours. Slapd always stays at 100 core utilization but MySQL is almost 0. I hope to have a fix on this as it is now a great hurdle to setup Moodle instances in my Institution.
Hide
Iñaki Arenaza added a comment -

The attached path could be used as a basis for a real solution. It does make a difference with MySQL (from 60+ minutes down to 1m40s for 20,000 LDAP users, with both mysql and ldap server on the local machine), although it doesn't seem to make a big difference with Postgres (just a few seconds, if any).

I'm not able to test with Oracle or MSSQL and I'm not even sure if the syntax is right for them, so someone more knowledgeable should have a look at those lines.

Saludos. Iñaki.

Show
Iñaki Arenaza added a comment - The attached path could be used as a basis for a real solution. It does make a difference with MySQL (from 60+ minutes down to 1m40s for 20,000 LDAP users, with both mysql and ldap server on the local machine), although it doesn't seem to make a big difference with Postgres (just a few seconds, if any). I'm not able to test with Oracle or MSSQL and I'm not even sure if the syntax is right for them, so someone more knowledgeable should have a look at those lines. Saludos. Iñaki.
Hide
Scott Karren added a comment -

This patch is for ldap/auth.php. Can this be incorporated into the auth.php for CAS? I have 30,000 + users that I am syncing nightly. With a non-unique index to mdl_user.username it still takes 20 minutes to sync all users. Not that I am complaining, before the index it would take 5 + hours. This is with LDAP and MySQL on different machines.

Show
Scott Karren added a comment - This patch is for ldap/auth.php. Can this be incorporated into the auth.php for CAS? I have 30,000 + users that I am syncing nightly. With a non-unique index to mdl_user.username it still takes 20 minutes to sync all users. Not that I am complaining, before the index it would take 5 + hours. This is with LDAP and MySQL on different machines.
Hide
Iñaki Arenaza added a comment -

Sure it can be incorporated Scott . I'll update the patch in a few hours.

Saludos. Iñaki.

Show
Iñaki Arenaza added a comment - Sure it can be incorporated Scott . I'll update the patch in a few hours. Saludos. Iñaki.
Hide
Scott Karren added a comment -

Great, I'd like to test this and see how it performs by removing the index I created.

Scott

Show
Scott Karren added a comment - Great, I'd like to test this and see how it performs by removing the index I created. Scott
Hide
Iñaki Arenaza added a comment -

Scott,

I wouln't expect any major peformance gain with respect to your current setup (with an index on mdl_user.username).

Iñaki.

Show
Iñaki Arenaza added a comment - Scott, I wouln't expect any major peformance gain with respect to your current setup (with an index on mdl_user.username). Iñaki.
Hide
Iñaki Arenaza added a comment -

Scott,

here's an updated patch that includes the same fix for CAS. I have no way to test the fix with CAS as I don't have access to a CAS server, but given the similarities between the LDAP code and the CAS code, it should work as expected (but try it on a test enviroment first )

Saludos. Iñaki.

Show
Iñaki Arenaza added a comment - Scott, here's an updated patch that includes the same fix for CAS. I have no way to test the fix with CAS as I don't have access to a CAS server, but given the similarities between the LDAP code and the CAS code, it should work as expected (but try it on a test enviroment first ) Saludos. Iñaki.
Hide
Iñaki Arenaza added a comment -

Scott,

were you finally able to test the patch?

Saludos.
Iñaki.

Show
Iñaki Arenaza added a comment - Scott, were you finally able to test the patch? Saludos. Iñaki.
Hide
Mathieu Petit-Clair added a comment -

Bonjour Iñaki,

I just tested the patch ...

  • I had a slow query every time the ldap sync would run - not anymore
  • the whole site would slow down for at least a few minutes - not anymore

Looks good to me...

Mat

Show
Mathieu Petit-Clair added a comment - Bonjour Iñaki, I just tested the patch ...
  • I had a slow query every time the ldap sync would run - not anymore
  • the whole site would slow down for at least a few minutes - not anymore
Looks good to me... Mat
Hide
Iñaki Arenaza added a comment -

Thanks Mathieu,

I'd like Eloy (or someone that knows MS SQL and Oracle) could have a look at it so we could apply it

Saludos,
Iñaki.

Show
Iñaki Arenaza added a comment - Thanks Mathieu, I'd like Eloy (or someone that knows MS SQL and Oracle) could have a look at it so we could apply it Saludos, Iñaki.
Hide
Séverin Terrier added a comment -

You can also see these threads, that are related :
http://moodle.org/mod/forum/discuss.php?d=69277
http://moodle.org/mod/forum/discuss.php?d=63874

For me, adding "CHARACTER SET utf8" (and no collation type) fixed the problem...

Show
Séverin Terrier added a comment - You can also see these threads, that are related : http://moodle.org/mod/forum/discuss.php?d=69277 http://moodle.org/mod/forum/discuss.php?d=63874 For me, adding "CHARACTER SET utf8" (and no collation type) fixed the problem...
Hide
Jeffery Watkins added a comment -

Can this patch be reversed by replacing the changed auth.php file with the standard or are changes made in the database structure?

Show
Jeffery Watkins added a comment - Can this patch be reversed by replacing the changed auth.php file with the standard or are changes made in the database structure?
Hide
Iñaki Arenaza added a comment -

Hi Jeffry,

no changes are made in the database structure, so it's safe to replace the changed file with the standard one.

Saludos,
Iñaki.

Show
Iñaki Arenaza added a comment - Hi Jeffry, no changes are made in the database structure, so it's safe to replace the changed file with the standard one. Saludos, Iñaki.
Hide
Séverin Terrier added a comment -

I had this problem on my Moodle (nearly 25000 users), and i've updated with the patch.

Before : updating users : 1 mn ; creating new users : 35 mn
After : 1 mn for the whole process

Would be cool to have it include in 1.9...

Show
Séverin Terrier added a comment - I had this problem on my Moodle (nearly 25000 users), and i've updated with the patch. Before : updating users : 1 mn ; creating new users : 35 mn After : 1 mn for the whole process Would be cool to have it include in 1.9...
Hide
Dan Poltawski added a comment -

Eloy: Have you seen this patch? It seems like something we should really get integrated!

Show
Dan Poltawski added a comment - Eloy: Have you seen this patch? It seems like something we should really get integrated!
Hide
Jonathan Robson added a comment -

Why does this issue say that it's fixed in 1.9.8 when the status is still "Open" and the patch(es) do not appear to have been committed?

Show
Jonathan Robson added a comment - Why does this issue say that it's fixed in 1.9.8 when the status is still "Open" and the patch(es) do not appear to have been committed?
Hide
Jonathan Robson added a comment -

Sorry. I meant to add... Is this no longer an issue in 1.9.8?

Show
Jonathan Robson added a comment - Sorry. I meant to add... Is this no longer an issue in 1.9.8?
Hide
Iñaki Arenaza added a comment -

Hi Jonathan,

As long as the bug remains in the 'Open' or 'Work in progress' state, the 'Fix Version/s' is the version in which the bug is planned to be fixed.

It refers to the version in which the bug was actually fixed only when the bug is in 'Closed' or 'Resolved' state.

Saludos,
Iñaki.

Show
Iñaki Arenaza added a comment - Hi Jonathan, As long as the bug remains in the 'Open' or 'Work in progress' state, the 'Fix Version/s' is the version in which the bug is planned to be fixed. It refers to the version in which the bug was actually fixed only when the bug is in 'Closed' or 'Resolved' state. Saludos, Iñaki.
Hide
Iñaki Arenaza added a comment -

Just a small note to comment that this has been as part of the LDAP and CAS refactor that went into 2.0 a few weeks ago.

Unfortunately it's still pending on 1.9.x, waiting for some OK's (mainly from Eloy or someone else that can confirm that the SQL for the temp table creation is ok).

Saludos.
Iñaki.

Show
Iñaki Arenaza added a comment - Just a small note to comment that this has been as part of the LDAP and CAS refactor that went into 2.0 a few weeks ago. Unfortunately it's still pending on 1.9.x, waiting for some OK's (mainly from Eloy or someone else that can confirm that the SQL for the temp table creation is ok). Saludos. Iñaki.
Hide
Eloy Lafuente (stronk7) added a comment -

Oops, just, thanks to Iñaki's previous comment, this issue became visible in my tracker flood... looking to the patches it seems that all you are doing is to add the mnethostid column to the temp table and queries, correct? If the previous version was working ok (the temptables create/drop), I cannot see anything against this to land in 19_STABLE.

Said that, Moodle 2.0 has proper temptables support, so I hope those "harcoded-manual" temptables statements aren't being used there (but correct manager->create_temp_table() and manager->drop_temp_table() instead).

So, if the diffs above are just that (add new column) I'd vote +1 for that.

Thanks and ciao

Show
Eloy Lafuente (stronk7) added a comment - Oops, just, thanks to Iñaki's previous comment, this issue became visible in my tracker flood... looking to the patches it seems that all you are doing is to add the mnethostid column to the temp table and queries, correct? If the previous version was working ok (the temptables create/drop), I cannot see anything against this to land in 19_STABLE. Said that, Moodle 2.0 has proper temptables support, so I hope those "harcoded-manual" temptables statements aren't being used there (but correct manager->create_temp_table() and manager->drop_temp_table() instead). So, if the diffs above are just that (add new column) I'd vote +1 for that. Thanks and ciao
Hide
Iñaki Arenaza added a comment -

Hi Eloy,

yes, your assessment of the patch is right, and it works OK in MySQL and Postgresql (the db's I've been able to test with). But I don't have MS SQL and Oracle at hand, and I don't master their SQL syntax peculiarities with respect to temporary tables. So I was asking for someone with more knowledge than me in that area to have a look at the SQL syntax used for temp table creation with MS SQL and Oracle.

With respect to Moodle 2.0, current code actually uses manager->{create_temp_table,drop_temp_table} as you told me a long time ago

Saludos.
Iñaki.

Show
Iñaki Arenaza added a comment - Hi Eloy, yes, your assessment of the patch is right, and it works OK in MySQL and Postgresql (the db's I've been able to test with). But I don't have MS SQL and Oracle at hand, and I don't master their SQL syntax peculiarities with respect to temporary tables. So I was asking for someone with more knowledge than me in that area to have a look at the SQL syntax used for temp table creation with MS SQL and Oracle. With respect to Moodle 2.0, current code actually uses manager->{create_temp_table,drop_temp_table} as you told me a long time ago Saludos. Iñaki.
Hide
Séverin Terrier added a comment -

This problem is still here in Moodle 1.9.13+ from today (with MySQL 5.1.12).
It's very annoying, as new users are not created
As i said 2 years ago, just adding "CHARACTER SET utf8" allows new users to be created
It would be good to be added (at least for MySQL, even if it doesn't correct it for all databases).

Show
Séverin Terrier added a comment - This problem is still here in Moodle 1.9.13+ from today (with MySQL 5.1.12). It's very annoying, as new users are not created As i said 2 years ago, just adding "CHARACTER SET utf8" allows new users to be created It would be good to be added (at least for MySQL, even if it doesn't correct it for all databases).
Hide
Jonathan Moore added a comment -

We would like to see this fixed to. We have sites where this is still happening as well.

Show
Jonathan Moore added a comment - We would like to see this fixed to. We have sites where this is still happening as well.
Hide
Dakota Duff added a comment -

We applied the patch manually to a 1.9.8 site and saw a significant improvement. Prior to change the site was crashing after the sync ran for a few hours. After the change the entire sync completed in five or six minutes.

Show
Dakota Duff added a comment - We applied the patch manually to a 1.9.8 site and saw a significant improvement. Prior to change the site was crashing after the sync ran for a few hours. After the change the entire sync completed in five or six minutes.
Hide
Justin Filip added a comment -

I've got a Github branch where I've taken this code and applied it to the current MOODLE_19_STABLE code. I also took the liberty of re-writing the temp table creation to use XMLDB and the create_temporary_table() DDL function.

I also gathered some performance numbers from MySQL (5.1.58) and Postgres (9.0.5) adding 117 incoming LDAP user accounts into Moodle:

  • Branch DB Number of users already existing in Moodle Time (seconds)
    MOODLE_19_STABLE MySQL 1 26.066378
    MDL-13458_m1 MySQL 1 25.595701
    MOODLE_19_STABLE MySQL 99,811 40.808697
    MDL-13458_m1 MySQL 99,811 25.61432
    MOODLE_19_STABLE Postgres 1 26.031088
    MDL-13458_m1 Postgres 1 26.617801
    MOODLE_19_STABLE Postgres 99085 29.79877
    MDL-13458_m1 Postgres 99085 27.743119
Show
Justin Filip added a comment - I've got a Github branch where I've taken this code and applied it to the current MOODLE_19_STABLE code. I also took the liberty of re-writing the temp table creation to use XMLDB and the create_temporary_table() DDL function. I also gathered some performance numbers from MySQL (5.1.58) and Postgres (9.0.5) adding 117 incoming LDAP user accounts into Moodle:
  • Branch DB Number of users already existing in Moodle Time (seconds)
    MOODLE_19_STABLE MySQL 1 26.066378
    MDL-13458_m1 MySQL 1 25.595701
    MOODLE_19_STABLE MySQL 99,811 40.808697
    MDL-13458_m1 MySQL 99,811 25.61432
    MOODLE_19_STABLE Postgres 1 26.031088
    MDL-13458_m1 Postgres 1 26.617801
    MOODLE_19_STABLE Postgres 99085 29.79877
    MDL-13458_m1 Postgres 99085 27.743119
Hide
Justin Filip added a comment -

Added Github patch info into appropriate fields.

Show
Justin Filip added a comment - Added Github patch info into appropriate fields.
Hide
Justin Filip added a comment -

Here are some results from running the updated patch on a very large client installation we host. Using stock Moodle code it took 3 hours 44 minutes to complete the LDAP user sync. With the patch it took 17 minutes to complete.

stock Moodle auth.php:

  • Got 124146 records from LDAP
  • User entries to update: 184401
  • /data/html/auth/ldap/auth_ldap_sync_users.php time: 13484.938467s memory_total: 48168176B (45.9MB) memory_growth: 47806608B (45.6MB) includecount: 121 ticks: 1348494 user: 284984 sys: 22922
    cuser: 0 csys: 0 serverload: 0.38 rcache: 0/0
    

Using MDL-13458:

  • Got 126425 records from LDAP
  • User entries to update: 184468
  • /data/html/auth/ldap/auth_ldap_sync_users.php time: 1021.767979s memory_total: 33858952B (32.3MB) memory_growth: 33497384B (31.9MB) includecount: 63 ticks: 102177 user: 17333 sys: 3054
    cuser: 0 csys: 0 serverload: 0.16 rcache: 0/0
    
Show
Justin Filip added a comment - Here are some results from running the updated patch on a very large client installation we host. Using stock Moodle code it took 3 hours 44 minutes to complete the LDAP user sync. With the patch it took 17 minutes to complete.

stock Moodle auth.php:

  • Got 124146 records from LDAP
  • User entries to update: 184401
  • /data/html/auth/ldap/auth_ldap_sync_users.php time: 13484.938467s memory_total: 48168176B (45.9MB) memory_growth: 47806608B (45.6MB) includecount: 121 ticks: 1348494 user: 284984 sys: 22922
    cuser: 0 csys: 0 serverload: 0.38 rcache: 0/0
    

Using MDL-13458:

  • Got 126425 records from LDAP
  • User entries to update: 184468
  • /data/html/auth/ldap/auth_ldap_sync_users.php time: 1021.767979s memory_total: 33858952B (32.3MB) memory_growth: 33497384B (31.9MB) includecount: 63 ticks: 102177 user: 17333 sys: 3054
    cuser: 0 csys: 0 serverload: 0.16 rcache: 0/0