Moodle Community Sites

new git.moodle.org mirror

Details

  • Type: New Feature
  • Status: Resolved
  • Priority: Minor
  • Resolution: Fixed
  • Component/s: CVS repository
  • Labels:
    None

Description

Now that git usage is ramping up, it would be really good if we could have an official git.moodle.org mirror of the CVS repo.

Catalyst have one at http://git.catalyst.net.nz/ but that site contains a whole lot of extra Catalyst-specific branches not useful to Moodle users.

We've created a proof of concept with just the Moodle stuff and a Moodle-themed gitweb here:
http://moodle.git.catalyst.net.nz

Catalyst would be happy to host an official git mirror, but in conversations with MD it came up that this could be better hosted by HQ: the Catalyst repo is synchronized every hour, but it could potentially be synchronized every 5 minutes if hosted on the same server as cvs.moodle.org.

When/if HQ decide to host this internally, it would be really good if HQ could base the install on the existing Catalyst git repo so that all the commit ids remain the same and don't cause conflicts for people currently using the Catalyst public git repo. If/when HQ hosts a git.moodle.org, we will change our systems at Catalyst to sync with git.moodle.org instead of our locally generated import.

  1. check_drift.sh
    21/Apr/09 12:01 PM
    0.5 kB
    Francois Marier
  2. moodle-gitimport.sh
    21/Apr/09 12:01 PM
    2 kB
    Francois Marier
  3. moodlemirror.sh
    21/Apr/09 12:02 PM
    0.5 kB
    Francois Marier
  4. smartdiff
    21/Apr/09 12:01 PM
    0.6 kB
    Francois Marier
  5. tracker_links.patch
    28/Apr/09 2:34 PM
    0.5 kB
    Francois Marier

Activity

Dan Marsden added a comment -

adding Eloy as a watcher - also there are a couple of people at Catalyst that would be happy to help anyone from HQ set this up, and we are happy to share the code we use for drift checking (and our documentation on how to fix drift!)

Martin Dougiamas added a comment -

Awesome, thanks. I'd like us to set this up next week. I'm guessing the resource requirements are pretty low - so that we can put it on the same server as cvs.moodle.org.

Is the drift caused by the current 1 hour lag or something else?

Francois Marier added a comment -

The resource requirements are not very big, but we should probably keep it to once every hour, at least initially. It could probably be increased to once every 30 minutes at some point.

Here's what happens every hour:

1- a local CVS mirror is updated
2- git-cvs exports all of the CVS commits from that mirror to a text file
3- git-cvs imports these commits from the text file into its own repository
4- new commits in the importer repository are pushed to the public git repository

We also have a daily cron job which checks for drift between CVS and git by doing a checkout from both and then running a diff over them.

The drift isn't caused by the 1 hour lag, but rather by something that git-cvs is not able to figure out. It requires a manual fixup once in a while.
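
The daily drift check (a checkout from both sides, then a diff over them) could look roughly like the sketch below. This is not the attached check_drift.sh: the function name and the two directory arguments are assumptions for illustration, both checkouts are expected to exist already, and the `-x` exclude option assumes GNU or BSD diff.

```shell
#!/bin/sh
# Hypothetical sketch, not the attached check_drift.sh.
# Compares a CVS checkout with a git checkout of the same branch and
# prints a unified diff of any differences (the "drift").

# check_drift CVS_CHECKOUT GIT_CHECKOUT
# Exit status: 0 = no drift, 1 = drift found, >1 = diff itself failed.
check_drift() {
    cvs_tree=$1
    git_tree=$2
    # -r: recurse, -u: unified output, -N: treat missing files as empty
    # so added or deleted files show up in full.
    # The CVS/ and .git/ metadata directories are excluded because they
    # only exist on one side and are not part of the source tree.
    diff -ruN -x CVS -x .git "$git_tree" "$cvs_tree"
}
```

The git tree is passed to diff first so that the output reads as "what has to change in git to match CVS".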

Martin Dougiamas added a comment -

Thanks for that info, very helpful.

The bit about git-cvs not being able to figure out simple patches is concerning! What kind of things do you mean? How are they fixed?

I'd be a bit worried about sending bad/incomplete/old data to developers using git ...

Dan Poltawski added a comment -

About the drift - it has to be manually fixed to get them back in sync and does disrupt history a bit:
http://git.catalyst.net.nz/gw?p=moodle-r2.git&a=search&h=HEAD&st=commit&s=cvsimport
http://git.catalyst.net.nz/gw?p=moodle-r2.git&a=search&h=HEAD&st=commit&s=drift

I agree about HQ git being based off the Catalyst git repo hashes. But if/when we move to git as the primary repository, we should consider breaking the existing commit ids by creating a pristine copy of the CVS history and fixing the drift issues that come up one by one, to ensure the correct history remains.

Francois Marier added a comment -

While I don't know what is the underlying cause of it, I have noticed that the drift is often caused by large deletions in HEAD. Drift in other branches seems to be much less frequent.

This is quite easy to detect however since the output of the drift detection script I wrote is a diff between CVS and git. Fixing the drift simply involves committing that diff as a "fixup" commit (as noted by Dan).

So it's a straightforward operation and because Catalyst monitors the repo every day, any minor drift like we have seen in the recent past got fixed within a day or two.

I agree that it would be better to send 100% accurate data, which is easily done by switching to git as the main development VCS for core.

On the other hand, I don't think this minor drift is a big problem for the time being:

  • drift is now detected reliably
  • it's not frequent (with the exception of last month where I noticed that my drift detection script missed a few things)
  • it's fixed quickly
  • it mostly affects HEAD
  • often has to do with deleted files
Martin Dougiamas added a comment -

Could you send us your scripts and anything else Jordan needs to get started?

Francois Marier added a comment -

These are the scripts we use for the Catalyst git-cvs sync.

Here is the crontab for the "gitimport" user (which owns all of the repo files):

MAILTO=francois@catalyst.net.nz

# MOODLE #############
42 * * * * ~/scripts/bin/moodlemirror.sh
# this runs 50 minutes past the hour, closely after the rsync
# cvsimport will skip commits in the last 10m which
# means that we're <mostly> ok if rsync leaves a truncated commit
50 * * * * (date; ~/scripts/bin/moodle-gitimport.sh 2>&1 ) >> /git/import/moodle/log/gitimport.log || echo "Moodle sync failed"
# every day at 4am we check to make sure that we don't have any drift
# between upstream CVS and our git repo
0 4 * * * ~/scripts/bin/check_drift.sh
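
The import entry in the crontab relies on a small shell idiom: everything the import prints goes to the log file, while the `|| echo` only produces output when the import exits non-zero, and cron mails any output to the MAILTO address. A minimal sketch of the same pattern as a function follows; `run_logged` and its arguments are hypothetical names, not part of the attached scripts.

```shell
#!/bin/sh
# Hypothetical helper illustrating the crontab idiom
#   (date; cmd 2>&1) >> logfile || echo "failed"
# It is not taken from the attached scripts.

# run_logged LOGFILE LABEL COMMAND [ARGS...]
run_logged() {
    logfile=$1
    label=$2
    shift 2
    # The subshell's exit status is that of its last command, so a failing
    # COMMAND triggers the echo. Only the echo reaches stdout, which is
    # what cron would mail out.
    ( date; "$@" 2>&1 ) >> "$logfile" || echo "$label sync failed"
}

# Example (commented out; paths as in the crontab):
#   run_logged /git/import/moodle/log/gitimport.log Moodle ~/scripts/bin/moodle-gitimport.sh
```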
Francois Marier added a comment -

For completeness' sake, this is the mirror script we use for the CVS repo.

Eloy Lafuente (stronk7) added a comment -

Just for reference, for some months I've had git cvsimport running here weekly and I've hit exactly the same missing revision as the one manually fixed in the NZ repo.

Also, I've found this, about missing revisions... perhaps that could be the cause:

http://stackoverflow.com/questions/702980/why-is-git-cvsimport-missing-one-major-patchset

I'll re-import everything here with -p -x to see if that fixes the problem in the long term, though I don't know if it has other implications.

Also, about keeping the NZ git hashes... what will happen once the moodle.org read-only repo starts running its own (hourly/daily) cvsimports? Will new hashes definitely be different from that point on? Just wondering whether it wouldn't be better to have that "pristine" copy from day 0 instead of relying on the Catalyst repo for the initial checkout.

Re-note I don't know too much about git, just commenting, asking... ciao

Francois Marier added a comment -

That -x is pretty interesting. I might do some tests with that (keeping two syncs in parallel and checking for differences...).

The plan at the moment is to base the Moodle.org repo off of the Catalyst one and to disable the Catalyst importer once the Moodle one is ready to go.

Eloy Lafuente (stronk7) added a comment -

Hi Francois,

I've finished my new "cvsimport -p -x" run and it still has missing bits.

For example, the "/search/querylib.php" is missing this line in my git import:

+* @version prepared for 2.0

Looking at CVS, it seems that there are two commits with the same message:

http://cvs.moodle.org/moodle/search/querylib.php?view=log (revisions 1.14 and 1.15)

And curiously... it seems that also both commits are in the NZ git repo:

http://git.catalyst.net.nz/gw?p=moodle-r2.git;a=history;f=search/querylib.php

but in some sort of reversed order!! So Dan had to fix the drift some hours later to keep it ok.

So... it seems that the cvsimport script has some sort of "time bug" or so, for commits having the same message, or whatever... who knows... just for your info... ciao
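
One possible reading of this, offered only as an illustration: cvsps rebuilds changesets from per-file CVS revisions by grouping revisions that share an author and log message and fall within a time window (its `-z` fuzz, documented as 300 seconds by default). The toy model below is not the real cvsps algorithm, and the input format is made up. It only shows how two nearby commits with an identical message end up in the same group, after which message and time alone no longer tell them apart.

```shell
#!/bin/sh
# Toy model only: NOT the real cvsps algorithm.
# Input: one revision per line, sorted by time, tab-separated:
#   epoch_seconds <TAB> author <TAB> file <TAB> revision <TAB> message
# Output: group number, file, revision and message per line. A revision
# joins the previous group when author and message match and it is no
# more than FUZZ seconds after the previous revision.

# group_revisions [FUZZ_SECONDS]   (reads stdin)
group_revisions() {
    fuzz=${1:-300}   # 300 s is the documented cvsps default for -z
    awk -F '\t' -v fuzz="$fuzz" '
        {
            same = (NR > 1 && $2 == author && $5 == msg && $1 - last <= fuzz)
            if (!same) group++
            author = $2; msg = $5; last = $1
            printf "%d\t%s\t%s\t%s\n", group, $3, $4, $5
        }'
}
```

Feeding it two revisions of the same file made a minute apart with the same message puts both in one group, which is the situation described for revisions 1.14 and 1.15 above.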

Eloy Lafuente (stronk7) added a comment -

Just found this:

http://www.kernel.org/pub/software/scm/git/docs/git-cvsimport.html#issues

perhaps related (as commits are really near). Re-ciao

Francois Marier added a comment -

The gitweb interface is up and running:

http://git.moodle.org

The sync is now happening every 30 minutes.

All that's left is to figure out how we want to let people clone the repo (using mirrors or not?).

Francois Marier added a comment -

The only thing left is to enable the git daemon so that people/mirrors can pull changes in:

git-daemon --verbose --base-path=/git/public

Running the daemon manually I have just verified that it works fine and I was able to clone the repo using:

git clone git://git.moodle.org/moodle.git
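
One detail for anyone reproducing this setup: by default the git daemon only serves repositories that contain a `git-daemon-export-ok` marker file, unless it is started with `--export-all`. How the Moodle server was configured is not stated here. The helper below is a hypothetical sketch that marks every `*.git` directory directly under a base path as exportable; the `BASE/<name>.git` layout is an assumption.

```shell
#!/bin/sh
# Hypothetical helper; the base path layout (BASE/<name>.git) is an assumption.

# mark_exportable BASE_PATH
mark_exportable() {
    base=$1
    for repo in "$base"/*.git; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$repo" ] || continue
        touch "$repo/git-daemon-export-ok"
    done
}

# Example (commented out; path taken from the daemon command above):
#   mark_exportable /git/public
```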

Francois Marier added a comment -

The patch to gitweb that we use to turn MDL-xxxxx and CONTRIB-xxxxxx into links to the Moodle tracker.

Jordan Tomkinson added a comment -

Git-daemon is now running and activated on boot.

Dan Marsden added a comment -

well done to Jordan and Francois getting this up and running! - I do like the breadcrumbs in my version better, but understand the reasons of not wanting to alter gitweb too much to make it difficult to upgrade in future!

well done!

Jordan - any chance of fixing the arrow gfx used in the drop down menus? - I'm guessing it's hardcoded to a relative path in the css on download.moodle.org ?

Dan Poltawski added a comment -

Cool, I've now switched my git remotes to it and it's definitely faster from here.

FWIW, here is a quick one-liner for people to change their git configs:

find . -path '*/.git/config' | xargs perl -piwhoops1 -e 's{git://git.catalyst.net.nz/moodle-r2.git}{git://git.moodle.org/moodle.git}'
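
A variant of the same idea using find and sed, for anyone without perl at hand. It is a sketch rather than a drop-in replacement: the function name is made up, it keeps a `.bak` copy of each edited config, and it uses `-exec ... {} +` instead of xargs so that paths containing spaces are handled.

```shell
#!/bin/sh
# Hypothetical alternative to the perl one-liner above.
# Rewrites the Catalyst remote URL to the git.moodle.org one in every
# .git/config found below START_DIR, leaving a config.bak backup.

OLD_URL='git://git.catalyst.net.nz/moodle-r2.git'
NEW_URL='git://git.moodle.org/moodle.git'

# switch_remotes [START_DIR]
switch_remotes() {
    start=${1:-.}
    # "|" is the sed delimiter because the URLs contain "/".
    # The dots in OLD_URL act as regex wildcards here, which is harmless
    # for this purpose but not an exact literal match.
    find "$start" -path '*/.git/config' -type f \
        -exec sed -i.bak "s|$OLD_URL|$NEW_URL|g" {} +
}

# Example (commented out): rewrite every checkout below the current directory.
#   switch_remotes .
```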

Dan Poltawski added a comment -

Yay, so much faster than you poor guys in NZ -

git clone git://git.moodle.org/moodle.git
4m1.939s

git clone git://git.catalyst.net.nz/moodle-r2.git
13m16.854s

Jay Knight added a comment -

What are the chances of getting a git mirror for /contrib?

Petr Škoda (skodak) added a comment -

The problem is that each contrib developer may want to have their own git repo, and I suppose that git is not designed to give access to subdirectories. I think it would be a big mess if we stored all contribs in one huge git repo. Each plugin should be contained in one directory only, which makes it easy to check out as a git submodule.

I have started to use github for my contrib projects and it works very well for me. I am planning to send only snapshots to the contrib cvs in case somebody does not want to use git yet.

Petr Skoda

