
smart, all-in-one git cloner/loader/ (+ dealing with updates too)
Closed, Migrated

Description

Merge the current git cloner and git loader into a single component that retrieves from the remote git repository only the objects that are new since the last time we visited.

Event Timeline

zack raised the priority of this task from to Normal.
zack updated the task description.
zack added a project: Developers.

Started playing with dulwich's git smart protocol client.

  • The HTTP "smart" client doesn't know how to read data from the server, and therefore sends all the commit history at once. Furthermore, it's completely buggy with Python 3.
  • The git smart client seems to work well.

Trying an update on all of linux.git's refs makes the GitHub server hang up the connection. It seems that it doesn't like being asked for peeled refs, so we need to filter them out before asking for the missing refs.
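
A minimal sketch of that filtering step, for illustration only. It assumes the remote's advertised refs arrive as a mapping from ref name to object id (both bytes) and that peeled entries carry the `^{}` suffix git uses for them. The names `filter_peeled_refs`, `determine_wants` and `known_ids` are hypothetical and are not taken from the actual loader code.

```python
# Suffix git appends to the peeled (dereferenced) entry of an annotated tag.
PEELED_SUFFIX = b"^{}"


def filter_peeled_refs(remote_refs):
    """Return a copy of the ref mapping without the peeled entries."""
    return {
        name: object_id
        for name, object_id in remote_refs.items()
        if not name.endswith(PEELED_SUFFIX)
    }


def determine_wants(remote_refs, known_ids):
    """Return the object ids to request from the remote.

    Peeled refs are dropped first, then every id we already hold
    (known_ids, hypothetically looked up in our storage) is removed,
    so only the missing ref targets are asked for.
    """
    refs = filter_peeled_refs(remote_refs)
    return sorted(set(refs.values()) - set(known_ids))
```

With a one-argument callback API, `known_ids` could be bound beforehand with `functools.partial`. Whether a given client accepts a callback of exactly this shape is an assumption to check against its documentation.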

Related but not limited to:
58903e5 * origin/master origin/HEAD Open occurrence_get(origin_id) to retrieve latest occurrences per origin
bc23eb9 * sql/upgrades/043: add 042→043 upgrade script
d05afde * revision_log from multiple root revisions
3a40f00 * sql/upgrades/042: add 041→042 upgrade script
f54fd8d * Open release_get_by to retrieve a release by origin.
5dc4244 * revision_get_by: branch name filtering is optional
7e623c8 * sql/upgrades/040: add 040→041 upgrade script
7e2dcbc * Open directory_get to retrieve information on directory by id

ardumont renamed this task from smart, all-in-one git cloner/loader to smart, all-in-one git cloner/loader/ (+ dealing with updates too). Jan 20 2016, 7:01 PM
ardumont changed the task status from Open to Work in Progress.
ardumont claimed this task.

For information, the sample test_update.py has been adapted in swh-loader-git (https://forge.softwareheritage.org/diffusion/DLDG/browse/master/swh/loader/git/updater.py) to use swh-storage.

(commit introducing updater.py 5d5f3ea)

ardumont changed the task status from Work in Progress to Open. Jan 22 2016, 10:05 AM
ardumont removed ardumont as the assignee of this task.
olasd claimed this task.

A new git updater, based on @ardumont's proof of concept, is now available in rDLDGIT.

This updater has been wired to a new task, and the workers have been updated to accept it.

olasd changed the visibility from "All Users" to "Public (No Login Required)". May 13 2016, 5:05 PM