Mar 4 2016

zack added a subtask for T337: specify a manifest format for documenting archived software: T335: specify the URI scheme swh:... to point to software heritage objects.
Mar 4 2016, 1:05 PM · General
zack added a parent task for T335: specify the URI scheme swh:... to point to software heritage objects: T337: specify a manifest format for documenting archived software.
Mar 4 2016, 1:05 PM · General
zack created T337: specify a manifest format for documenting archived software.
Mar 4 2016, 1:05 PM · General
zack renamed T335: specify the URI scheme swh:... to point to software heritage objects from speicify the URI scheme swh:... to point to software heritage objects to specify the URI scheme swh:... to point to software heritage objects.
Mar 4 2016, 12:59 PM · General
zack renamed T335: specify the URI scheme swh:... to point to software heritage objects from URI scheme swh:... to point to software heritage objects to speicify the URI scheme swh:... to point to software heritage objects.
Mar 4 2016, 12:59 PM · General
zack created T336: "save code now".
Mar 4 2016, 12:57 PM · General
zack added a comment to T335: specify the URI scheme swh:... to point to software heritage objects.

As a first approximation, the URI scheme might be something like:

swh:VERSION:OBJECT_KIND:OBJECT_ID

with more specific instances like:

  • swh:1:content:SHA1 (for blobs)
  • swh:1:revision:SHA1 (commits)
  • swh:1:directory:SHA1 (directory trees)
  • swh:1:release:SHA1 (tags)
  • etc.
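
A minimal sketch of how an identifier in this proposed form could be split and checked. It assumes exactly four colon-separated fields, a 40-character lowercase hexadecimal SHA1 as the object ID, and only the four object kinds listed above (the "etc." is left open). The names `SwhUri`, `parse_swh_uri` and `KNOWN_KINDS` are hypothetical and not part of any existing module.

```python
import re
from typing import NamedTuple

# Object kinds named in the comment above; the list is open-ended ("etc."),
# so this set is an assumption made for the sketch.
KNOWN_KINDS = {"content", "revision", "directory", "release"}

# Assumption: OBJECT_ID is a 40-character lowercase hexadecimal SHA1 digest.
SHA1_RE = re.compile(r"[0-9a-f]{40}")
VERSION_RE = re.compile(r"[0-9]+")


class SwhUri(NamedTuple):
    version: int
    kind: str
    object_id: str


def parse_swh_uri(uri: str) -> SwhUri:
    """Split swh:VERSION:OBJECT_KIND:OBJECT_ID into its three fields."""
    parts = uri.split(":")
    if len(parts) != 4 or parts[0] != "swh":
        raise ValueError(f"not a swh URI: {uri!r}")
    _, version, kind, object_id = parts
    if not VERSION_RE.fullmatch(version):
        raise ValueError(f"bad version: {version!r}")
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown object kind: {kind!r}")
    if not SHA1_RE.fullmatch(object_id):
        raise ValueError(f"bad SHA1: {object_id!r}")
    return SwhUri(int(version), kind, object_id)
```

Keeping the version as its own field means a later revision of the scheme could change the hash or the set of kinds without making old identifiers ambiguous.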
Mar 4 2016, 12:53 PM · General
zack created T335: specify the URI scheme swh:... to point to software heritage objects.
Mar 4 2016, 12:48 PM · General

Feb 29 2016

ardumont closed T302: swh-loader-tar origin validities are the current time instead of the mirroring time as Resolved.
Feb 29 2016, 3:58 PM · Tarball loader
ardumont added a comment to T302: swh-loader-tar origin validities are the current time instead of the mirroring time.

In a transaction with swhstorage user on softwareheritage db:

Feb 29 2016, 3:54 PM · Tarball loader
zack created T329: hg / mercurial loader.
Feb 29 2016, 3:23 PM · Mercurial loader
zack closed T29: evaluate conffile/argparse Python module as Spite.

It looks like we've moved on from this and settled for what we have.
We can reconsider in the future if specific needs arise.

Feb 29 2016, 3:17 PM · Core & foundations
zack added projects to T192: analyze 4 loading failures for GNU tarballs and reimport them: Tarball loader, Developers.
Feb 29 2016, 3:15 PM · Tarball loader
zack created T328: svn / subversion loader.
Feb 29 2016, 3:13 PM · SVN Loader
zack assigned T302: swh-loader-tar origin validities are the current time instead of the mirroring time to ardumont.
Feb 29 2016, 3:11 PM · Tarball loader
olasd closed T61: loading: trigger to update occurrence table as Resolved.

Done during the big postgres 9.5 upgrade window.

Feb 29 2016, 1:58 PM · Storage manager
zack added a project to T288: Open /api/1/revision/origin/<origin_id>/[branch/<branch_name>][ts/<ts>]/log/: Web app.
Feb 29 2016, 1:53 PM · Web app
zack added a project to T326: swh-scheduler: puppetize event listener: Language-Puppet.
Feb 29 2016, 1:44 PM · Puppet recipes, Language-Puppet, Scheduling utilities
zack added a project to T327: swh-worker: puppetize daemon: Language-Puppet.
Feb 29 2016, 1:43 PM · Puppet recipes, Language-Puppet, Scheduling utilities

Feb 27 2016

ardumont added a comment to T321: Download antepedia's s3 contents not in sesi nor in swh.

worker01 is in charge

  • 10 jobs working concurrently
  • 1 job is downloading sequentially 1000 s3 files
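
A rough sketch of the layout described above: the keys are cut into batches of 1000, each batch is handled by one job that downloads its files one after the other, and up to 10 jobs run at the same time. Threads stand in for the actual worker setup, which is not shown here, and `download` is a hypothetical callable supplied by the caller. None of these names come from the loader itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def batches(keys: Iterable[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` keys."""
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_job(batch: List[str], download: Callable[[str], None]) -> int:
    """One job: download its files sequentially, return how many were done."""
    for key in batch:
        download(key)
    return len(batch)


def download_all(
    keys: Iterable[str],
    download: Callable[[str], None],
    concurrency: int = 10,
    batch_size: int = 1000,
) -> int:
    """Run up to `concurrency` jobs at once; return the total file count."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        counts = pool.map(lambda b: run_job(b, download), batches(keys, batch_size))
        return sum(counts)
```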
Feb 27 2016, 8:10 PM · Antelink loader
ardumont claimed T321: Download antepedia's s3 contents not in sesi nor in swh.
Feb 27 2016, 8:04 PM · Antelink loader
ardumont changed the status of T321: Download antepedia's s3 contents not in sesi nor in swh from Open to Work in Progress.
Feb 27 2016, 8:04 PM · Antelink loader
zack added a comment to T312: Gitorious import: ingest repositories.

Here is the complete list of URLs that can be used to "git clone" (via HTTPS) all the repositories available from the Gitorious valhalla:

.

Feb 27 2016, 3:30 PM · Archive coverage, Restricted Project, Origin-Gitorious, Format-Git

Feb 26 2016

ardumont added a comment to T302: swh-loader-tar origin validities are the current time instead of the mirroring time.

There are also duplicates that need to be removed.
There should only be one origin_visit.
So the 187 number here should be 1.

Feb 26 2016, 10:55 AM · Tarball loader

Feb 25 2016

olasd created T327: swh-worker: puppetize daemon.
Feb 25 2016, 9:43 PM · Puppet recipes, Language-Puppet, Scheduling utilities
olasd created T326: swh-scheduler: puppetize event listener.
Feb 25 2016, 9:42 PM · Puppet recipes, Language-Puppet, Scheduling utilities
olasd created T325: swh-scheduler: puppetize task runner.
Feb 25 2016, 9:41 PM · Puppet recipes, Language-Puppet, Scheduling utilities
olasd triaged T315: swh-scheduler: add command-line interface as Normal priority.
Feb 25 2016, 9:40 PM · Scheduling utilities
ardumont added a comment to T302: swh-loader-tar origin validities are the current time instead of the mirroring time.

The GNU URLs are of the form 'rsync://'; that is how to get hold of the origins to update.

Feb 25 2016, 1:48 PM · Tarball loader
ardumont added a comment to T302: swh-loader-tar origin validities are the current time instead of the mirroring time.

swh-loader-tar should use the time of mirroring instead of the current time to create the occurrences

Feb 25 2016, 1:43 PM · Tarball loader
ardumont closed T323: Check loader-dir's storage compatibility as Resolved.
Feb 25 2016, 1:37 PM · Directory loader
ardumont closed T324: Check loader-tar's storage compatibility as Resolved.
Feb 25 2016, 1:37 PM · Tarball loader

Feb 24 2016

ardumont claimed T324: Check loader-tar's storage compatibility.
Feb 24 2016, 5:23 PM · Tarball loader
ardumont claimed T323: Check loader-dir's storage compatibility.
Feb 24 2016, 5:22 PM · Directory loader
ardumont updated the task description for T323: Check loader-dir's storage compatibility.
Feb 24 2016, 5:02 PM · Directory loader
ardumont created T324: Check loader-tar's storage compatibility.
Feb 24 2016, 4:06 PM · Tarball loader
ardumont created T323: Check loader-dir's storage compatibility.
Feb 24 2016, 4:06 PM · Directory loader
Herald added a project to T302: swh-loader-tar origin validities are the current time instead of the mirroring time: Developers.

Hint: Use the TIMESTAMP file which holds the date of the last rsync.

Feb 24 2016, 4:02 PM · Tarball loader

Feb 23 2016

ardumont closed T317: Inject sesi files hashes in antelink db, a subtask of T322: Inject antepedia contents (backuped in sesi and not in swh) into swh, as Resolved.
Feb 23 2016, 7:14 PM · Antelink loader
ardumont closed T317: Inject sesi files hashes in antelink db as Resolved.
Feb 23 2016, 7:14 PM · Antelink loader
ardumont closed T317: Inject sesi files hashes in antelink db, a subtask of T321: Download antepedia's s3 contents not in sesi nor in swh, as Resolved.
Feb 23 2016, 7:14 PM · Antelink loader
ardumont updated the task description for T317: Inject sesi files hashes in antelink db.
Feb 23 2016, 6:36 PM · Antelink loader
ardumont closed T316: List and compute hashes of actual sesi files as Resolved.
Feb 23 2016, 6:35 PM · Antelink loader
ardumont closed T316: List and compute hashes of actual sesi files, a subtask of T317: Inject sesi files hashes in antelink db, as Resolved.
Feb 23 2016, 6:35 PM · Antelink loader
olasd added a comment to T315: swh-scheduler: add command-line interface.

We now have the beginning of an implementation in rDSCH v0.0.4.

Feb 23 2016, 5:52 PM · Scheduling utilities
ardumont merged task T319: S3 content files downloader and injection in swh into T308: retrieve content from s3 and store it in SWH storage.
Feb 23 2016, 4:12 PM · Antelink loader
ardumont merged T319: S3 content files downloader and injection in swh into T308: retrieve content from s3 and store it in SWH storage.
Feb 23 2016, 4:12 PM · Storage manager, Antelink loader
ardumont added a parent task for T320: sesi content downloader and injection in swh worker: T322: Inject antepedia contents (backuped in sesi and not in swh) into swh.
Feb 23 2016, 4:10 PM · Antelink loader
ardumont added a parent task for T317: Inject sesi files hashes in antelink db: T322: Inject antepedia contents (backuped in sesi and not in swh) into swh.
Feb 23 2016, 4:10 PM · Antelink loader
ardumont added subtasks for T322: Inject antepedia contents (backuped in sesi and not in swh) into swh: T317: Inject sesi files hashes in antelink db, T320: sesi content downloader and injection in swh worker.
Feb 23 2016, 4:10 PM · Antelink loader
ardumont created T322: Inject antepedia contents (backuped in sesi and not in swh) into swh.
Feb 23 2016, 4:10 PM · Antelink loader
ardumont added subtasks for T321: Download antepedia's s3 contents not in sesi nor in swh: T317: Inject sesi files hashes in antelink db, T319: S3 content files downloader and injection in swh.
Feb 23 2016, 4:09 PM · Antelink loader
ardumont added a parent task for T317: Inject sesi files hashes in antelink db: T321: Download antepedia's s3 contents not in sesi nor in swh.
Feb 23 2016, 4:09 PM · Antelink loader
ardumont created T321: Download antepedia's s3 contents not in sesi nor in swh.
Feb 23 2016, 4:08 PM · Antelink loader
ardumont closed T320: sesi content downloader and injection in swh worker as Resolved.
Feb 23 2016, 4:07 PM · Antelink loader
ardumont renamed T319: S3 content files downloader and injection in swh from S3 content files downloader code to S3 content files downloader and injection in swh.
Feb 23 2016, 4:07 PM · Antelink loader
ardumont created T320: sesi content downloader and injection in swh worker.
Feb 23 2016, 4:06 PM · Antelink loader
ardumont removed a subtask for T319: S3 content files downloader and injection in swh: T317: Inject sesi files hashes in antelink db.
Feb 23 2016, 4:05 PM · Antelink loader
ardumont removed a parent task for T317: Inject sesi files hashes in antelink db: T319: S3 content files downloader and injection in swh.
Feb 23 2016, 4:05 PM · Antelink loader
ardumont renamed T319: S3 content files downloader and injection in swh from S3 content files downloader to S3 content files downloader code.
Feb 23 2016, 4:05 PM · Antelink loader
ardumont added a parent task for T316: List and compute hashes of actual sesi files: T317: Inject sesi files hashes in antelink db.
Feb 23 2016, 4:04 PM · Antelink loader
ardumont added a subtask for T317: Inject sesi files hashes in antelink db: T316: List and compute hashes of actual sesi files.
Feb 23 2016, 4:04 PM · Antelink loader
ardumont added a subtask for T319: S3 content files downloader and injection in swh: T317: Inject sesi files hashes in antelink db.
Feb 23 2016, 4:04 PM · Antelink loader
ardumont added a parent task for T317: Inject sesi files hashes in antelink db: T319: S3 content files downloader and injection in swh.
Feb 23 2016, 4:04 PM · Antelink loader
ardumont created T319: S3 content files downloader and injection in swh.
Feb 23 2016, 4:04 PM · Antelink loader
ardumont created T317: Inject sesi files hashes in antelink db.
Feb 23 2016, 4:02 PM · Antelink loader
ardumont created T316: List and compute hashes of actual sesi files.
Feb 23 2016, 3:58 PM · Antelink loader
zack added a comment to T52: swh-cron: manifest-based scheduler for recurring tasks.
In T52#3925, @olasd wrote:

This is now deployed on moma. Further automation via puppet still needs to be done.

Feb 23 2016, 2:46 PM
olasd added a project to T315: swh-scheduler: add command-line interface: Developers.
Feb 23 2016, 2:46 PM · Scheduling utilities
olasd closed T119: snapshot.debian.org producer as Resolved.

The initial import of the snapshot.debian.org data was done in December.

Feb 23 2016, 1:39 PM · Origin-Debian, Debian loader
olasd renamed T313: Retrieve fork information for github repositories in swh.lister.github from Retrieve fork information for github repositories in swh.loader.github to Retrieve fork information for github repositories in swh.lister.github.
Feb 23 2016, 1:38 PM · GitHub lister
olasd added a project to T313: Retrieve fork information for github repositories in swh.lister.github: GitHub lister.
Feb 23 2016, 1:38 PM · GitHub lister
olasd closed T52: swh-cron: manifest-based scheduler for recurring tasks as Resolved.
Feb 23 2016, 1:25 PM
olasd added a comment to T52: swh-cron: manifest-based scheduler for recurring tasks.

This is now deployed on moma. Further automation via puppet still needs to be done.

Feb 23 2016, 1:25 PM

Feb 22 2016

olasd added a comment to T313: Retrieve fork information for github repositories in swh.lister.github.

We could probably cheat by using the data from ghtorrent for the repositories that have already been listed, as this is a one-shot job and stale data is more interesting than no data. Worst case scenario: we base our clone on a repo that doesn't exist or doesn't contain the right data, and then we just fall back to regular cloning.

Feb 22 2016, 6:59 PM · GitHub lister
olasd created T313: Retrieve fork information for github repositories in swh.lister.github.
Feb 22 2016, 6:57 PM · GitHub lister
olasd closed T51: smart, all-in-one git cloner/loader/ (+ dealing with updates too), a subtask of T66: clone and load fork GitHub repositories, as Resolved.
Feb 22 2016, 6:31 PM · Restricted Project, General
olasd closed T51: smart, all-in-one git cloner/loader/ (+ dealing with updates too) as Resolved.

A new git updater, based on @ardumont's proof of concept, is now available in rDLDGIT.

Feb 22 2016, 6:31 PM · Git cloner, Git loader
olasd set the image for Developers to F151: fa-users-blue.png.
Feb 22 2016, 3:49 PM
zack renamed T312: Gitorious import: ingest repositories from ingest archived gitorious repositories to ingest gitorious repositories.
Feb 22 2016, 12:37 PM · Archive coverage, Restricted Project, Origin-Gitorious, Format-Git
zack created T312: Gitorious import: ingest repositories.
Feb 22 2016, 12:28 PM · Archive coverage, Restricted Project, Origin-Gitorious, Format-Git

Feb 19 2016

olasd changed the status of T52: swh-cron: manifest-based scheduler for recurring tasks from Open to Work in Progress.

An implementation of this is now available in rDSCH.

Feb 19 2016, 12:52 PM

Feb 17 2016

olasd added a comment to T75: Check integrity of directories, revisions, and releases.
In T75#3503, @olasd wrote:

It currently breaks on *completely* empty messages, but the patch seems fairly simple.

Feb 17 2016, 12:18 PM · Archive content, Restricted Project

Feb 11 2016

olasd added a comment to T52: swh-cron: manifest-based scheduler for recurring tasks.

We did some f2f thinking about this, concentrating on the "origin update" part of the mechanism. The shortcoming of our mechanism is that it's completely specific to updating our origins, and we can do something better...

Feb 11 2016, 9:47 PM

Feb 9 2016

olasd closed T68: support for git tags that point to arbitrary git objects, instead of revisions as Resolved.

This is now supported.

Feb 9 2016, 2:26 PM · Git loader
ardumont renamed T308: retrieve content from s3 and store it in SWH storage from swh-loader-antelink bootstrap - Retrieve content from s3 and store inside swh-storage to Retrieve content from s3 and store inside swh-storage.
Feb 9 2016, 12:29 PM · Storage manager, Antelink loader
ardumont updated the task description for T309: Delete duplicated antelink/antepedia content from s3.
Feb 9 2016, 12:21 PM · Antelink loader
ardumont updated the task description for T308: retrieve content from s3 and store it in SWH storage.
Feb 9 2016, 12:20 PM · Storage manager, Antelink loader
ardumont added a project to T308: retrieve content from s3 and store it in SWH storage: Storage manager.
Feb 9 2016, 12:18 PM · Storage manager, Antelink loader
ardumont added a project to T309: Delete duplicated antelink/antepedia content from s3: Storage manager.
Feb 9 2016, 12:18 PM · Antelink loader
ardumont added projects to T309: Delete duplicated antelink/antepedia content from s3: Antelink loader, Developers.
Feb 9 2016, 12:18 PM · Antelink loader
ardumont added projects to T308: retrieve content from s3 and store it in SWH storage: Antelink loader, Developers.
Feb 9 2016, 12:18 PM · Storage manager, Antelink loader

Feb 8 2016

zack closed T71: update database/storage size estimation using current content of SWH DB as Resolved.

No need to make estimates anymore, as we now know.
After GitHub + Debian snapshot + GNU we have the following:

Feb 8 2016, 4:30 PM · Restricted Project

Feb 4 2016

zack added a project to T240: content archiver: Storage manager.
Feb 4 2016, 3:00 PM · Storage manager
zack created T304: content integrity checker.
Feb 4 2016, 3:00 PM · Storage manager

Jan 29 2016

ardumont closed T299: Return the person's identifier along the person's data as Resolved by committing rDWAPPS4f57e6862526: Returns person's identifier on api + Hide person's emails in views endpoint.
Jan 29 2016, 5:43 PM · Storage manager, Web app
ardumont closed T300: Hide person's email on revision/release/person view as Resolved by committing rDWAPPS4f57e6862526: Returns person's identifier on api + Hide person's emails in views endpoint.
Jan 29 2016, 5:43 PM · Web app
ardumont claimed T299: Return the person's identifier along the person's data.
Jan 29 2016, 3:25 PM · Storage manager, Web app
ardumont closed T296: Try to decode the content's raw data and fail gracefully as Resolved by committing rDWAPPS1c2f64e21ff4: Try to decode the content's raw data and fail gracefully.
Jan 29 2016, 2:27 PM · Web app
ardumont closed T295: Update /browse/directory/<path>/ to show content when path resolves to a content as Resolved by committing rDWAPPS3ec60a52643b: Unify /directory api to Display content's raw data when path resolves to a file.
Jan 29 2016, 12:39 PM · Web app