
ingest the Codeplex archive
Open, Normal, Public

Description

The former Codeplex forge is now archived. We should crawl the archive (one-shot) and ingest all of its content into the Software Heritage archive.

A full list of projects can be obtained by reading the sitemap file (this doesn't look great, but the Codeplex archive admins have guaranteed that it is a valid approach in this case). A copy of the map is also here:

The .zip files that can be downloaded for each project seem to be structured and to contain the git repo, discussions, issues, releases, etc. I haven't yet found a proper spec for the file format, but I'll investigate further.
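
Until a spec turns up, a quick way to explore the layout of a downloaded project .zip is to count the files under each top-level entry. The sketch below uses only the Python standard library; the function name `summarize_top_level` is made up for illustration, and nothing here assumes a particular directory layout inside the archive.

```python
import zipfile
from collections import Counter


def summarize_top_level(archive) -> Counter:
    """Count the files under each top-level entry of a zip archive.

    `archive` can be a filesystem path or a binary file-like object.
    The function name is hypothetical; it only reports what is in the
    archive and does not assume any Codeplex-specific structure.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # only count files, not directory entries
            top = info.filename.split("/", 1)[0]
            counts[top] += 1
    return counts
```

Running this over a handful of project archives should show whether the top-level entries are consistent enough to map onto the repo, discussions, issues and releases mentioned above.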

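For the project listing step, here is a minimal sketch of pulling URLs out of a sitemap, assuming the Codeplex sitemap follows the standard sitemaps.org XML schema (not verified here). The function name `extract_locations` and the `example.org` URLs in the sample are placeholders, not real Codeplex data.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemaps.org schema (assumed to apply here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locations(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document.

    Works for both <urlset> files and <sitemapindex> files, since both
    store their URLs in <loc> elements.
    """
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text
    ]


# Placeholder sample; the real sitemap content is not reproduced here.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/projects/alpha</loc></url>
  <url><loc>https://example.org/projects/beta</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_locations(SAMPLE):
        print(url)
```

If the top-level file turns out to be a sitemap index, the same function can be applied again to each child sitemap it lists.
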
Event Timeline

zack triaged this task as Normal priority. Apr 3 2019, 10:04 AM
zack created this task.