We want to be able to list all available packages on npm in order to load their content into the archive.
Description
Revisions and Commits
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T1378 Ingest npm into the Software Heritage archive (meta task)
Migrated | gitlab-migration | T1380 npm lister
Event Timeline
The npm registry is a CouchDB database located at https://replicate.npmjs.com.
The following endpoint lists all registered packages: https://replicate.npmjs.com/_all_docs?limit=100
We should be able to use the SWHIndexingHttpLister [1] with the recommended CouchDB pagination method [2].
[2] http://docs.couchdb.org/en/stable/ddocs/views/pagination.html#paging-alternate-method
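To illustrate the pagination method referenced in [2], here is a minimal standalone sketch in Python. It requests one row more than the page size and uses the extra row's key as the `startkey` of the next request. It does not use SWHIndexingHttpLister, and the names `fetch_rows` and `list_package_names` are hypothetical, chosen only for this example. It assumes the endpoint returns the standard CouchDB `_all_docs` response, with a `rows` array whose entries carry an `id`.

```python
import json
from typing import Callable, Iterator, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in the comment above.
REGISTRY_URL = "https://replicate.npmjs.com/_all_docs"


def fetch_rows(startkey: Optional[str], limit: int) -> List[dict]:
    """Fetch up to `limit` rows from _all_docs, starting at `startkey`.

    Hypothetical helper: assumes the standard CouchDB response shape.
    """
    params = {"limit": limit}
    if startkey is not None:
        # CouchDB expects the key as a JSON-encoded value.
        params["startkey"] = json.dumps(startkey)
    with urlopen(f"{REGISTRY_URL}?{urlencode(params)}") as response:
        return json.load(response)["rows"]


def list_package_names(
    page_size: int = 100,
    fetch: Callable[[Optional[str], int], List[dict]] = fetch_rows,
) -> Iterator[str]:
    """Yield every document id, one page at a time."""
    startkey = None
    while True:
        # Ask for one extra row: it is not emitted on this page, it
        # only tells us where the next page starts.
        rows = fetch(startkey, page_size + 1)
        for row in rows[:page_size]:
            yield row["id"]
        if len(rows) <= page_size:
            return  # no extra row, so this was the last page
        startkey = rows[page_size]["id"]
```

Passing `fetch` as a parameter keeps the pagination logic testable without network access. A real lister would also need error handling and rate limiting, which this sketch leaves out.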