index project licenses using GitHub's license detector
Open, Normal, Public


Background: we're currently indexing licenses at the individual blob level, using FOSSology's nomossa. The recent work on metadata indexing by @vlorentz shows, among other things, that it is feasible to quickly index the most recently visited snapshot of all our origins. Building on that, we can complement file-level license indexing with project-level license indexing. (We already do some of it, for projects that declare licenses in metadata files, but that leaves out a lot of projects.)

As a start we can use GitHub's licensee and run it on suitable revisions, associating its results with directories/commits/snapshots (whichever is appropriate), similarly to what we do for metadata.
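
As a rough illustration of that idea, here is a minimal Python sketch that runs licensee on a local checkout and collects the SPDX identifiers it reports. It is not the indexer itself. It assumes the `licensee` executable is on PATH, that `licensee detect --json` prints a report with a top-level "licenses" list whose entries carry an "spdx_id" field, and the function names `parse_spdx_ids` and `detect_project_licenses` are hypothetical.

```python
import json
import subprocess
from typing import List


def parse_spdx_ids(raw_json: str) -> List[str]:
    """Extract SPDX identifiers from a licensee JSON report.

    Assumption: the report has a top-level "licenses" list whose entries
    carry an "spdx_id" field; entries without one are skipped.
    """
    report = json.loads(raw_json)
    ids = []
    for entry in report.get("licenses") or []:
        spdx_id = entry.get("spdx_id")
        if spdx_id:
            ids.append(spdx_id)
    return ids


def detect_project_licenses(checkout_dir: str) -> List[str]:
    """Run licensee on a local checkout and return the SPDX ids it reports."""
    # Assumes the `licensee` executable is installed and on PATH.
    # The exit status is not checked here, since it may be non-zero
    # when no license is found; only the JSON on stdout is used.
    result = subprocess.run(
        ["licensee", "detect", "--json", checkout_dir],
        capture_output=True,
        text=True,
    )
    if not result.stdout.strip():
        return []
    return parse_spdx_ids(result.stdout)
```

The returned identifiers could then be attached to whichever object the checkout corresponds to (directory, commit or snapshot), in the same way metadata results are stored. Keeping the parsing separate from the subprocess call makes it possible to test it without licensee installed.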

Event Timeline

zack triaged this task as Normal priority. Feb 2 2019, 1:34 PM
zack created this task.

P353 is a quick-and-dirty wrapper for playing with licensee locally after a git clone.