
swh-loader-tar generates dangling releases
Closed, Migrated


The releases generated by swh-loader-tar aren't referenced by any occurrences: we generate occurrences pointing at revision objects, but not at releases.


  • Stop generating dangling releases and keep the occurrences pointing at revisions (source code adaptation)
  • Remove the dangling GNU releases (generated by the tarball loader when ingesting GNU tarballs)
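
To make "dangling" concrete: a release is dangling here when no occurrence targets it. The following is a minimal sketch of that check using Python's sqlite3 on a toy database. The two-table layout, the column names (`target`, `target_type`, `branch`) and the sample ids are assumptions made for illustration, not the actual Software Heritage schema.

```python
import sqlite3

# Toy in-memory database. Table layout and column names are assumptions
# for illustration only, not the real Software Heritage schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE release (id TEXT PRIMARY KEY, target TEXT, target_type TEXT);
    CREATE TABLE occurrence (origin INTEGER, branch TEXT,
                             target TEXT, target_type TEXT);

    -- Two releases, each pointing at a revision.
    INSERT INTO release VALUES ('rel1', 'rev1', 'revision');
    INSERT INTO release VALUES ('rel2', 'rev2', 'revision');

    -- The first occurrence points at a revision (what the tarball loader
    -- does), so 'rel1' is referenced by nothing. The second occurrence
    -- points at 'rel2' directly.
    INSERT INTO occurrence VALUES (1, 'foo-1.0.tar.gz', 'rev1', 'revision');
    INSERT INTO occurrence VALUES (2, 'refs/tags/v2', 'rel2', 'release');
""")


def dangling_releases(conn):
    """Return ids of releases that no occurrence targets."""
    rows = conn.execute("""
        SELECT r.id
        FROM release r
        WHERE NOT EXISTS (
            SELECT 1 FROM occurrence occ
            WHERE occ.target_type = 'release' AND occ.target = r.id
        )
        ORDER BY r.id
    """).fetchall()
    return [row[0] for row in rows]


print(dangling_releases(conn))  # ['rel1']
```

This sketch only looks at occurrences, as the task description does. It does not consider releases referenced by other objects.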

Event Timeline

zack moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board. Feb 16 2017, 9:14 AM

After discussion, we will stop generating dangling releases and keep the occurrences pointing at revisions.

Yep. Of course we will also need to remove all dangling releases that have been generated thus far.

ardumont updated the task description.

Actions taken to clean up the dangling releases:

  1. Identifying the dangling releases' author ('swh-robot'):
$ select id from person where name='Software Heritage' and fullname='Software Heritage <>' and email='';
> 3661419
  2. All origins coming from gnu are of type 'ftp':
$ select * from origin where type='ftp';
   id    | type |   url    | lister | project
 4423668 | ftp  | rsync:// |        |
 4423671 | ftp  | rsync:// |        |
 4423974 | ftp  | rsync:// |        |

A manual check confirms that these are indeed only gnu origins (there aren't that many).

Every dangling release that was created targets the same revision as the occurrence that was created at the same time:

$ select count( from occurrence occ inner join origin ori on inner join release r on ( and r.target_type='revision') where ori.type='ftp' and;
  3. Thus the deletion step:
$ delete from release where id in (select from occurrence occ inner join origin ori on inner join release r on ( and r.target_type='revision') where ori.type='ftp' and;
  4. Checking for releases with the 'swh-robot' author no longer shows any releases:
softwareheritage=> select count(*) from release r where and r.synthetic;
(1 row)
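
The cleanup steps above (identify the author, count the candidates, delete, verify) can be sketched end to end on a toy database. This is a rough illustration with Python's sqlite3, not the session that was actually run. The join conditions are not legible in the queries as recorded above, so the ones used here (`occ.origin = ori.id`, `occ.target = r.target`, `r.author = ?`) are assumptions, as are the simplified table layouts, the example.org URLs and all sample rows.

```python
import sqlite3

# Toy in-memory database. Schemas, join columns and rows are assumptions
# for illustration only, not the real Software Heritage database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                         fullname TEXT, email TEXT);
    CREATE TABLE origin (id INTEGER PRIMARY KEY, type TEXT, url TEXT);
    CREATE TABLE occurrence (origin INTEGER, target TEXT, target_type TEXT);
    CREATE TABLE release (id TEXT PRIMARY KEY, target TEXT, target_type TEXT,
                          author INTEGER, synthetic INTEGER);

    INSERT INTO person VALUES (1, 'Software Heritage', 'Software Heritage <>', '');
    INSERT INTO person VALUES (2, 'Jane', 'Jane <jane@example.org>', 'jane@example.org');
    INSERT INTO origin VALUES (10, 'ftp', 'rsync://example.org/gnu/foo');
    INSERT INTO origin VALUES (20, 'git', 'https://example.org/bar.git');
    INSERT INTO occurrence VALUES (10, 'rev-a', 'revision');
    INSERT INTO occurrence VALUES (20, 'rev-b', 'revision');
    -- 'rel-a' mimics a synthetic release made by the tarball loader,
    -- 'rel-b' mimics a genuine upstream release that must survive.
    INSERT INTO release VALUES ('rel-a', 'rev-a', 'revision', 1, 1);
    INSERT INTO release VALUES ('rel-b', 'rev-b', 'revision', 2, 0);
""")

# Step 1: identify the author used for the synthetic releases.
robot_id = conn.execute(
    "SELECT id FROM person WHERE name = ? AND fullname = ? AND email = ?",
    ("Software Heritage", "Software Heritage <>", ""),
).fetchone()[0]

# Releases by that author targeting the same revision as an occurrence
# of an 'ftp' origin (assumed join conditions, see the note above).
CANDIDATES = """
    SELECT r.id
    FROM occurrence occ
    INNER JOIN origin ori ON occ.origin = ori.id
    INNER JOIN release r
        ON (occ.target = r.target AND r.target_type = 'revision')
    WHERE ori.type = 'ftp' AND r.author = ?
"""

# Step 2: count the candidates before touching anything.
to_delete = conn.execute(
    f"SELECT count(*) FROM ({CANDIDATES})", (robot_id,)
).fetchone()[0]

# Step 3: delete them inside a transaction.
with conn:
    conn.execute(f"DELETE FROM release WHERE id IN ({CANDIDATES})", (robot_id,))

# Step 4: verify that no synthetic release by that author remains.
remaining = conn.execute(
    "SELECT count(*) FROM release r WHERE r.author = ? AND r.synthetic",
    (robot_id,),
).fetchone()[0]

print(to_delete, remaining)  # 1 0
```

Reusing the same candidate query for the count and the delete keeps the two steps in agreement, which is the point of counting first.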