This is a long-standing and well-known issue, but I don't think a task has been opened for it yet.
When ingesting an origin, some nodes of the DAG may be missing, for various reasons:
* corrupted data (eg. a commit in the git history does not match its hash)
* directory must be found "somewhere else" (eg. SVN externals, T611)
* revisions must be found "somewhere else" (eg. Bazaar stacked branches)
* ingestion of a (potentially large) repo might stop/crash after having ingested only some of its objects, and the repository might have disappeared by the time we try again
Currently, what happens is:
* if the missing object is a git object, then we know its sha1_git, and it's just a dangling reference (though this will be an issue when we want to implement generation numbers, T1617)
* even in this (fortunate) case, other objects transitively referenced might remain completely unknown
* otherwise, objects referencing the missing object cannot even be represented in the SWH data model (nor, recursively, can any object that references them)
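
To illustrate the difference between the two cases above, here is a minimal sketch of a toy Merkle DAG. It is not the actual SWH identifier computation: `node_id` and the payloads are hypothetical, and the only assumption is that a node's identifier is a hash over its own content plus the identifiers of its children.

```python
import hashlib
from typing import List, Optional


def node_id(payload: bytes, child_ids: List[Optional[bytes]]) -> bytes:
    """Toy Merkle identifier: sha1 of the payload followed by every child id.

    Hypothetical helper for illustration only; the real identifier
    scheme is more involved.
    """
    if any(child is None for child in child_ids):
        # Without the child's id, the parent's id cannot be computed at all.
        raise ValueError("cannot compute id: a child id is unknown")
    h = hashlib.sha1(payload)
    for child in child_ids:
        h.update(child)
    return h.digest()


# Git-like case: the child object was never ingested, but the parent
# records the child's id, so the parent is still representable and the
# reference is merely dangling.
missing_child_id = bytes.fromhex("aa" * 20)
parent_id = node_id(b"commit metadata", [missing_child_id])

# Other case (eg. content that lives "somewhere else"): the child's id
# is unknown, so the parent cannot be represented.
try:
    node_id(b"directory listing", [None])
    representable = True
except ValueError:
    representable = False
```

In this toy model the failure also propagates upward: any node that references the unrepresentable parent has an unknown child id in turn, which is the recursive effect described in the last bullet.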