
Google Code Git import: Examine ingestion logs for errors and list them if any
Closed, Resolved, Public

Description

When the ingestion is done, immediately retrieve the errors logged in softwareheritage-log between the start and end dates of the ingestion and reference them (e.g. as a paste referenced from that thread, or directly as a comment here).

The errors should be addressable per git repository.

Event Timeline

olasd renamed this task from "Reference errors after ingestion" to "Google Code Git import: Reference errors after ingestion". Feb 10 2017, 2:38 PM
ardumont renamed this task from "Google Code Git import: Reference errors after ingestion" to "Google Code Git import: Examine ingestion logs for errors and list them if any". Feb 10 2017, 3:18 PM

After much learning about how to read and extract logs from our Kibana instance, here is the error distribution.

"googlecode": {
  "total": 2810,
  "errors": {
    "OSError(28, 'No space left on device')": 1324,
    "NotGitRepository('No git repository was found at /": 804,
    "ValueError('Failed to uncompress archive /srv/stor": 275,
    "IntegrityError('duplicate key value violates uniqu": 152,
    "FileNotFoundError(2, \"No such file or directory: '": 126,
    "FileNotFoundError(2, \"No usable temporary director": 94,
    "StorageAPIError(ConnectionError(ProtocolError('Con": 35
  }
}
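A summary of this shape can be rebuilt from raw error messages with a small script. The following is a minimal sketch, not the tooling actually used for this task: it assumes the messages have already been exported from the logs as plain strings, and it assumes the keys above are message prefixes cut at 50 characters (the longest keys are exactly that long). The function name `summarize_errors` and the sample messages are hypothetical.

```python
import json
from collections import Counter

# Assumption: the keys in the summary look truncated at 50 characters.
PREFIX_LEN = 50


def summarize_errors(messages, prefix_len=PREFIX_LEN):
    """Group error messages by their first `prefix_len` characters."""
    counts = Counter(message[:prefix_len] for message in messages)
    return {
        "total": sum(counts.values()),
        # most_common() orders the buckets from most to least frequent.
        "errors": dict(counts.most_common()),
    }


if __name__ == "__main__":
    # Hypothetical sample messages, not real log entries.
    sample = [
        "OSError(28, 'No space left on device')",
        "OSError(28, 'No space left on device')",
        "NotGitRepository('No git repository was found at /tmp/example-a')",
        "NotGitRepository('No git repository was found at /tmp/example-b')",
        "NotGitRepository('No git repository was found at /tmp/example-c')",
    ]
    print(json.dumps({"googlecode": summarize_errors(sample)}, indent=2))
```

Truncating to a fixed prefix merges messages that differ only in a trailing path, which is what makes the counts per error type readable. As a sanity check on the figures above, the seven buckets do add up to the reported total of 2810.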

The errors come down to one of the following:

  • we sent something that was not a git repository
  • archives were not successfully uncompressed
  • integrity errors (which are expected for now)
  • disk space problems

Note that we have fewer errors than previously mentioned since I cover a shorter period (multiple schedulings took place).

source:

After rescheduling those origins (the ones we can do something about), here are the remaining errors.

"googlecode": {
  "errors": {
    "NotGitRepository('No git repository was found at /": 136
  },
  "total": 136
}

Lists of origins with integrity errors and of origins that are not git repositories have been computed and stored in uffizi:/srv/storage/space/lists/loader-git-disk/origin-with-errors.googlecode.tar.gz.