
origin_head: Do not fetch complete snapshots for non-FTP visits
Closed, Public

Authored by vlorentz on Mon, Nov 21, 1:46 PM.

Details

Summary

Some snapshots are very large. Rather than fetching them entirely only to
discard most of the branches, this commit first fetches only a subset of the
branches (which both checks that the snapshot exists and keeps the number of
queries low for small snapshots), then requests specific branches as needed
(usually only two).
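
The strategy described above can be sketched roughly as follows. This is only
an illustration, not the code from this diff: `FakeStorage`, `get_branches`,
`get_branch`, `find_head`, the page size of 100 and the candidate branch names
are all hypothetical stand-ins and do not reflect the real storage API.

```python
from typing import Dict, Optional


class FakeStorage:
    """Hypothetical in-memory stand-in for a storage backend (assumption)."""

    def __init__(self, branches: Dict[bytes, bytes]):
        # Branches are kept sorted by name, as a paginated listing would be.
        self._branches = dict(sorted(branches.items()))
        self.queries = 0  # counts round-trips, for illustration only

    def get_branches(self, first: int) -> Dict[bytes, bytes]:
        """Return at most `first` branches in name order (one query)."""
        self.queries += 1
        return dict(list(self._branches.items())[:first])

    def get_branch(self, name: bytes) -> Optional[bytes]:
        """Return the target of a single branch, or None (one query)."""
        self.queries += 1
        return self._branches.get(name)


# Candidate names are an assumption made for this sketch.
CANDIDATES = (b"HEAD", b"refs/heads/main")


def find_head(storage: FakeStorage, page_size: int = 100) -> Optional[bytes]:
    # One cheap query: tells us whether the snapshot has any branches, and
    # for small snapshots it already contains everything we need.
    partial = storage.get_branches(first=page_size)
    if not partial:
        return None
    # Fewer results than requested means we have seen every branch.
    complete = len(partial) < page_size
    for name in CANDIDATES:
        if name in partial:
            return partial[name]
        if not complete:
            # Large snapshot: ask for this specific branch only.
            target = storage.get_branch(name)
            if target is not None:
                return target
    return None
```

With this sketch, a small snapshot is resolved with a single query, while a
large one costs the initial listing plus one query per candidate branch that
was not in the first page.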

This should improve performance and reduce timeout exceptions from the
storage.

Diff Detail

Repository
rDCIDX Metadata indexer
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8861 (id=31937)

Rebasing onto b7f04dd9d4...

Current branch diff-target is up to date.
Changes applied before test
commit 03b4bb002c87e1b124edfb5e12ad09f04f3d99dd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 13:38:26 2022 +0100

    origin_head: Do not fetch complete snapshots for non-FTP visits
    
    Some snapshots are really large. Rather than fetching them entirely only to
    discard most of the branches, this commit only fetches some branches (to
    check existence + to use less queries on small snapshots), then requests
    specific branches as needed (usually only 2).
    
    This should improve performance and reduce timeout exceptions from the
    storage.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/525/ for more details.

This revision is now accepted and ready to land.
Mon, Nov 21, 2:47 PM