Page MenuHomeSoftware Heritage

tarball: properly normalize perms for all extracted files
ClosedPublic

Authored by olasd on Apr 26 2021, 6:10 PM.

Details

Summary

The previous implementation would not normalize permissions for
files detected as executable. This would make the loader crash when a
file has bogus permissions such as 0o100100 (executable but not
readable).

This adds a test that all possible file permissions properly normalize
to either 0o100644 or 0o100755.

Addresses sentry issue: https://sentry.softwareheritage.org/share/issue/062198271b4d40feab41330f12ec4dc2/
(yes, I've seen T3244, but this looked like a low enough hanging fruit...)

Test Plan

new test added.

Diff Detail

Event Timeline

Build is green

Patch application report for D5614 (id=20024)

Rebasing onto df31524e35...

Current branch diff-target is up to date.
Changes applied before test
commit 7d42035a13a29513c250f7323f7ff17811c74014
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Apr 26 17:57:38 2021 +0200

    tarball: properly normalize perms for all extracted files
    
    The previous implementation would not normalize permissions for
    files detected as executable. This would make the loader crash when a
    file has bogus permissions such as 0o100100 (executable but not
    readable).
    
    This adds a test that all possible file permissions properly normalize
    to either 0o100644 or 0o100755.

See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/219/ for more details.

olasd requested review of this revision.Apr 26 2021, 6:11 PM

Why only 0o100644 and 0o100755? I don't think we should make the package loaders discard this information just because Git does.

Why only 0o100644 and 0o100755? I don't think we should make the package loaders discard this information just because Git does.

This is a (very valid) question for the v2 model, I'd say, not for this fix :-)

This revision is now accepted and ready to land.Apr 27 2021, 10:16 AM

@douardda I don't understand your comment. The v1 model explicitly allows types other than git-normalized ones: https://forge.softwareheritage.org/source/swh-model/browse/master/swh/model/model.py$594-595

If we want to remove perms normalization from the tarball loader, then we need to discuss that in a separate task. This fixes and introduces tests for the behavior that was intended by the original code, which was buggy.