npm: Add workaround for mangled package descriptions
Closed, Public

Authored by vlorentz on Jun 15 2022, 6:30 PM.

Details

Summary

Null bytes in JSON produced by indexers cause the indexer-storage to crash (T4277),
and this case seems to be the only current source of such crashes;
so this should fix the issue for now.

A future commit will sanitize all JSON documents before storage.
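
As an illustration of what sanitizing a JSON document before storage could look like, here is a minimal sketch. It is not the code from this diff or from the planned follow-up commit: the function name `strip_null_bytes` is hypothetical, and it assumes the storage backend rejects U+0000 inside strings (as PostgreSQL's `jsonb` type does) and that silently dropping the character is acceptable.

```python
from typing import Any


def strip_null_bytes(document: Any) -> Any:
    """Return a copy of a JSON-like document with U+0000 removed from all strings.

    Hypothetical helper for illustration only; not part of swh-indexer.
    """
    if isinstance(document, str):
        # Drop every null character from the string itself.
        return document.replace("\x00", "")
    if isinstance(document, list):
        # Sanitize each element, preserving order.
        return [strip_null_bytes(item) for item in document]
    if isinstance(document, dict):
        # Sanitize both keys and values; JSON object keys are strings too.
        return {
            strip_null_bytes(key): strip_null_bytes(value)
            for key, value in document.items()
        }
    # Numbers, booleans and None cannot contain null bytes.
    return document


# Example: a mangled npm "description" field containing a null byte.
package_json = {"name": "example", "description": "foo\x00bar"}
clean = strip_null_bytes(package_json)
```

One caveat of this approach: two keys that differ only by null bytes would collapse into a single key after sanitizing, so a real implementation may prefer to reject such documents instead.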

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 29885
Build 46717: Phabricator diff pipeline on Jenkins
Build 46716: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D7992 (id=28796)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 026e9fb86c138df9f62473a0a03cf1cf9fd1617b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

    npm: Add workaround for mangled package descriptions
    
    Null bytes in JSON produced by indexers cause the indexer-storage to crash,
    and this case seems to be the only current source of such crashes;
    so this should fix the issue for now.
    
    A future commit will sanitize all JSON documents before storage.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/255/ for more details.

Build has FAILED

Patch application report for D7992 (id=28802)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 92fc18eae4c239e0c5811060d706986ea54063f7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/256/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/256/console

Build has FAILED

Patch application report for D7992 (id=28803)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 4c1a829a6b3331c89a0af1c1234b58e271c25b52
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/257/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/257/console

Build is green

Patch application report for D7992 (id=28804)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 38bd1f45cda2599730821b07d446c9c37f0d8af1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/258/ for more details.

ardumont added a subscriber: ardumont.

lgtm, one remark inline.

swh/indexer/metadata_dictionary/npm.py, line 183

you're missing that one in your docstring sample tests.

This revision is now accepted and ready to land. Jun 16 2022, 10:31 AM

Build is green

Patch application report for D7992 (id=28808)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 62db0cb2086be7a4c6c48f6488fb425485e56093
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/259/ for more details.