npm: Add workaround for mangled package descriptions
Closed, Public

Authored by vlorentz on Jun 15 2022, 6:30 PM.

Details

Summary

Null bytes in JSON produced by indexers cause the indexer-storage to crash (T4277),
and this case seems to be the only current source of such crashes;
so this should fix the issue for now.

A future commit will sanitize all JSON documents before storage.
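For illustration only, here is a minimal sketch of what sanitizing a JSON document before storage could look like. It is not the code from this diff or from the planned follow-up commit: the function name `strip_null_bytes` is hypothetical, and dropping NUL characters outright (rather than trying to re-decode the mangled text) is an assumption made to keep the example short.

```python
from typing import Any


def strip_null_bytes(document: Any) -> Any:
    """Return a copy of a JSON-like value with NUL characters removed.

    Hypothetical helper: walks dicts and lists recursively and removes
    "\x00" from every string (including dict keys). Other values are
    returned unchanged.
    """
    if isinstance(document, str):
        return document.replace("\x00", "")
    if isinstance(document, list):
        return [strip_null_bytes(item) for item in document]
    if isinstance(document, dict):
        return {
            strip_null_bytes(key): strip_null_bytes(value)
            for key, value in document.items()
        }
    # Numbers, booleans and None cannot contain null bytes.
    return document


# Example with a mangled description, as might come from a package.json
metadata = {"name": "foo", "description": "f\x00o\x00o\x00 \x00b\x00a\x00r\x00"}
clean = strip_null_bytes(metadata)
```

A generic pass like this would cover every mapping at once, at the cost of silently altering text that a format-specific workaround might be able to recover.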

Diff Detail

Repository
rDCIDX Metadata indexer
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D7992 (id=28796)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 026e9fb86c138df9f62473a0a03cf1cf9fd1617b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

    npm: Add workaround for mangled package descriptions
    
    Null bytes in JSON produced by indexers cause the indexer-storage to crash,
    and this case seems to be the only current source of such crashes;
    so this should fix the issue for now.
    
    A future commit will sanitize all JSON documents before storage.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/255/ for more details.

Build has FAILED

Patch application report for D7992 (id=28802)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 92fc18eae4c239e0c5811060d706986ea54063f7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/256/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/256/console

Build has FAILED

Patch application report for D7992 (id=28803)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 4c1a829a6b3331c89a0af1c1234b58e271c25b52
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/257/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/257/console

Build is green

Patch application report for D7992 (id=28804)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 38bd1f45cda2599730821b07d446c9c37f0d8af1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/258/ for more details.

ardumont added a subscriber: ardumont.

lgtm, one remark inline.

swh/indexer/metadata_dictionary/npm.py, line 183

you're missing that one in your docstring sample tests.

This revision is now accepted and ready to land. Jun 16 2022, 10:31 AM

Build is green

Patch application report for D7992 (id=28808)

Rebasing onto 710467138a...

Current branch diff-target is up to date.
Changes applied before test
commit 62db0cb2086be7a4c6c48f6488fb425485e56093
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 15 18:29:09 2022 +0200

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/259/ for more details.