
browse/utils: Robustify content encoding detection

This commit no longer exists in the repository. It may have been part of a branch which was deleted.

Description

browse/utils: Robustify content encoding detection

When attempting to re-encode non-UTF-8 textual content, first use chardet
to detect the encoding, and use the detected encoding only if the
detection confidence is very high.
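
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `decode_with_detection`, the 0.9 confidence threshold, and the fallback to lossy UTF-8 decoding are all assumptions made for the example. The detector is passed in as a callable so the sketch stays independent of any one library; it is expected to return a dict with `encoding` and `confidence` keys, which is the shape returned by `chardet.detect`.

```python
from typing import Callable


def decode_with_detection(
    raw: bytes,
    detect: Callable[[bytes], dict],
    min_confidence: float = 0.9,  # assumed threshold, not taken from the commit
) -> str:
    """Decode raw bytes, trusting the detector only when it is confident."""
    # Content that is already valid UTF-8 needs no detection at all.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Ask the detector for its best guess and how sure it is.
    result = detect(raw)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # Use the detected encoding only when the confidence is high enough.
    if encoding and confidence >= min_confidence:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or an encoding unknown to Python: fall through.
            pass

    # Hypothetical fallback: lossy UTF-8 decoding with replacement characters.
    return raw.decode("utf-8", errors="replace")
```

With chardet installed, a call would look like `decode_with_detection(raw, chardet.detect)`. Keeping the threshold as a parameter makes it easy to tune how much a low-confidence guess is trusted.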

Previously, some encodings such as SHIFT_JIS (used for Japanese) were
not correctly detected, so content was badly rendered in the
browse Web UI.

Details

Provenance
anlambert authored on Feb 17 2022, 4:57 PM
anlambert pushed on Feb 18 2022, 11:06 AM
Differential Revision
D7197: browse/utils: Robustify content encoding detection
Build Status
Buildable 26978
Build 42182: test-and-build (Jenkins)
