Event Timeline
$ file emoji-zwj-sequences.txt
emoji-zwj-sequences.txt: UTF-8 Unicode text
$ ipython
In [1]: import base64
In [2]: integrity = "sha256-QhRN0THZ7uIzh2RldFJyfgdP0da0u5Az6GGLbIPfVWg="; base64.decodebytes(integrity.split("-")[1].encode()).hex()
Out[2]: '42144dd131d9eee2338764657452727e074fd1d6b4bb9033e8618b6c83df5568'
In [3]: integrity = "sha256-0s2mvy1nr2v1x0rr1fxlsv8ly1vyf9978rb4hwry5vnr678ls522"; base64.decodebytes(integrity.split(
   ...: "-")[1].encode()).hex()
Out[3]: 'd2cda6bf2d67af6bf5c74aebd5fc65b2ff25cb5bf27fdf7bf2b6f8870af2e6f9ebebbf25b39db6'
$ nix-store --dump emoji-zwj-sequences.txt | sha256sum
8e4da5a445465874d79bd980411320f71385927ff7d767e69ef4ecdf369bafc9  -
$ sha256sum emoji-zwj-sequences.txt
98ff05deef36f30bb16d92f1e470f277d412d8f047c7b4b47943bfcbcf0b3097  emoji-zwj-sequences.txt
$ nix-hash --type sha256 --to-base32 $(python3 -c 'import base64; print(base64.b64decode("QhRN0THZ7uIzh2RldFJyfgdP0da0u5Az6GGLbIPfVWg=").hex())')
0s2mvy1nr2v1x0rr1fxlsv8ly1vyf9978rb4hwry5vnr678ls522
Awesome, thx. That means that in that case, the outputHash (base32) and the integrity field (base64) match!
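For reference, the same base64-to-base32 conversion can be sketched in pure Python, without calling nix-hash. This is a minimal sketch that assumes Nix's base32 variant (alphabet without e, o, t, u, characters emitted starting from the most significant 5-bit group); the function names nix_base32 and sri_to_nix_base32 are made up for illustration and are not part of any existing tool.

```python
import base64

# Alphabet assumed for Nix's base32 variant: digits and lowercase
# letters, with e, o, t and u left out.
NIX_BASE32_CHARS = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode raw digest bytes in Nix-style base32 (hypothetical helper)."""
    length = (len(digest) * 8 - 1) // 5 + 1
    chars = []
    # Walk the 5-bit groups from the highest bit offset down to zero.
    for n in range(length - 1, -1, -1):
        i, j = divmod(n * 5, 8)
        value = digest[i] >> j
        if i + 1 < len(digest):
            # The group may straddle two bytes; pull in the next one.
            value |= digest[i + 1] << (8 - j)
        chars.append(NIX_BASE32_CHARS[value & 0x1F])
    return "".join(chars)


def sri_to_nix_base32(integrity: str) -> str:
    """Convert an 'algo-base64' integrity string to Nix base32 (hypothetical helper)."""
    _algo, b64_digest = integrity.split("-", 1)
    return nix_base32(base64.b64decode(b64_digest))


print(sri_to_nix_base32("sha256-QhRN0THZ7uIzh2RldFJyfgdP0da0u5Az6GGLbIPfVWg="))
```

With the integrity value from the manifest, this should print the same base32 string that nix-hash returned above.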
It now remains for me to understand how to check that checksum, though...
That still does not match ¯\_(ツ)_/¯:
$ nix-store --dump emoji-zwj-sequences.txt | sha256sum
8e4da5a445465874d79bd980411320f71385927ff7d767e69ef4ecdf369bafc9  -
$ nix-hash --type sha256 emoji-zwj-sequences.txt
8e4da5a445465874d79bd980411320f71385927ff7d767e69ef4ecdf369bafc9
$ ipython
...
In [2]: integrity = "sha256-QhRN0THZ7uIzh2RldFJyfgdP0da0u5Az6GGLbIPfVWg="; base64.decodebytes(integrity.split("-")[1].encode()).hex()
Out[2]: '42144dd131d9eee2338764657452727e074fd1d6b4bb9033e8618b6c83df5568'
I figured it out. Reading the source, I noticed that the file is somehow downloaded to share/unicode/emoji. So if I reproduce this FS layout before turning it into a NAR, I can reproduce the hash in the manifest:
$ mkdir -p foobar/share/unicode/emoji
$ wget https://www.unicode.org/Public/emoji/12.1/emoji-zwj-sequences.txt -O foobar/share/unicode/emoji/emoji-zwj-sequences.txt -q
$ nix-store --dump foobar | sha256sum
42144dd131d9eee2338764657452727e074fd1d6b4bb9033e8618b6c83df5568  -
$ python3
>>> import base64
>>> base64.b64encode(bytes.fromhex("42144dd131d9eee2338764657452727e074fd1d6b4bb9033e8618b6c83df5568"))
b'QhRN0THZ7uIzh2RldFJyfgdP0da0u5Az6GGLbIPfVWg='
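To illustrate why the layout matters, here is a rough Python sketch of a NAR-style hash. It assumes the NAR serialization as I understand it (each string is a 64-bit little-endian length, the bytes, then zero padding to a multiple of 8; directory entries sorted by name); it is not Nix's own code, and nar_string, nar_node and nar_sha256 are hypothetical names. The demo uses placeholder content rather than the real emoji-zwj-sequences.txt, so its hashes will not equal the ones in this thread; it only shows that the hash of the bare file and the hash of the wrapping directory tree differ.

```python
import hashlib
import os
import stat
import tempfile


def nar_string(data: bytes) -> bytes:
    """Length (64-bit little-endian), payload, zero padding to 8 bytes."""
    padding = (8 - len(data) % 8) % 8
    return len(data).to_bytes(8, "little") + data + b"\0" * padding


def nar_node(path: bytes) -> bytes:
    """Serialize one file system object, recursing into directories."""
    mode = os.lstat(path).st_mode
    parts = [nar_string(b"("), nar_string(b"type")]
    if stat.S_ISLNK(mode):
        parts += [nar_string(b"symlink"), nar_string(b"target"),
                  nar_string(os.readlink(path))]
    elif stat.S_ISDIR(mode):
        parts.append(nar_string(b"directory"))
        # Entry names are part of the archive, so the layout affects the hash.
        for name in sorted(os.listdir(path)):
            parts += [nar_string(b"entry"), nar_string(b"("),
                      nar_string(b"name"), nar_string(name),
                      nar_string(b"node"), nar_node(os.path.join(path, name)),
                      nar_string(b")")]
    elif stat.S_ISREG(mode):
        parts.append(nar_string(b"regular"))
        if mode & stat.S_IXUSR:
            parts += [nar_string(b"executable"), nar_string(b"")]
        parts.append(nar_string(b"contents"))
        with open(path, "rb") as f:
            parts.append(nar_string(f.read()))
    else:
        raise ValueError("unsupported file type: %r" % path)
    parts.append(nar_string(b")"))
    return b"".join(parts)


def nar_sha256(path: str) -> str:
    """Hex sha256 of the NAR-style serialization of path."""
    archive = nar_string(b"nix-archive-1") + nar_node(os.fsencode(path))
    return hashlib.sha256(archive).hexdigest()


# Demo with placeholder content (NOT the real Unicode file).
with tempfile.TemporaryDirectory() as tmp:
    flat = os.path.join(tmp, "emoji-zwj-sequences.txt")
    nested_root = os.path.join(tmp, "foobar")
    nested_dir = os.path.join(nested_root, "share", "unicode", "emoji")
    os.makedirs(nested_dir)
    for target in (flat, os.path.join(nested_dir, "emoji-zwj-sequences.txt")):
        with open(target, "wb") as f:
            f.write(b"placeholder content\n")
    flat_hash = nar_sha256(flat)           # like: nix-store --dump <file>
    nested_hash = nar_sha256(nested_root)  # like: nix-store --dump foobar

print("bare file:  ", flat_hash)
print("nested tree:", nested_hash)
```

The two printed hashes differ even though the file content is identical, which is the same effect seen above between dumping the bare file and dumping foobar.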
Nice catch! I did not realize the fs layout when reading this derivation.
So, in conclusion: as we don't have the FS layout in the nixpkgs manifest, we cannot do anything about this case.
The loader will simply fail with a "hash mismatch" on it, and the ingestion of the origin will fail.
So we must notify upstream about the missing information for those origins in the manifest, and ignore those origins from the listing in the meantime [1].
[1] T4608