upload-based content search
Closed, Migrated

Description

Right now we "ask" users to compute the checksums themselves, in order to be able to check if we have some content.
We should also allow users to upload their own files (within reasonable limits…) and verify if we have them or not.

Event Timeline

ardumont claimed this task.
ardumont raised the priority of this task from to Normal.
ardumont updated the task description.
ardumont added projects: Staff, Developers.
ardumont added a subscriber: ardumont.
zack renamed this task from "Upload a file, computes its sha1 hash and sha256 hash and check its existence in swh's storage" to "upload-based content search". (Oct 28 2015, 6:58 PM)
zack updated the task description.

For now, the endpoint receives the uploaded file, computes its sha1, discards the content, and checks whether that sha1 exists in swh storage.
It then answers the query with the result.

Sample:

$ curl -X POST -F filename=@/path/to/file http://localhost:6543/api/1/uploadnsearch/
{
    "found": false,
    "sha1": "e95097ad2d607b4c89c1ce7ca1fef2a1e4450558"
}

(HTTP status code 200)
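
The flow described above (hash the upload, drop the content, look the sha1 up) can be sketched roughly as follows. This is only an illustration, not the actual swh-web-ui code: `sha1_of_stream`, `upload_and_search` and the `known_sha1s` set are hypothetical names, and the set merely stands in for a real lookup in swh storage.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=65536):
    """Hash a file-like object chunk by chunk, so the content is never kept whole."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def upload_and_search(stream, known_sha1s):
    """Build the same answer shape as the sample response.

    known_sha1s is a stand-in (assumption) for the swh storage lookup.
    """
    sha1 = sha1_of_stream(stream)
    return {"found": sha1 in known_sha1s, "sha1": sha1}


if __name__ == "__main__":
    # An upload whose sha1 is not in the (empty) stand-in storage.
    print(upload_and_search(io.BytesIO(b"some uploaded bytes"), set()))
```

Reading in chunks keeps memory use bounded regardless of the upload size, which fits the "dismiss the content" step: only the digest survives the request.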

Regarding the limit, olasd and I discussed using Apache's settings for this.

In the meantime, it's Flask that enforces the limit.
This is a parameter in swh-web-ui, set to 16 MB for now.
(It returns a 413 Request Entity Too Large if the upload is too big; tested with a 21 MB tarball.)
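
For reference, a minimal sketch of how such a cap is commonly set in Flask, using its standard `MAX_CONTENT_LENGTH` configuration key. Whether swh-web-ui uses exactly this key is an assumption; the route body and the `KNOWN_SHA1S` set are hypothetical stand-ins for the real endpoint and the swh storage lookup.

```python
import hashlib
import io

from flask import Flask, jsonify, request

app = Flask(__name__)
# Assumption: the 16 MB limit is expressed as 16 * 1024 * 1024 bytes.
app.config["MAX_CONTENT_LENGTH"] = 16 * 1024 * 1024

# Hypothetical stand-in for the swh storage lookup.
KNOWN_SHA1S = set()


@app.route("/api/1/uploadnsearch/", methods=["POST"])
def upload_and_search():
    # Accessing request.files triggers Flask's size check; an oversized
    # body is rejected with 413 before the handler sees any content.
    uploaded = request.files["filename"]
    digest = hashlib.sha1()
    for chunk in iter(lambda: uploaded.stream.read(65536), b""):
        digest.update(chunk)
    sha1 = digest.hexdigest()
    return jsonify(found=sha1 in KNOWN_SHA1S, sha1=sha1)


if __name__ == "__main__":
    # Exercise the endpoint in-process with Flask's test client.
    client = app.test_client()
    resp = client.post(
        "/api/1/uploadnsearch/",
        data={"filename": (io.BytesIO(b"hello"), "hello.txt")},
    )
    print(resp.status_code, resp.get_json())
```

If the limit later moves to Apache, the Flask setting can stay in place as a second line of defence.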

olasd changed the visibility from "All Users" to "Public (No Login Required)". (May 13 2016, 5:05 PM)