
web app: return (best-guess) Content-Type on /content/raw/ to ease in-browser rendering
Closed, Migrated

Description

When viewing HTML contents under the /content/raw/ endpoint of the web app, we currently return text/plain as MIME type in the HTTP response. See, e.g.:

$ curl -s --head https://archive.softwareheritage.org/browse/content/sha256:1d727fc911579e5ec2d6cd463189cdb76bf7ef6c8f487f1a0701a1ec06aabdf1/raw/ | grep -i Content-Type
Content-Type: text/plain

This means we currently have no way of letting users render HTML pages in-browser, short of downloading them and opening them locally, which is not very practical.

Rendering the HTML within the webapp is not terribly safe, so that's probably out of the question.

On the other hand, returning a best-guess Content-Type when browsing content via /content/raw might be an option?
And probably we can just reuse the libmagic-detected MIME type that we already use for other purposes?
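As a rough illustration of the idea, here is a minimal sketch of how a response header could be derived from an already-detected MIME type. The function name `raw_content_type`, its parameters, and the fallback value are hypothetical and not taken from the web app's code; the sketch assumes the libmagic result is available as a plain string.

```python
# Hypothetical helper, not part of the Software Heritage code base.
# Assumption: `detected_mime` is the MIME type previously detected by
# libmagic (e.g. "text/html"), and `encoding` is an optional charset.
SAFE_FALLBACK = "application/octet-stream"


def raw_content_type(detected_mime, encoding=None):
    """Return a best-guess Content-Type header value for a raw content."""
    # Fall back to a generic binary type when detection failed or the
    # value does not look like "type/subtype".
    if not detected_mime or "/" not in detected_mime:
        return SAFE_FALLBACK
    mime = detected_mime.strip().lower()
    # Only textual types get a charset parameter appended.
    if mime.startswith("text/") and encoding:
        return f"{mime}; charset={encoding}"
    return mime
```

With this sketch, an HTML content detected as `text/html` with encoding `utf-8` would be served as `text/html; charset=utf-8`, while an undetected content would fall back to `application/octet-stream`.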

(thanks to Konrad Hinsen for this feature request)

Event Timeline

zack triaged this task as Wishlist priority. Aug 25 2018, 4:23 PM
zack created this task.
anlambert claimed this task.
anlambert added a subscriber: anlambert.

Closing this because, from my point of view, the browse/content/raw endpoint should only return the text/plain or application/octet-stream content types.

Code hosting solutions such as GitHub and GitLab proceed this way, so the current behavior is the right one.

All content-specific rendering should be handled in the browse/content endpoint instead.
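For comparison, the restriction described in this comment (the raw endpoint only ever answers with one of two content types) could be sketched as follows. The function name and the rule used to pick between the two types (a `text/` prefix on the detected MIME type) are assumptions made for illustration, not the web app's actual logic.

```python
# Hypothetical helper, not part of the Software Heritage code base.
# Assumption: textual contents are recognized by a "text/" MIME prefix.
def restricted_raw_content_type(detected_mime):
    """Collapse any detected MIME type to one of two safe content types."""
    if detected_mime and detected_mime.strip().lower().startswith("text/"):
        # HTML, CSS, source code, etc. are all served as plain text,
        # so the browser never renders or executes them.
        return "text/plain"
    # Everything else (images, archives, unknown) is served as binary.
    return "application/octet-stream"
```

Under this sketch, a content detected as `text/html` is still served as `text/plain`, which matches the header shown in the curl output in the description.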