Page MenuHomeSoftware Heritage

swh-web: Fix XSS
ClosedPublic

Authored by kalpitk on May 4 2019, 5:09 PM.

Details

Summary

Fix XSS vulnerabilities in swh-web

Related T1699

Diff Detail

Repository
rDWAPPS Web applications
Branch
arcpatch-D1442
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 5637
Build 7696: tox-on-jenkinsJenkins
Build 7695: arc lint + arc unit

Event Timeline

This revision is now accepted and ready to land.May 6 2019, 10:43 AM

Could you add one (or more) unit test for this?

This revision now requires changes to proceed.May 6 2019, 10:44 AM

My main issue on that change is that the generated links will not be available anymore in the displayed error message, see output below after your change on that example (https://archive.softwareheritage.org/browse/revision/041f5ea1cf987a4068ef5f39ba0a09be85952064/?origin=https://github.com/git/got):

swh.web.common.exc.NotFoundExc: The Software Heritage archive has a revision with the hash you provided but the origin mentioned in your request appears broken: <a href="https://github.com/git/got">https://github.com/git/got</a>. Please check the URL and try again.

Nevertheless, you can still browse the revision without origin information: <a href="/browse/revision/041f5ea1cf987a4068ef5f39ba0a09be85952064/">/browse/revision/041f5ea1cf987a4068ef5f39ba0a09be85952064/</a>

This makes the navigation more painful as the user will have to copy / paste the link instead of simply clicking on it.

Considering the only vulnerability here is if someone submits an origin url through a save code now request containing <script> tags in it, I think it will be better to filter out that kind
of bogus data directly in the common/origin_save.py file.

@anlambert
Ok. Will need to rework this, to keep the links working.
But,
1- if we filter out directly while saving, it may cause some valid links being rejected.
2- if I use filterXSS, other html tags will still work.

@kalpitk , indeed this is not so easy as it seems to fix any possible XSS injections.

1- if we filter out directly while saving, it may cause some valid links being rejected.

Yes, this calls for clever filtering server-side on the submitted origin urls to save into the archive.

2- if I use filterXSS, other html tags will still work.

Maybe but at least the harmful tags will be removed so this is not really a big deal IMHO.

We should really add some end to end tests using Selenium in a near future to better detect
and prevent XSS vulnerabilities.

  • use mark_safe with escape

Looks good. Please address my comment and I will accept the diff.

swh/web/common/exc.py
120

Please move the call to mark_safe back into the _generate_error_page function.

This revision now requires changes to proceed.May 7 2019, 10:56 AM
  • move mark_safe to _generate_error_page

@anlambert We need to remember to use 'escape' unsafe things, whenever marking some html as safe.

I think, I'll add selenium end-to-end tests in another commit (as per Timeline in my GSoC proposal?)

rebase and squash commits

  • Fix XSS in origin/search

@anlambert We need to remember to use 'escape' unsafe things, whenever marking some html as safe.

Indeed, can you create a task on the forge to track this work ?

I think, I'll add selenium end-to-end tests in another commit (as per Timeline in my GSoC proposal?)

Of course, adding end-to-end tests should be handled in a separate diff and a task should also be created on the forge (maybe I already created one, need to check that).

anlambert added inline comments.
swh/web/browse/utils.py
866

This url must also be escaped.

This revision now requires changes to proceed.May 7 2019, 1:22 PM

@kalpitk , I found other XSS vulnerabilities related to error pages. In the file common/exc.py, request.META['PATH_INFO'] values must also be escaped (see https://forge.softwareheritage.org/source/swh-web/browse/master/swh/web/common/exc.py$71 as an example).

Also, every message set to the following exception types: NotFoundExc, BadInpuExc, ForbiddenExc in the browse code should be checked for url escaping (see https://forge.softwareheritage.org/source/swh-web/browse/master/swh/web/browse/views/utils/snapshot_context.py$80-120 for instance)

Can you add these extra url escaping to that diff ?

  • Add escaping for BadInputExc
  • Add escape for request.META

@anlambert I think, I have added escape wherever it was vulnerable.

checksums and ids and plain strings can't have xss vulnerability.

swh/web/browse/views/utils/snapshot_context.py
121

Only the url should be escaped here, not the whole message

swh/web/browse/views/utils/snapshot_context.py
121

Since we don't have any hyperlinks, we can escape the whole message.

And shouldn't branch and branch_type also be escaped in that case?

anlambert requested changes to this revision.EditedMay 7 2019, 3:07 PM

@kalpitk , found another issue in the web api HTML interface: http://localhost:5004/api/1/origin/git/url/%3Cscript%3Ealert(document.domain)%3C/script%3E/

You should fix that by hacking on api/apiresponse.py

swh/web/browse/views/utils/snapshot_context.py
121

You're right, I forgot these could be specified as url arguments too.
Forget my last comment.

This revision now requires changes to proceed.May 7 2019, 3:07 PM
  • Fix XSS in web API interface
swh/web/api/apiresponse.py
180

Here I would escape the error string only when the request media type is text/html as JSON error reponses will look strange otherwise.
You can use the following test to detect the media type:

if request.accepted_media_type == 'text/html':
    error_data['reason'] = escape(error_data['reason'])
This revision now requires changes to proceed.May 7 2019, 3:51 PM
swh/web/api/apiresponse.py
180

This is why the tests are failing by the way: https://jenkins.softwareheritage.org/job/DWAPPS/job/tox/429/console

swh/web/browse/utils.py
493

Can you remove the escaping of attrs here. It breaks some link styles and as the value of this parameter is provided from string literals in the code, we can consider it safe.

@kalpitk , I found where the XSS vulnerability is located when a branch name contains some HTML code in it.

In the file templates/includes/top-navigation.html, you need to replace:

var snapshotContext = false;
var branch = false;
{% if snapshot_context %}
  snapshotContext = true;
  branch = {{ snapshot_context.branch|jsonify }};
{% endif %}
swh.browse.initSnapshotNavigation(snapshotContext, branch);

by

var snapshotContext = false;
var branch = false;
{% if snapshot_context %}
  snapshotContext = true;
  branch = "{{ snapshot_context.branch|escape }}";
{% endif %}
swh.browse.initSnapshotNavigation(snapshotContext, branch !== "None");
  • fix failing tests and make minor changes

@kalpitk , I think we fixed most of the XSS vulnerabilities so I am ready to accept the diff in its current state.
Before that, I need you to do the following:

  • address my latest comments
  • perform an interactive rebase in order to:
    • remove the useless merge commit in the diff (please rebase instead of pulling the changes from master when working on a diff)
    • squash the remaining commits into a single one
    • set the commit message to something like
swh-web: Fix numerous XSS vulnerabilities

detail the issues fixed in the commit message body then terminate the commit message with that line

Closes T1699
  • rebase on origin/master before updating the diff
swh/web/api/apiresponse.py
180

You need to remove the escaping here

swh/web/tests/api/views/test_content.py
139 ↗(On Diff #4741)

You can revert this as no escaping should happen when calling the api programmatically.

This revision now requires changes to proceed.May 7 2019, 6:06 PM

Thanks for the deep testing for XSS ! Let's land this. You should be able to push your changes to the master branch using git directly (use the Phabricator web interface otherwise).

This revision is now accepted and ready to land.May 7 2019, 7:56 PM
This revision was automatically updated to reflect the committed changes.