Page MenuHomeSoftware Heritage

Add OpenID Connect autentication backend and login/logout views
ClosedPublic

Authored by anlambert on Mar 2 2020, 6:43 PM.

Details

Summary

First diff related to users authentication in swh-web based on OpenID Connect (implemented in Keycloak).

The scope of that diff is the authentication of users from the main HTML interface of swh-web.
This could be used to enable special GUI features (for instance admin pages) based on user group
or permissions (currently not handled, will be implemented in T2295 and T2247).

Users get authenticated by using the OIDC Authorization Code Flow with PKCE extension,
based on the use of a temporary dynamic client secret instead of storing a static one
on the backend side.

The main additions in that diff are located in the new swh.web.auth module containing the following files:

  • backends.py: add Django plumbing to use OpenID Connect authentication layer
  • models.py: a custom Django User model used for remote users in order to store OpenID Connect related data and avoid to save users to Django database (those sensitive information are already securely stored in *Keycloak* so there is no need to duplicate them).
  • views.py: new Django views in order for users to login/logout from the main HTML interface.

Previous admin login based on standard Django login features is still available.

Related to T2245
Related to T2246

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
swh/web/auth/backends.py
87

I don't think it's a good idea to raise any exception as-is and show it to users like this; the message might contain secrets.

The code should also make the difference between an error (log it) and an authentication failure (just return None).

swh/web/auth/utils.py
36–38

Just use code_verifier = secrets.token_urlsafe(60); code_verifier_str=code_verifier.decode('ascii'). It's shorter and guarantees you don't lose bytes of entropy in the substitution.

42

no need for utf8, you can use .encode('ascii')

(or directly code_verifier with the change mentionned above)

swh/web/auth/views.py
31

Wrong name; this is not a nonce.

35

I'd rename the parameter next_path in the input too, so it's clearer what it is

70

it's very unclear what "state" means when not reading this function. Could you rename it to csrf_token?

80

You need to remove nonce['code_verifier'] from the _auth_exception dict defined in swh/web/auth/backends.py, or the dict will grow forever.

swh/web/templates/layout.html
92

meh

swh/web/tests/auth/keycloak_mock.py
26

well_known

swh/web/tests/auth/test_backends.py
18

from . import sample_data, so it's clearer in the code where these variables come from

swh/web/tests/auth/test_views.py
42

Use urllib.parse

47–68

Use dict equality. pytest rewrites assert statements to show a nice diff in case they don't match.

82

Where is session_state used?

125

no parentheses

149–220

You should split this test into different tests after each assert isinstance(request.user, AnonymousUser).

And you're missing a test for invalid challenge code.

This revision now requires changes to proceed.Mar 3 2020, 6:14 PM

Thanks for the review @vlorentz. I got some work to do now ;-)

swh/web/auth/backends.py
14

Because Django authentication backends should return None when authentication failed (see https://docs.djangoproject.com/fr/2.2/topics/auth/customizing/#writing-an-authentication-backend)

87

Ack, I got what you mean. Let sentry capture the exception and simply inform the user authentication has failed when authenticate returns None. We can get rid of the get_auth_exception function this way.

swh/web/auth/keycloak.py
60–62

This only works if auth_url already has query parameters. Is this guaranteed by self._keycloak.auth_url?

Hopefully yes, the redirect_uri plus other parameters will be inserted as query parameters by self._keycloak.auth_url

110–111

In practice, the certs list contains a single element. Anyway, there is an alternative way to get the realm public key:

realm_public_key = self._keycloak.public_key()
self._realm_public_key = '-----BEGIN PUBLIC KEY-----\n'
self._realm_public_key += realm_public_key
self._realm_public_key += '\n-----END PUBLIC KEY-----'

I would go for that approach instead.

138

Ack

155

Using a tuple key seems a better choice indeed.

swh/web/auth/models.py
27–40

Ack, will improve.

42–49

Did not think of checking the user.backend value, much better indeed.

swh/web/auth/utils.py
36–38

Oh I did not known that Python standard library has a secrets module. Thanks !

42

ack

49

This is for the case if we declare multiple clients in the realm to be used from swh-web.

swh/web/auth/views.py
31

Indeed, state is the nonce here

35

ack

70

I prefer to keep the OpenID Connect parameter name.

71

Ack

80

I will remove the _auth_exception dict instead.

swh/web/common/utils.py
351

Ack

swh/web/config.py
114–115

I prefer empty strings to give a hint about the expected type.

swh/web/templates/layout.html
92
{% if 'OIDC' in user.backend %}
swh/web/templates/logout.html
21–26

Nope, I forgot to modify that template.

swh/web/tests/auth/keycloak_mock.py
26
swh/web/tests/auth/test_backends.py
18

Ack

swh/web/tests/auth/test_utils.py
22–23

I think it should be assert re.match(r'^[a-zA-Z0-9-\._~]+$', code_verifier) instead

swh/web/tests/auth/test_views.py
42

ack

47–68

I can't as nonce and query_dict do not have the same content.

82

I simply want to produce the same type of redirect url Keycloak generates. The session_state value is intended to be used to check OpenID connect session from the frontend side (see https://openid.net/specs/openid-connect-session-1_0.html#ChangeNotification). I do not intend to implement it and will do that check using a Django middleware instead. But who knows, we may need to implement the frontend check in the future.

95

ack

125

whoopsie

149–220

Ack, will improve thoses tests.

anlambert marked 28 inline comments as done.

Update:

  • address @vlorentz comments
  • set user.is_staff value based on keycloak user group (T2295)
swh/web/auth/views.py
35

Will go for next, shorter is better

Update: Fix failing test (missing keycloak conf in swh.web.settings.tests)

Thanks! A few more comments:

swh/web/auth/backends.py
26

is userinfo['groups']a list/set of strings, or is it a string?

swh/web/auth/views.py
35

I disagree, "next" is not explicit (it could be a path, a view name, an action, ...). I prefer "next_path".

70

Meh. Then could you add a comment explaining it?

swh/web/templates/layout.html
92

much better :)

swh/web/tests/auth/test_utils.py
22–23

It's equivalent, re.match matches from the beginning to the end; unlike re.find which only finds an occurrence in the string (hence their respective names)

swh/web/tests/auth/test_views.py
47–68

hmm, right. But you can do a set equality of the keys whose value is unknown

This revision now requires changes to proceed.Mar 5 2020, 12:35 PM
anlambert marked 4 inline comments as done.

Update:

  • address new @vlorentz comments
  • after testing in production like environment using docker (D2774), persist user related data to django cache (backed by memcached in production) in order for any gunicorn worker to be able to recreate a logged user when it handles an input HTTP request
  • slight reorganization of backends module code
swh/web/auth/backends.py
26

It is a list of strings.

swh/web/auth/views.py
35

Allright

70

ack

swh/web/tests/auth/test_utils.py
22–23

Ack, I should have read the doc.

swh/web/tests/auth/test_views.py
47–68

login_data and query_dict only share the state and redirect_uri keys so it does not worth it.

This revision is now accepted and ready to land.Mar 6 2020, 1:10 AM

Update: Fix missing exception handling in OIDCAuthorizationCodePKCEBackend.get_user.

For instance access token may have expired when a new HTTP request is handled (silent relogin
will be implemented in T2267 to mitigate that issue).