Page MenuHomeSoftware Heritage

github: Improve credential expiration policy
AbandonedPublic

Authored by ardumont on Nov 25 2019, 11:45 AM.

Details

Reviewers
olasd
douardda
Group Reviewers
Reviewers
Summary

When a credential is reported as "bad", consider it expired. Remove it from the
current worker credentials config. Then, try immediately with another
credential.

Related T2099

Test Plan

tox

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 9353
Build 13697: tox-on-jenkinsJenkins
Build 13696: arc lint + arc unit

Event Timeline

swh/lister/core/lister_transports.py
93

type: ignore because otherwise it complains about the config not being declared for that class...
which is also true...
but...
let's refactor this another day.

(same goes for the other type: ignore below)

swh/lister/github/lister.py
55

Another implem' could be to override the default request_params method and set the current auth tuple (self.auth = auth or something) used...

I'm not sure what's cleaner, clearer, etc...

Fix authorization decoding computation function

Add test around credential_used function

Update according to discussion

There exists other authentication schemes.

So now the implementation relies on knowing the authentication auth tuple. The
github lister now removes it when it's deemed expired. And then immediately
queries again the listing with another credential.

douardda added a subscriber: douardda.

Am I reading this wrong, or is the new behavior implemented in this diff not really tested?

swh/lister/core/lister_transports.py
137

I don't really understand the semantic of 'reset' in the method (at first glance).

Why this 'reset' does remove a token? Remove from what? Why/when is this useful or needed?

I mean credentials are loaded by the 'request_instance_credentials' method. Now this method (I guess called when a credential usage generated errors ; not yet read the remaining of the code), will gradually remove entries from the list of known credentials. So each time an error occurs this list will be pop'd.

Not sure I like this behavior very much. It might be better to be able to temporarily disable a credential. Then check on a regular base if the credential can be enabled again.

Also, the credential 'status' is kept local to the class. Whereas we have many workers. It might be useful to think about a solution for this state to be shared among workers (using memcache, etcd or similar). Not to be done right now in this diff, obviously :D

swh/lister/github/lister.py
44

No need for this function to be inlined here (does not capture the closure or similar). So no need to instantiate a new function object on each call of this method...

This revision now requires changes to proceed.Jan 9 2020, 10:54 AM

Am I reading this wrong, or is the new behavior implemented in this diff not really tested?

Indeed.

I wanted to have some kind of discussion about how to implement somewhat properly what's described in T2099 first.
(prior to fight the current tests to test this i mean)


context:

Currently, the github listers are hitting rate limit.
Other listers are waiting on the lister github tasks to finish (which kinda never do because it's stuck in a "infinite" waiting time loop...).
By adding this behavior, that could make them finally fail (when no more credentials).

This implicitely lowered in priority as we unstuck the other listers.
First, by stopping one worker (changing its setup so it stops listing github).
Then by completely removing the github listing tasks (as the consensus was, we need more accounts, which afaiu @olasd is doing)

swh/lister/core/lister_transports.py
137

Why this 'reset' does remove a token?

yes, remove_authentication_token sounds better now.

Remove from what?

From the current setup initially loaded (from the configuration file) at the task startup.

Why/when is this useful or needed?

That'd be the task linked in the description T2099.

swh/lister/github/lister.py
44

do you mean this can be set as class method directly?

(my main concern was to avoid repetition of this code in the conditional below).