Page MenuHomeSoftware Heritage

Web API: add back persistent and shared backend for rate limit counters
Closed, MigratedEdits Locked

Description

Right now, the new archive.softwareheritage.org running django has been deployed with the rate limit backend as an in-memory one (default is LocMemCache).

This will pose a problem regarding the rate limit being shared between all gunicorn workers (which should not).

Prior to this deployment, we used redis (when using Flask_Limiter).
We should be running an equivalent system to avoid such shortcomings.

According to djangorestframework/django, we could use memcached.

Discussion:

14:18:34 ardumont | what do i do with the redis instance that flask limiter used?
14:49:02     zack | where are the live counters stored in the new implementation?
14:49:08     zack | anyway, if it's no longer needed, it should go away
14:50:29 ardumont | i don't think it's stored anymore (NDA: I meant not persistent :)
14:50:32 ardumont | and not in redis anyway
14:50:39 ardumont | anlambert: do you confirm?
14:50:48 ardumont | other than that i'm ready to deploy
14:50:54 ardumont | i fixed the tiny errors in the manifest
14:52:18    olasd | it uses django's caching mechanism
14:55:29    olasd | so, by default, a local memory cache
14:58:31    olasd | which means that the rate limit would be spread out across workers
14:58:57    olasd | which is fine for a first deployment but needs to be fixed eventually
15:00:07 ardumont | you mean the different gunicorn workers?
15:00:10    olasd | yes
15:00:47    olasd | each gunicorn worker would get its own separate counter
15:01:20 ardumont | ack, thx (i was reading the doc, for info http://www.django-rest-framework.org/api-guide/throttling/#setting-up-the-cache)
15:01:31 ardumont | (not for you olasd :)
15:02:58     zack | so, maybe, before throwing away redis, check if it is still a viable backend?
15:04:01    olasd | stretch and up have python3-django-redis
15:04:04    olasd | not jessie
15:04:22 ardumont | ah, nice to know
15:04:51 ardumont | i won't throw away redis yet :)
15:04:52    olasd | but I'm not married to redis
15:06:04 ardumont | zack: it's not clear if it is a blocker to deploy (as olasd said, we can fix it later, as in next week for example)
15:06:18     zack | i didn't say it was a blocker :)
15:06:26 ardumont | ok, thx for clarifying :)
15:07:14    olasd | rate limiting is the only reason for redis to be deployed on our infra, we can just as well replace it with memcached if that's what django prefers
15:07:24    olasd | (flask-limiter also supported memcached)
15:07:37 ardumont | yes, and memcached is in the default list i refer to
15:08:07 ardumont | (well, i'd need to check for the version though, we're still in 1.7 in oldstable :)
15:08:19 ardumont | and the link is 1.11
15:08:45 ardumont | (it is present in 1.7 :)

Related Objects

Mentioned In
R65:4f4cc3022ab1: swh.web.config: Fix default configuration path lookup
R65:f8d69bcd3ba3: swh.web.config: Fix wrong default configuration
R65:bb50bb311a53: swh.web.config: Fix cache_uri and limits to the right level
R65:60eb70f2b566: packaging: Add missing swh.web.settings module
R65:2bbbe920199a: swh.web.settings: Setup cache uri (if any) from webapp.yml
R65:f9a82eed6019: swh.web.throttling: Clarify config keys intent
R65:702cf9783aa6: swh.web.settings: Split setup according to platforms (dev/prod)
rSPSITE2bbde9c2c339: deploy/webapp: Update icinga check to latest api content search
rSPSITE6b9590eb2861: memcached: add new profile for memcached server
rSPSITE50a5f9f77094: swh_api: replace profile::redis by profile::memcached
rDWAPPS4f4cc3022ab1: swh.web.config: Fix default configuration path lookup
rDWAPPSf8d69bcd3ba3: swh.web.config: Fix wrong default configuration
rSPPROF2bbde9c2c339: deploy/webapp: Update icinga check to latest api content search
rSPSITEd0daf2267c6a: data/defaults: Bump memcached to use 5% as maximum memory
rDWAPPS60eb70f2b566: packaging: Add missing swh.web.settings module
rSPSITE62dfa059d4d3: Remove redis module since it's no longer used
rSENV5836adbd2213: Remove redis module since it's no longer used
rSPSITEf55bbc33caeb: Revert "Puppetfile: Fix typo in memcached module version"
rSPSITE0901c7b360d7: Puppetfile: Fix typo in memcached module version
rSPSITE93b88b92dbfe: Puppetfile: Add puppet module for memcached
rSPSITE578616f27f16: data/defaults: Update webapp configuration
rSPROLE50a5f9f77094: swh_api: replace profile::redis by profile::memcached
rSPPROF6b9590eb2861: memcached: add new profile for memcached server
rDWAPPSf9a82eed6019: swh.web.throttling: Clarify config keys intent
rDWAPPSbb50bb311a53: swh.web.config: Fix cache_uri and limits to the right level
rDWAPPS2bbbe920199a: swh.web.settings: Setup cache uri (if any) from webapp.yml
rDWAPPS702cf9783aa6: swh.web.settings: Split setup according to platforms (dev/prod)
rSENV5d28c0eafd16: mrconfig: add saz-memcached
D243: swh.web.setting: Use different setup according to platform (dev/prod)

Event Timeline

zack renamed this task from Improve default rate limit backend for archive.softwareheritage.org to Web API: add back persistent and shared backend for rate limit counters.Sep 11 2017, 10:30 AM

15:08:07 ardumont | (well, i'd need to check for the version though, we're still in 1.7 in oldstable :)
15:08:19 ardumont | and the link is 1.11
15:08:45 ardumont | (it is present in 1.7 :)

And of course, 'do not believe the documentation' becomes a thing!

TL; DR

Not working.

Still:

  • archive.s.o is migrated, and run using an in-memory cache.
  • www.s.o is migrated as well
  • icinga checks ok

Detail

After deploying all this pretty stuff, the communication between django and its memcached server fails miserably with TypeError below:

Sep 12 13:42:15 moma python3[16211]: 2017-09-12 13:42:15 [16211] [ERROR] Internal Server Error: /api/1/stat/counters/
                                     Traceback (most recent call last):
                                       File "/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 111, in get_response
                                         response = wrapped_callback(request, *callback_args, **callback_kwargs)
                                       File "/usr/lib/python3/dist-packages/django/views/decorators/csrf.py", line 57, in wrapped_view
                                         return view_func(*args, **kwargs)
                                       File "/usr/lib/python3/dist-packages/django/views/generic/base.py", line 69, in view
                                         return self.dispatch(request, *args, **kwargs)
                                       File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 403, in dispatch
                                         response = self.handle_exception(exc)
                                       File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 391, in dispatch
                                         self.initial(request, *args, **kwargs)
                                       File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 322, in initial
                                         self.check_throttles(request)
                                       File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 296, in check_throttles
                                         if not throttle.allow_request(request, self):
                                       File "/usr/lib/python3/dist-packages/swh/web/common/throttling.py", line 60, in allow_request
                                         super(ScopedRateThrottle, self).allow_request(request, view)
                                       File "/usr/lib/python3/dist-packages/rest_framework/throttling.py", line 122, in allow_request
                                         self.history = self.cache.get(self.key, [])
                                       File "/usr/lib/python3/dist-packages/django/core/cache/backends/memcached.py", line 82, in get
                                         val = self._cache.get(key)
                                       File "/usr/lib/python3/dist-packages/memcache.py", line 995, in get
                                         return self._get('get', key)
                                       File "/usr/lib/python3/dist-packages/memcache.py", line 979, in _get
                                         return _unsafe_get()
                                       File "/usr/lib/python3/dist-packages/memcache.py", line 950, in _unsafe_get
                                         server.send_cmd("%s %s" % (cmd, key))
                                       File "/usr/lib/python3/dist-packages/memcache.py", line 1289, in send_cmd
                                         self.socket.sendall(cmd + '\r\n')
                                     TypeError: 'str' does not support the buffer interface

This is apparently known (no, i did not check before because i believed the documentation... TIL).

Anyway, this offers multiple choices:

  1. either go with the incomplete suggested workaround, use another memcached binding.

Using as django cache configuration something like:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1.11211',
    }
}

Install missing packages: libmemcached-dev, django-pylibmc (not debian packaged so this would need to be done).
Also, the incompleteness part, django-pyblibmc is compatible with django 1.8 onwards. We are running django 1.7.
The silver lining is django 1.8 is in jessie-backports so we'd only need to upgrade the manifest to deal with this.

  1. As suggested previously, migrate moma to stretch (puppet runs no problem there as in either new vm or apt-get dist-upgrade) . As a plus one for this, swh-deposit will need it anyway (due to some other unrelated debian package dependencies problem as well).

On a stretch box, it's working, no questions asked.

  1. pip on production is a nogo. Still, i mention it for sake of completeness (as far as i see). Solution 1. + pip for missing django dependencies.

I'm for solution 2.

If other solution exists, i'd be more than happy to be enlightened.

I'm for solution 2.
If other solution exists, i'd be more than happy to be enlightened.

@olasd implemented solution 2. by migrating moma from jessie to stretch.