
Performance issues in deployed web applications
Closed, Migrated

Description

Web applications deployed to moma have been performing poorly for the past couple of days.

When the response to a request is not in cache, it takes too long to be delivered
for browsing to feel fluid.

Below are some benchmarks comparing what is deployed to production with
my local development server (uncached). For the swh storage backend,
I use the same one as production: somerset.
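The curl-format.txt file passed to `-w` is not shown in this task. A minimal sketch that would produce the timing fields seen in the outputs below, assuming the usual curl write-out variables (the exact layout of the original file is a guess):

```shell
# Assumed reconstruction of curl-format.txt; field names match the outputs below.
# Each \n escape is what produces the line break in curl's write-out output.
cat > curl-format.txt <<'EOF'
     time_namelookup:  %{time_namelookup}\n
        time_connect:  %{time_connect}\n
     time_appconnect:  %{time_appconnect}\n
    time_pretransfer:  %{time_pretransfer}\n
       time_redirect:  %{time_redirect}\n
  time_starttransfer:  %{time_starttransfer}\n
                     ----------\n
          time_total:  %{time_total}\n
EOF
```

The field to watch is time_starttransfer: it is the time until the first byte of the response arrives, so it mostly reflects how long the server took to build the page.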

$ curl -w "@curl-format.txt" -o /dev/null  -i http://localhost:5003/browse/origin/git/url/https://github.com/git/git/visit/2016-09-08T09:58:02/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  613k    0  613k    0     0   278k      0 --:--:--  0:00:02 --:--:--  278k
time_namelookup:  0,001479
       time_connect:  0,001828
    time_appconnect:  0,000000
   time_pretransfer:  0,001882
      time_redirect:  0,000000
 time_starttransfer:  2,195477
                    ----------
         time_total:  2,202124

$ curl -w "@curl-format.txt" -o /dev/null -u <user>:<password> -i https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/git/git/visit/2016-09-08T09:58:02/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  613k    0  613k    0     0  79081      0 --:--:--  0:00:07 --:--:--  164k
time_namelookup:  0,014480
       time_connect:  0,018318
    time_appconnect:  0,058750
   time_pretransfer:  0,058799
      time_redirect:  0,000000
 time_starttransfer:  7,917241
                    ----------
         time_total:  7,940864

$ curl -w "@curl-format.txt" -o /dev/null -i http://localhost:5003/browse/origin/git/url/https://github.com/Kitware/CMake/visit/2016-06-15T18:09:28/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  197k    0  197k    0     0   179k      0 --:--:--  0:00:01 --:--:--  179k
time_namelookup:  0,004578
       time_connect:  0,004769
    time_appconnect:  0,000000
   time_pretransfer:  0,004807
      time_redirect:  0,000000
 time_starttransfer:  1,091453
                    ----------
         time_total:  1,098664

$ curl -w "@curl-format.txt" -o /dev/null -u <user>:<password> -i https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/Kitware/CMake/visit/2016-06-15T18:09:28/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  197k    0  197k    0     0  23565      0 --:--:--  0:00:08 --:--:-- 46397
time_namelookup:  0,005758
       time_connect:  0,009316
    time_appconnect:  0,055092
   time_pretransfer:  0,055150
      time_redirect:  0,000000
 time_starttransfer:  8,553572
                    ----------
         time_total:  8,562562

$ curl -w "@curl-format.txt" -o /dev/null -u <user>:<password> -i https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/torvalds/linux/visit/2017-09-13T06:26:36/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  505k    0  505k    0     0  64847      0 --:--:--  0:00:07 --:--:--  105k
time_namelookup:  0,004525
       time_connect:  0,008921
    time_appconnect:  0,079652
   time_pretransfer:  0,079756
      time_redirect:  0,000000
 time_starttransfer:  7,955626
                    ----------
         time_total:  7,986906

$ curl -w "@curl-format.txt" -o /dev/null -i http://localhost:5003/browse/origin/git/url/https://github.com/torvalds/linux/visit/2017-09-13T06:26:36/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  505k    0  505k    0     0   337k      0 --:--:--  0:00:01 --:--:--  336k
time_namelookup:  0,004740
       time_connect:  0,004954
    time_appconnect:  0,000000
   time_pretransfer:  0,004987
      time_redirect:  0,000000
 time_starttransfer:  1,493119
                    ----------
         time_total:  1,500469

$ curl -w "@curl-format.txt" -o /dev/null -u <user>:<password> -i https://archive.softwareheritage.org/browse/origin/git/url/https://github.com/python/cpython/visit/2016-09-07T06:26:03/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 74692    0 74692    0     0  10872      0 --:--:--  0:00:06 --:--:-- 20413
time_namelookup:  0,006058
       time_connect:  0,009963
    time_appconnect:  0,057815
   time_pretransfer:  0,057864
      time_redirect:  0,000000
 time_starttransfer:  6,866549
                    ----------
         time_total:  6,869975

$ curl -w "@curl-format.txt" -o /dev/null -i http://localhost:5003/browse/origin/git/url/https://github.com/python/cpython/visit/2016-09-07T06:26:03/directory/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 74692    0 74692    0     0  84902      0 --:--:-- --:--:-- --:--:-- 84877
time_namelookup:  0,004967
       time_connect:  0,005321
    time_appconnect:  0,000000
   time_pretransfer:  0,005393
      time_redirect:  0,000000
 time_starttransfer:  0,872110
                    ----------
         time_total:  0,879735
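
To make the gap easier to read, here is a small sketch that divides the production time_starttransfer by the local one for each repository. The values are copied from the runs above, with decimal commas written as dots; the function name is only illustrative.

```shell
# Columns: repository, local dev server (s), production (s).
# LC_ALL=C keeps awk's number formatting independent of the shell locale.
slowdown_report() {
    LC_ALL=C awk '{ printf "%-14s %.1fx slower\n", $1, $3 / $2 }' <<'EOF'
git/git        2.195477 7.917241
Kitware/CMake  1.091453 8.553572
torvalds/linux 1.493119 7.955626
python/cpython 0.872110 6.866549
EOF
}

slowdown_report
```

On these four pages, production takes roughly 3.6 to 7.9 times longer than the local server to start sending the response.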

Event Timeline

zack raised the priority of this task from Normal to High (Jan 25 2018, 10:14 AM).

The performance issue is also visible when running curl against localhost:5003 directly on moma, which rules out (at least for now) configuration issues in apache, varnish or hitch.

Sigh to the power of sigh.

Because our database is tuned for very short reads and frequent writes, PostgreSQL performance drops sharply when a single transaction has been running on the server for a very long time.

As it turns out, one such very-long-running transaction was running on the replica on somerset: the dump of all occurrences that are being converted to snapshots. Killing that query restored performance to nominal values.
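
For future occurrences, a sketch of how such a transaction could be spotted, assuming access to PostgreSQL's standard `pg_stat_activity` view. The helper name, the default threshold and the connection parameters are hypothetical; this is not necessarily how the query was found or killed here.

```shell
# Hypothetical helper: print a query listing transactions that have been
# open for longer than the given interval (default: 1 hour).
long_tx_query() {
    threshold=${1:-1 hour}
    cat <<EOF
SELECT pid, usename, state,
       now() - xact_start AS xact_age,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '$threshold'
ORDER BY xact_start;
EOF
}

# Usage (host and database are placeholders):
#   long_tx_query '30 minutes' | psql -h <replica-host> -d <database>
# Once the offending pid is identified and confirmed safe to stop:
#   psql -h <replica-host> -d <database> -c "SELECT pg_terminate_backend(<pid>);"
long_tx_query '1 hour'
```

Sorting by xact_start puts the oldest open transaction first, which is the one most likely to be causing the slowdown.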