diff --git a/sysadm/deployment/upgrade-swh-service.rst b/sysadm/deployment/upgrade-swh-service.rst
index e677cb7..18f571a 100644
--- a/sysadm/deployment/upgrade-swh-service.rst
+++ b/sysadm/deployment/upgrade-swh-service.rst
@@ -1,151 +1,231 @@
.. _upgrade-swh-service:
Upgrade swh service
===================
.. admonition:: Intended audience
:class: important
sysadm staff members
-Workers
--------
-Dedicated workers [1] run our *swh-worker@loader_{git, hg, svn, npm, ...}* services.
-When a new version is released, we need to upgrade their package(s).
+This document describes the deployment of most of our swh services (rpc services,
+loaders, listers, indexers, ...).
-[1] Here are the following group name (in `clush
-<https://clustershell.readthedocs.io/en/latest/index.html>`_ terms):
+There are currently 2 ways to deploy (we are transitioning from the first to the second):
-- *@swh-workers* for the production workers
-- *@azure-workers* for the production ones running on azure
-- *@staging-loader-workers* for the staging ones
+- static: From git tag to deployment through debian packaging
+- elastic: From git tag to deployment through kubernetes.
+
+
+The following will first describe the :ref:`common deployment part <code-and-publish>`.
+This involves python packaging built from a git tag and pushed to
+`PyPI <https://pypi.org>`_ and our :ref:`swh debian repositories
+<howto-debian-packaging>`.
+
+Then follows the actual :ref:`deployment with debian packaging
+<deployment-with-debian-packaging>`. It concludes with the :ref:`deployment with
+kubernetes <deployment-with-kubernetes>` chapter.
+
+.. _distinct-services:
+
+Distinct Services
+-----------------
+
+3 kinds of services run on our nodes:
+
+- worker services (loaders, listers, cookers, ...)
+- rpc services (scheduler, objstorage, storage, web, ...)
+- journal client services (search, scheduler, indexer)
+
+.. _code-and-publish:
-See :ref:`deploy-new-lister` for a practical example.
Code and publish
----------------
-.. _fix-or-evolve-code:
-
Code an evolution or fix an issue in the python code within the git repository's master
-branch. Open a diff for review, land it when accepted, and start back at :ref:`tag and push
-<tag-and-push>`.
+branch. Open a diff for review, land it when accepted, and start back at :ref:`tag and
+push <tag-and-push>`.
.. _tag-and-push:
Tag and push
~~~~~~~~~~~~
-When ready, `git tag` and `git push` the new tag of the module.
+When ready, `git tag` and `git push` the new tag of the module. Then let jenkins
+:ref:`publish the artifact <publish-artifacts>`.
.. code::
- $ git tag vA.B.C
+ $ git tag -a vA.B.C # (optionally) `git tag -a -s` to sign the tag too
$ git push origin --follow-tags
-.. _publish-and-deploy:
+.. _publish-artifacts:
+
+Publish artifacts
+~~~~~~~~~~~~~~~~~
-Publish and deploy
-~~~~~~~~~~~~~~~~~~
+Jenkins is in charge of publishing the new release (built from the tag) to `PyPI
+<https://pypi.org>`_. It then builds the debian package and pushes it to our :ref:`swh
+debian repositories <howto-debian-packaging>`.
-Let jenkins publish and deploy the debian package.
.. _troubleshoot:
Troubleshoot
~~~~~~~~~~~~
If jenkins fails for some reason, fix the module, be it the :ref:`python code
<code-and-publish>` or the :ref:`debian packaging <troubleshoot-debian-package>`.
+
+.. _deployment-with-debian-packaging:
+
+Deployment with debian packaging
+--------------------------------
+
+This mostly involves deploying new versions of debian packages to static nodes.
+
+.. _upgrade-services:
+
+Upgrade services
+~~~~~~~~~~~~~~~~
+
+When a new version is released, we need to upgrade the package(s) and restart services.
+
+worker services (production):
+
+- *swh-worker@loader_{git, hg, svn, npm, ...}*
+- *swh-worker@lister*
+- *swh-worker@vault_cooker*
+
+journal clients (production):
+
+- *swh-indexer-journal-client@{origin_intrinsic_metadata_,extrinsic_metadata_,...}*
+
+rpc services (both environments):
+
+- *gunicorn-swh-{scheduler, objstorage, storage, web, ...}*
+
+
+From the pergamon node, which is configured for `clush
+<https://clustershell.readthedocs.io/en/latest/index.html>`_, one can act on multiple
+nodes through the following group names:
+
+- *@swh-workers* for the production workers (listers, loaders, ...)
+- *@azure-workers* for the production ones running on azure (indexers, cookers)
+- ...
+
+See :ref:`deploy-new-lister` for a practical example.
+
.. _troubleshoot-debian-package:
Debian package troubleshoot
~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In that case, upgrade and checkout the *debian/unstable-swh* branch, then fix whatever
-is not updated or broken due to a change. It's usually a missing new package dependency
-to fix in *debian/control*). Add a new entry in *debian/changelog*. Make sure gbp builds
-fine. Then tag it. Jenkins will build the package anew.
+Update and check out the *debian/unstable-swh* branch (in the impacted git repository),
+then fix whatever is not updated or broken due to a change.
+
+It's usually a missing new package dependency to fix in *debian/control*. Add a new
+entry in *debian/changelog*. Make sure gbp builds fine locally. Then tag it and push.
+Jenkins will build the package anew.
.. code::
$ gbp buildpackage --git-tag-only --git-sign-tag # tag it
$ git push origin --follow-tags # trigger the build
+Lather, rinse, repeat until it's all green!
+
Deploy
------
-.. _nominal_case:
+.. _nominal-case:
Nominal case
~~~~~~~~~~~~
-Update the machine dependencies and restart service. That usually means
-as sudo user:
+Update the machine dependencies and restart the service. That usually means, as a sudo user:
.. code::
$ apt-get update
$ apt-get dist-upgrade -y
- $ systemctl restart swh-worker@loader_${type}
+ $ systemctl restart $service
Note that this is for one machine you ssh into.
We usually wrap those commands from the sysadmin machine pergamon [3] with the *clush*
command, something like:
.. code::
$ sudo clush -b -w @swh-workers 'apt-get update; env DEBIAN_FRONTEND=noninteractive \
apt-get -o Dpkg::Options::="--force-confdef" \
-o Dpkg::Options::="--force-confold" -y dist-upgrade'
[3] pergamon is already *clush* configured to allow multiple ssh connections in parallel
on our managed infrastructure nodes.
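The per-machine commands and the *clush* wrapper above can be combined into one reviewable command line. The following is only a sketch: ``upgrade_cmd`` is a hypothetical helper name (not part of any swh tooling), and it prints the command rather than running it, so it can be checked before being pasted on pergamon. Chaining with ``&&`` is a choice made here so that the restart only happens if the upgrade succeeded.

```shell
# Sketch only: upgrade_cmd is a hypothetical helper, not part of swh tooling.
# It prints the clush command line so it can be reviewed before being run.
# $1: clush target (e.g. @swh-workers), $2: systemd unit to restart.
upgrade_cmd() {
    printf "sudo clush -b -w %s '%s && %s && systemctl restart %s'\n" \
        "$1" \
        "apt-get update" \
        "env DEBIAN_FRONTEND=noninteractive apt-get -o Dpkg::Options::=\"--force-confdef\" -o Dpkg::Options::=\"--force-confold\" -y dist-upgrade" \
        "$2"
}

# Example: print the command for the production workers and the git loader.
upgrade_cmd @swh-workers swh-worker@loader_git
```

The group name and the unit name are passed as-is, so any of the groups and services listed earlier can be substituted.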
.. _configuration-change-required:
Configuration change required
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Either wait for puppet to actually deploy the changes first and then go back to the
nominal case.
Or force a puppet run:
.. code::
- sudo clush -b -w @swh-workers puppet agent -t
+ sudo clush -b -w $nodes puppet agent -t
Note: *-t* is not optional
-.. _long-standing-migration:
+.. _long-standing-upgrade:
-Long-standing migration
-~~~~~~~~~~~~~~~~~~~~~~~
+Long-standing upgrade
+~~~~~~~~~~~~~~~~~~~~~
-In that case, you may need to stop all services for migration which could take some time
-(because lots of data is migrated for example).
+In that case, you may need to stop the impacted services, for example during a
+long-standing data model migration which could take some time.
-You need to momentarily stop puppet (which runs every 30 min to apply manifest changes)
-and the cron service (which restarts down services) on the workers nodes.
+You need to momentarily stop puppet (which by default runs every 30 min to apply
+manifest changes) and the cron service (which restarts down services) on the worker
+nodes.
Refer to the :ref:`storage database migration <storage-database-migration>`
for a concrete case of database migration.
.. code::
$ sudo clush -b -w @swh-workers 'systemctl stop cron.service; puppet agent --disable'
+
Then:
-- Execute the database migration.
-- Go back to the nominal case.
-- Restart puppet and the cron on workers
+- Execute the long-standing upgrade.
+- Go back to the :ref:`nominal case <nominal-case>`.
+- Restart puppet and the cron services on workers
.. code::
$ sudo clush -b -w @swh-workers 'systemctl start cron.service; puppet agent --enable'
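The ordering of the steps above matters: puppet and cron must stay stopped until the upgrade is done. A minimal sketch of that sequence follows. The names ``run`` and ``long_standing_upgrade`` and the ``DRY_RUN`` variable are assumptions introduced for illustration, as is the ``./run-migration.sh`` script in the usage line. By default nothing is executed and the commands are only printed.

```shell
# Sketch only: run, long_standing_upgrade and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: clush target (e.g. @swh-workers); remaining args: the upgrade command.
long_standing_upgrade() {
    nodes="$1"
    shift
    # 1. Stop cron and disable puppet so that nothing restarts the services.
    run sudo clush -b -w "$nodes" 'systemctl stop cron.service; puppet agent --disable' || return 1
    # 2. Execute the long-standing upgrade; on failure, leave puppet and cron
    #    stopped so the situation can be investigated.
    run "$@" || return 1
    # (The nominal-case package upgrade and service restart would go here.)
    # 3. Restart cron and re-enable puppet.
    run sudo clush -b -w "$nodes" 'systemctl start cron.service; puppet agent --enable'
}

# Example (dry run): ./run-migration.sh is a placeholder for the real upgrade step.
long_standing_upgrade @swh-workers ./run-migration.sh
```

Setting ``DRY_RUN=0`` would execute the commands for real; the sketch deliberately stops at the first failing step rather than re-enabling puppet and cron automatically.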
+
+.. _deployment-with-kubernetes:
+
+Deployment with Kubernetes
+--------------------------
+
+.. warning:: FIXME Enter into details + add a small summary graph
+
+- swh-apps: Add new apps (new Dockerfile)
+- swh-apps: Build frozen requirements for a new release of a swh service
+- swh-apps: Build impacted docker images with that frozen set of requirements
+- Commit and tag
+- Push built docker image into our gitlab registry
+- swh-charts: Add/Update the image versions
+- Commit and push