diff --git a/talks-public/2020-swh-team-onboarding/2020-swh-team-onboarding.org b/talks-public/2020-swh-team-onboarding/2020-swh-team-onboarding.org
index 49fc004..046a2d6 100644
--- a/talks-public/2020-swh-team-onboarding/2020-swh-team-onboarding.org
+++ b/talks-public/2020-swh-team-onboarding/2020-swh-team-onboarding.org
@@ -1,235 +1,311 @@
#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
#+TITLE: Software Heritage
#+SUBTITLE: Welcome on Board!
#+BEAMER_HEADER: \date[2020-09-01, Paris]{1 September 2020\\Inria Paris}
#+AUTHOR: The Software Heritage team
#+DATE: 1 September 2020

#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1
#+INCLUDE: "../../common/modules/169.org"

# Syntax highlighting setup
#+LATEX_HEADER_EXTRA: \usepackage{minted}
#+LaTeX_HEADER_EXTRA: \usemintedstyle{tango}
#+LaTeX_HEADER_EXTRA: \newminted{sql}{fontsize=\scriptsize}
#+name: setup-minted
#+begin_src emacs-lisp :exports results :results silent
  (setq org-latex-listings 'minted)
  (setq org-latex-minted-options
        '(("fontsize" "\\scriptsize")))
  (setq org-latex-to-pdf-process
        '("pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
          "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
          "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"))
#+end_src
# End syntax highlighting setup

* Project overview
** Software Heritage in a nutshell \hfill [[https://softwareheritage.org][softwareheritage.org]]
#+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3
** An international, non-profit initiative\hfill built for the long term
:PROPERTIES:
:CUSTOM_ID: support
:END:
*** Sharing the vision :B_block:
:PROPERTIES:
:CUSTOM_ID: endorsement
:BEAMER_COL: .5
:BEAMER_env: block
:END:
#+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center}
#+LATEX: \vspace{-0.8cm}
#+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center}
#+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials}
*** Donors, members, sponsors :B_block:
:PROPERTIES:
:CUSTOM_ID: sponsors
:BEAMER_COL: .5
:BEAMER_env: block
:END:
#+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center}
#+LATEX: \begin{center}
#+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}}
#+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/sponsors}
#+LATEX: \end{center}
** Status :B_ignoreheading:
:PROPERTIES:
:BEAMER_env: ignoreheading
:END:
#+INCLUDE: "../../common/modules/status-extended.org::#archivinggoals" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#architecture" :minlevel 2 :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#merkletree" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#datamodel" :minlevel 2 :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#dagdetailsmall" :minlevel 2 :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#archive" :minlevel 2

* Software stack
** Overall architecture
*** It's just a (big) database
2 parts:
- the object storage: swh-objstorage
  - stores blob objects
  - content addressable
  - typically accessed via an HTTP RPC API
  - multiple backends supported (local FS, S3, Azure, Ceph, ...)
- the graph storage: swh-storage
  - stores the Merkle DAG (+ other things)
  - provides access to the data according to the data model declared in swh.model
  - typically accessed via an HTTP RPC API
  - multiple backends supported (PostgreSQL, Cassandra)
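** Overall architecture
*** Content addressing, concretely
"Content addressable" means an object's identifier is derived from its
bytes. As a rough illustration (a plain-Python sketch, not the actual
swh-objstorage API), one of the identifiers computed for every blob is its
=sha1_git=: the SHA1 of the content prefixed with a git-style blob header.
#+BEGIN_SRC python
import hashlib


def sha1_git(data: bytes) -> str:
    """Git-style blob identifier: SHA1 over 'blob <len>\\0' + content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# The same bytes always map to the same identifier, so storing a blob
# twice cannot create a second, diverging entry.
print(sha1_git(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
#+END_SRC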
** Overall architecture
*** It's an append-only database
- we (almost) never modify entries in the main database
- both storages (obj and graph) are expected to be idempotent
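** Overall architecture
*** Idempotency, concretely
A toy model of what append-only plus idempotent means for writes (an
illustrative sketch only, not the real swh-storage interface): re-adding
an existing object is a no-op, and nothing is ever overwritten or deleted.
#+BEGIN_SRC python
from typing import Dict


class ToyObjStorage:
    """In-memory stand-in for an append-only, idempotent object store."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def add(self, obj_id: str, data: bytes) -> None:
        # Idempotent: re-adding an existing object changes nothing, so
        # loaders can safely retry or re-ingest the same origin.
        self._objects.setdefault(obj_id, data)

    # No update or delete methods: the archive is append-only.


storage = ToyObjStorage()
storage.add("3b18e512", b"hello world\n")
storage.add("3b18e512", b"hello world\n")  # retry: no-op, same state
#+END_SRC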
** Overall architecture
*** With a bit of tooling
- A frontend interface: swh-web
  - Provides both the main GUI and a public (REST-like) API
- A scraping scaffolding
  - Listers: look for origins to ingest
  - Loaders: ingest origins
- An indexing machine
  - Crunches objects in the archive and generates metadata
- A metadata storage
  - Heavily under construction
  - Important distinction between intrinsic and extrinsic metadata
** Overall architecture
*** Using a bit of code
#+BEAMER: \vspace{1mm}
#+BEAMER: \centering \includegraphics[width=\extblockscale{1.4\linewidth}]{swh-modules-deps-internal}
Actually it's not so big:
- ~20k SLOC of Python 3
- ~80 Python dependencies
- a bunch of js
... keep it as simple as possible, but no simpler... (almost)
** The big picture
#+BEAMER: \vspace{1mm}
#+BEAMER: \centering \includegraphics[height=.8\textheight]{general-architecture}

[[https://docs.softwareheritage.org/devel/architecture.html][More details in our docs]]

#+INCLUDE: "../../common/modules/status-extended.org::#swstack" :minlevel 2

* Development workflow
** Starting points
*** Development documentation
https://docs.softwareheritage.org/devel/
- in particular, Developer setup:
  https://docs.softwareheritage.org/devel/developer-setup.html
- i.e.: virtualenv + pip + tox
*** "Software Development" pages on the public wiki
https://wiki.softwareheritage.org/wiki/Category:Software_development

(most of these will be covered in the following)
** Development forge
#+BEAMER: \vspace{-2mm}
*** Phabricator
https://forge.softwareheritage.org/
- all development activities happen here
- take the time to get familiar with and become efficient using Phabricator
#+BEAMER: \vspace{-2mm}
*** The classics
- VCS: Git, with repo browsing using Diffusion
  https://forge.softwareheritage.org/diffusion/
- Tasks and Bugs: Maniphest
  https://forge.softwareheritage.org/maniphest/
  - one project tag for each software product, e.g., Git Loader:
    https://forge.softwareheritage.org/project/view/17/
  - we use task priorities, assignees, and tags (not so much the per-product kanban boards)
  - visibility: all dev tasks are public (they can be made private by moving them to the "S2: Staff" space, but it has never happened)
  - you will need one task associated with each planned dev activity
** Development forge (cont.)
*** The classics (cont.)
- Code review: Differential
  https://forge.softwareheritage.org/differential/
  (more on this later)
- Communication
  - English
  - day-by-day: in the relevant task on the forge
  - async: swh-devel mailing list
    https://sympa.inria.fr/sympa/info/swh-devel
  - sync: IRC, #swh-devel channel on FreeNode
  - ad-hoc pokes (in person) or calls (remote), as needed
** Code reviews
Code review is not mandatory, but it is recommended in most cases
- guidelines: https://wiki.softwareheritage.org/wiki/Code_review
- technical setup: https://wiki.softwareheritage.org/wiki/Code_review_in_Phabricator
** QA: linting and testing
- most of the code we write is Python; all Python code is Python 3
- code formatting: fully automated via black
  https://black.readthedocs.io/
- code linting / static analysis
  - flake8 https://flake8.pycqa.org/ (usually trivial, thanks to black)
  - type checking via mypy http://mypy-lang.org/
    - WIP, type coverage varies depending on the module
    - rule of thumb: always type new code; opportunistically add typing to old code
- code testing
  - pytest
  - code coverage goal: >= 80% SLOCs of each module
- do try this +at home+ locally (e.g., before pushing a diff or commit)
#+BEGIN_EXAMPLE
$ tox
#+END_EXAMPLE
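** QA: house style in practice
A hypothetical snippet (not from any real SWH module) showing what new
code should look like: black-formatted, fully typed, with a pytest test
alongside.
#+BEGIN_SRC python
from typing import Iterable


def count_origins(urls: Iterable[str]) -> int:
    """Count distinct origin URLs, ignoring trailing slashes."""
    return len({url.rstrip("/") for url in urls})


def test_count_origins() -> None:
    urls = ["https://example.org/repo", "https://example.org/repo/"]
    assert count_origins(urls) == 1
#+END_SRC
A file like this passes black, flake8, mypy, and pytest as-is.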
** Continuous integration
- Jenkins: https://jenkins.softwareheritage.org/
- integrated with Phabricator
  1) CI runs on each submitted diff, reporting back results in the diff
  2) CI runs on each landed commit, notifying the author in case of failures
  3) CI runs daily on the entire software stack, notifying #swh-devel of failures
** Style guidelines
- Git style guide
  [[https://wiki.softwareheritage.org/wiki/Git_style_guide][=wiki.softwareheritage.org/wiki/Git_style_guide=]]
- Python style guide
  [[https://wiki.softwareheritage.org/wiki/Python_style_guide][=wiki.softwareheritage.org/wiki/Python_style_guide=]]

Check for adherence to these during code reviews!

* Infrastructure
-** to be written
-   TODO (=olasd= in charge of a first draft)
-
-** TODO Deployment of SWH Python modules
-   Zack: this can go either here or before under development workflow. From the
-   point of view of developers we need to say that =git tag= is enough; from
-   the point of view of the sysadm we can add more details about what happens
-   behind the scene.
+** Hardware/network architecture
+   - *on-premises resources*: servers and virtual machines in the Inria Rocquencourt datacenters
+   - *donated resources*: cloud credits on Azure, storage space on S3, VMs at the University of Bologna
+   - work in progress: Terraform definitions for cloud resources / on-premises virtual machines
+
+** Hardware/network (on premises)
+   - hosted by Inria in Rocquencourt
+   - 2 racks + 1 server in another room
+   - 2 x 10Gbps top-of-rack switches in each rack, with 40Gbps interconnect
+   - Proxmox-based virtualization cluster (4 hypervisors, built-in Ceph shared storage, ~40 VMs)
+   - dedicated servers for specific needs (PostgreSQL, Kafka, ElasticSearch)
+   - two dedicated servers with attached SAS enclosures for object storage (one in each server room)
+   - one full copy of the SWH archive
+   - most production services + staging environment
+
+** Hardware/network (Azure)
+   - everything in the West Europe / Amsterdam zone
+   - one full copy of the SWH archive:
+     - one copy of the raw object storage on Azure blob storage
+     - one copy of the main PostgreSQL database
+   - one fallback web interface to the archive
+   - deployment of the production vault service
+   - indexer workers
+   - some virtual machines for experiments or occasional one-off workloads
+
+** Hardware/network (Other sponsored resources)
+   - one copy of the object storage on Amazon S3 (west US zone)
+   - two virtual machines with block storage at the University of Bologna (vault service)
+
+** Network architecture
+   - all on-premises servers and VMs connected to a flat "internal" network
+   - separate VLAN for on-premises public IPv4s
+     - currently shared with other services hosted at Inria Rocquencourt
+     - migration to a dedicated SWH public VLAN pending
+   - separate VLAN for the "internal" staging area
+     - needs to be migrated/rearchitected to allow public access
+   - IPsec tunnel connecting our internal network with Azure VMs
+     - Azure has "magic" public IP routing for VMs with a public IP
+   - OpenVPN network connecting roaming clients (us) and other external resources
+   - most services have public access; the VPN is only needed for:
+     - staging infra access
+     - backend access to the production infra
+
+** Software architecture
+   - all machines running Debian (stable/buster)
+   - configuration management via [[https://forge.softwareheritage.org/source/puppet-environment/][Puppet]] (runs every 30 minutes)
+   - all Software Heritage Python modules published to PyPI (when tagged)
+   - all deployments (staging and prod) through Debian packages built from the tarballs published on PyPI
+   - most Debian package updates are done automatically (via ~unattended-upgrades~)
+     - some exceptions: SWH modules; PostgreSQL servers; ElasticSearch (pinned to a specific version)
+     - no automatic service restarts (yet, TODO)
+   - some specific stuff deployed via tarballs (Kafka)
+
+** Packaging of SWH Python modules
+   - publish a git tag on the module (ideally, annotated and GPG-signed with ~git tag -as vx.y.z~)
+   - Phabricator triggers a /tag received/ Jenkins job
+   - the /tag received/ Jenkins job runs a /build PyPI package/ job, if enabled
+     - this runs the tests on the tagged version
+     - then publishes an sdist and a wheel to PyPI
+     - then triggers the /update Debian packaging for new version/ job
+   - the /update Debian packaging for new version/ Jenkins job updates the ~debian/unstable-swh~ branch of the repo, and pushes a tag to it
+     - this incoming tag triggers a new /build Debian package/ Jenkins job
+     - in unstable, this builds the package, runs the tests, uploads it to our archive, then triggers a /backporting/ job for Debian stable
+     - the /backporting/ job updates the ~debian/buster-swh~ branch of the repo, and pushes a tag to it, which triggers the build of the stable package
+   - when a manual update is needed, do it in the ~debian/unstable-swh~ branch and push a new tag to it
+
+** Deployment of SWH Python modules
+   - once updated modules are available in our custom Debian package repository, Puppet takes the helm
+     - ~swh-web~ gets upgraded to the latest version automatically on frontend hosts
+     - other modules are upgraded manually on each server/VM depending on circumstances and service requirements
+     - clustershell / tmux-xpanes are our friends, for now
+   - why?
+     - poor/absent integration of Python module version constraints with the automatic Debian packaging
+     - coordinated deployment between backend databases, backend services, and service users
+   - dream future
+     - continuous deployment and upgrades on a publicly accessible staging infra
+     - tested upgrades on production snapshots (resource-intensive!)
+     - "automatic" coordinated production deployments for 95% of use cases
+
+** Production insights
+   - exception reports on our [[https://sentry.softwareheritage.org/][Sentry]] instance (invite needed)
+   - logging of all deployed services to ElasticSearch, with access through [[https://kibana0.internal.softwareheritage.org/][Kibana]] (VPN needed)
+   - metrics collected by Prometheus and published in [[https://grafana.softwareheritage.org][Grafana]] (public access)
+   - service health monitoring via [[https://icinga.softwareheritage.org/][Icinga2]] (guest/guest),
+     IRC bot on the ~#swh-sysadm~ channel
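+
+** Production insights: exposing metrics
+   A minimal sketch of how a Python service might expose metrics for
+   Prometheus to scrape, using the ~prometheus_client~ library; the metric
+   name and the port are invented for illustration.
+#+BEGIN_SRC python
+import time
+
+# Assumption: the prometheus_client library is installed.
+from prometheus_client import Counter, start_http_server
+
+# Hypothetical metric name, for illustration only.
+LOADED = Counter("swh_demo_origins_loaded_total", "Origins loaded (demo)")
+
+if __name__ == "__main__":
+    start_http_server(8000)  # serves http://localhost:8000/metrics
+    while True:
+        LOADED.inc()  # count one (pretend) origin load per minute
+        time.sleep(60)
+#+END_SRC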
* AdministraInriatrivia
** Onboarding procedure
*** https://intranet.softwareheritage.org/wiki/Onboarding
***
    - let's walk through it together...
    - a team member will be assigned as your mentor to complete it
** Team charter
*** https://intranet.softwareheritage.org/wiki/Team_charter
*** Highlights
    - weekly meeting(s)
    - public weekly reporting
    - persistent IRC connection
** TODO to be completed
What else should be added to this section?

* Appendix :B_appendix:
:PROPERTIES:
:BEAMER_env: appendix
:END: