diff --git a/docs/developer-setup.rst b/docs/developer-setup.rst --- a/docs/developer-setup.rst +++ b/docs/developer-setup.rst @@ -3,20 +3,35 @@ Developer setup =============== -In this guide, we will set up a dual environment: +In this guide we describe how to set up a developer environment in which you +can easily navigate the source code, make modifications, and write and run unit +tests. -- A virtual env in which all the |swh| packages will be installed in 'develop' - mode, this will allow you to navigate the source code, hack it, and run - locally the unit tests. +For this, we will use a `virtualenv`_ in which all the |swh| packages will be +installed in 'develop' mode. This will allow you to navigate the source code, +hack it, and run the unit tests locally. -- A docker 'cluster' built with docker-compose, which allows to easily run all - the components of the |swh| architecture. It is possible to run those docker - containers with your locally modified code for one or several |swh| packages. +If you want to test the effect of your modifications in a running |swh| +instance, you should check the `documentation`_ of the swh-docker-dev_ project. - Please read the `README file`_ in the swh-docker-dev repository for more - details on how to do this. +.. _`documentation`: https://forge.softwareheritage.org/source/swh-docker-dev/browse/master/README.md?as=remarkup +.. _`swh-docker-dev`: https://forge.softwareheritage.org/source/swh-docker-dev +.. _`virtualenv`: https://pypi.org/project/virtualenv/ + + +Install required dependencies +----------------------------- + +Software Heritage requires some dependencies that are usually packaged by your +package manager. 
On Debian/Ubuntu-based distributions:: + + sudo wget https://www.postgresql.org/media/keys/ACCC4CF8.asc -O /etc/apt/trusted.gpg.d/postgresql.asc + sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' + sudo apt update + sudo apt install python3 python3-venv libsvn-dev postgresql-11 \ + libsystemd-dev libpython3-dev graphviz postgresql-autodoc \ + postgresql-server-dev-all virtualenvwrapper git build-essential -.. _`README file`: https://forge.softwareheritage.org/source/swh-docker-dev/browse/master/README.md Checkout the source code ------------------------ @@ -28,19 +43,21 @@ ~$ cd swh-environment ~/swh-environment$ -Create a virtual env:: +Check out the source repositories of all the swh packages:: + + ~/swh-environment$ ./bin/update + +Create a virtualenv:: ~/swh-environment$ mkvirtualenv -p /usr/bin/python3 -a $PWD swh [...] (swh) ~/swh-environment$ - -.. Note: using virtualenvwrapper_ is not mandatory here. You can use plain +.. Note:: Using virtualenvwrapper_ is not mandatory here. You can use plain virtualenvs, or any other venv management tool (pipenv_ or poetry_ for example). Using a tool such as virtualenvwrapper_ just makes life easier... - .. _virtualenvwrapper: https://virtualenvwrapper.readthedocs.io/ .. _poetry: https://poetry.eustace.io/ .. _pipenv: https://pipenv.readthedocs.io/ @@ -53,110 +70,131 @@ [...] -Setup the docker environment ----------------------------- - -Install docker-compose:: - - (swh) ~/swh-environment$ pip install docker-compose - [...] 
- -Make your life easier:: - - (swh) ~/swh-environment$ cat >>$VIRTUAL_ENV/bin/postactivate < database=DirectoryBasedExampleDatabase('/home/ddouard/src/swh-environment/swh-loader-git/.hypothesis/examples') + rootdir: /home/ddouard/src/swh-environment/swh-loader-git, inifile: pytest.ini + plugins: requests-mock-1.5.2, postgresql-1.3.4, env-0.6.2, django-3.4.7, cov-2.6.0, pylama-7.6.5, hypothesis-3.76.0, celery-4.2.1 + collected 25 items + + swh/loader/git/tests/test_converters.py ........ [ 32%] + swh/loader/git/tests/test_from_disk.py ..... [ 52%] + swh/loader/git/tests/test_loader.py ...... [ 76%] + swh/loader/git/tests/test_tasks.py ... [ 88%] + swh/loader/git/tests/test_utils.py ... [100%] + ============================= warnings summary ============================= + [...] + ================== 25 passed, 12 warnings in 6.66 seconds ================== + +Running the same test, plus flake8 checks, using tox:: + + (swh) ~/swh-environment/swh-loader-git$ tox + GLOB sdist-make: ~/swh-environment/swh-loader-git/setup.py + flake8 create: ~/swh-environment/swh-loader-git/.tox/flake8 + flake8 installdeps: flake8 + flake8 installed: entrypoints==0.3,flake8==3.7.7,mccabe==0.6.1,pycodestyle==2.5.0,pyflakes==2.1.1,swh.loader.git==0.0.48.post3 + flake8 run-test-pre: PYTHONHASHSEED='2028963506' + flake8 runtests: commands[0] | ~/swh-environment/swh-loader-git/.tox/flake8/bin/python -m flake8 + py3 create: ~/swh-environment/swh-loader-git/.tox/py3 + py3 installdeps: .[testing], pytest-cov + py3 inst: ~/swh-environment/swh-loader-git/.tox/.tmp/package/1/swh.loader.git-0.0.48.post3.zip + py3 installed: 
aiohttp==3.5.4,amqp==2.4.2,arrow==0.13.1,async-timeout==3.0.1,atomicwrites==1.3.0,attrs==19.1.0,billiard==3.5.0.5,celery==4.2.1,certifi==2018.11.29,chardet==3.0.4,Click==7.0,coverage==4.5.2,decorator==4.3.2,dulwich==0.19.11,elasticsearch==6.3.1,Flask==1.0.2,idna==2.8,idna-ssl==1.1.0,itsdangerous==1.1.0,Jinja2==2.10,kombu==4.4.0,MarkupSafe==1.1.1,more-itertools==6.0.0,msgpack-python==0.5.6,multidict==4.5.2,pathlib2==2.3.3,pluggy==0.9.0,psutil==5.6.0,psycopg2==2.7.7,py==1.8.0,pytest==3.10.1,pytest-cov==2.6.1,python-dateutil==2.8.0,pytz==2018.9,PyYAML==3.13,requests==2.21.0,retrying==1.3.3,six==1.12.0,swh.core==0.0.55,swh.loader.core==0.0.39,swh.loader.git==0.0.48.post3,swh.model==0.0.30,swh.objstorage==0.0.30,swh.scheduler==0.0.49,swh.storage==0.0.129,systemd-python==234,typing-extensions==3.7.2,urllib3==1.24.1,vcversioner==2.16.0.0,vine==1.2.0,Werkzeug==0.14.1,yarl==1.3.0 + py3 run-test-pre: PYTHONHASHSEED='2028963506' + py3 runtests: commands[0] | pytest --cov=swh --cov-branch + =========================== test session starts ============================ + platform linux -- Python 3.5.3, pytest-3.10.1, py-1.8.0, pluggy-0.9.0 + rootdir: ~/swh-environment/swh-loader-git, inifile: pytest.ini + plugins: cov-2.6.1, celery-4.2.1 + collected 25 items + + swh/loader/git/tests/test_converters.py ........ [ 32%] + swh/loader/git/tests/test_from_disk.py ..... [ 52%] + swh/loader/git/tests/test_loader.py ...... [ 76%] + swh/loader/git/tests/test_tasks.py ... [ 88%] + swh/loader/git/tests/test_utils.py ... 
[100%] + + ----------- coverage: platform linux, python 3.5.3-final-0 ----------- + Name Stmts Miss Branch BrPart Cover + --------------------------------------------------------------------------- + swh/__init__.py 1 0 0 0 100% + swh/loader/__init__.py 1 0 0 0 100% + swh/loader/git/__init__.py 0 0 0 0 100% + swh/loader/git/converters.py 102 10 44 7 86% + swh/loader/git/from_disk.py 157 44 50 6 67% + swh/loader/git/loader.py 271 59 114 17 75% + swh/loader/git/tasks.py 14 0 0 0 100% + swh/loader/git/tests/__init__.py 1 0 0 0 100% + swh/loader/git/tests/conftest.py 4 0 0 0 100% + swh/loader/git/tests/test_converters.py 94 0 6 0 100% + swh/loader/git/tests/test_from_disk.py 100 4 0 0 96% + swh/loader/git/tests/test_loader.py 12 0 0 0 100% + swh/loader/git/tests/test_tasks.py 26 0 0 0 100% + swh/loader/git/tests/test_utils.py 14 0 2 0 100% + swh/loader/git/utils.py 25 8 8 1 61% + --------------------------------------------------------------------------- + TOTAL 822 125 224 31 80% + + + ============================= warnings summary ============================= + .tox/py3/lib/python3.5/site-packages/psycopg2/__init__.py:144 + ~/swh-environment/swh-loader-git/.tox/py3/lib/python3.5/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: . + """) + + -- Docs: https://docs.pytest.org/en/latest/warnings.html + ================== 25 passed, 1 warnings in 7.34 seconds =================== + _________________________________ summary __________________________________ + flake8: commands succeeded + py3: commands succeeded + congratulations :) + +Beware that some swh packages need a properly configured PostgreSQL server +to run their tests. In this case, wrap the call to pytest with pifpaf_, which +spawns a temporary PostgreSQL instance for the duration of the run. 
+For example, running pytest in the swh-core package:: + + (swh) ~/swh-environment$ cd swh-core + (swh) ~/swh-environment/swh-core$ pifpaf run postgresql -- pytest + =========================== test session starts ============================ + platform linux -- Python 3.5.3, pytest-3.8.2, py-1.6.0, pluggy-0.7.1 + hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/ddouard/src/swh-environment/swh-core/.hypothesis/examples') + rootdir: /home/ddouard/src/swh-environment/swh-core, inifile: pytest.ini + plugins: requests-mock-1.5.2, postgresql-1.3.4, env-0.6.2, django-3.4.7, cov-2.6.0, pylama-7.6.5, hypothesis-3.76.0, celery-4.2.1 + collected 79 items + + swh/core/tests/test_api.py .. [ 2%] + swh/core/tests/test_config.py .............. [ 20%] + swh/core/tests/test_db.py .... [ 25%] + swh/core/tests/test_logger.py . [ 26%] + swh/core/tests/test_serializers.py ..... [ 32%] + swh/core/tests/test_statsd.py ...................................... [ 81%] + ........ [ 91%] + swh/core/tests/test_utils.py ....... [100%] + + ======================== 79 passed in 6.59 seconds ========================= + + + + +.. _pytest: https://pytest.org +.. _tox: https://tox.readthedocs.io +.. _pypi: https://pypi.org +.. _swh-loader-git: https://forge.softwareheritage.org/source/swh-loader-git +.. _pifpaf: https://github.com/jd/pifpaf diff --git a/docs/getting-started.rst b/docs/getting-started.rst --- a/docs/getting-started.rst +++ b/docs/getting-started.rst @@ -29,7 +29,7 @@ - http://localhost:5080/ to navigate your (empty for now) SWH archive, - http://localhost:5080/rabbitmq to access the rabbitmq dashboard (guest/guest), -- http://localhost:5080/prometheus to explore the platform's metrics, +- http://localhost:5080/grafana to explore the platform's metrics, All the internal APIs are also exposed: @@ -53,15 +53,10 @@ If you want to hack the code of the Software Heritage Archive, a bit more work will be required. 
-The best way to have a development-friendly environment is to build a mixed -docker/virtual env setup. - -Such a setup is described in the :ref:`Perfect Developer Setup guide -`. +To be able to write patches, you will need a development setup. +The best way to have a development-friendly environment is to build a mixed +docker/virtualenv setup. -Installing from sources (without Docker) -++++++++++++++++++++++++++++++++++++++++ - -If you prefer to run everything straight, you should refer to the :ref:`Manual -Setup Guide ` +Such a setup is described in the +:ref:`Developer Setup Guide `. diff --git a/docs/manual-setup.rst b/docs/manual-setup.rst deleted file mode 100644 --- a/docs/manual-setup.rst +++ /dev/null @@ -1,227 +0,0 @@ -.. _manual-setup: - -Step 0 --- get the code ------------------------ - -The `swh-environment -`_ Git (meta) -repository orchestrates the Git repositories of all Software Heritage modules. -Clone it:: - - git clone https://forge.softwareheritage.org/source/swh-environment.git - -then recursively clone all Python module repositories. For this step you will -need the `mr `_ tool. Once you have installed -``mr``, just run:: - - cd swh-environment - bin/update - -.. IMPORTANT:: - - From now on this tutorial will assume that you **run commands listed below - from within the swh-environment** directory. - -For periodic repository updates just re-run ``bin/update``. - - -Step 1 --- install system dependencies --------------------------------------- - -You need to install three types of dependencies: some base packages, Node.js -modules (for the web app), and Postgres (as storage backend). - -Package dependencies -~~~~~~~~~~~~~~~~~~~~ - -Software Heritage requires some dependencies that are usually packaged by your -package manager. 
On Debian/Ubuntu-based distributions:: - - sudo apt-get install curl ca-certificates - curl https://deb.nodesource.com/setup_8.x | sudo bash - curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - - sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' - sudo apt update - sudo apt install python3 python3-venv libsvn-dev postgresql-10 nodejs \ - libsystemd-dev libpython3-dev dia postgresql-autodoc \ - postgresql-server-dev-all - -Postgres -~~~~~~~~ - -You need a running Postgres instance with administrator access (e.g., to create -databases). On Debian/Ubuntu based distributions, the previous step -(installation) should be enough. - -For other platforms and more details refer to the `PostgreSQL installation -documentation -`_. - -You also need to have access to a superuser account on the database. For that, -the easiest way is to create a PostgreSQL account that has the same name as -your username:: - - sudo -u postgres createuser --createdb --superuser $USER - -You can check that this worked by doing, from your user (you should not be -asked for a password):: - - psql postgres - -Node.js modules -~~~~~~~~~~~~~~~ - -If you want to run the web app to browser your local archive you will need some -Node.js modules, in particular to pack web resources into a single compact -file. To that end the following should suffice:: - - cd swh-web - npm install - cd - - -You are now good to go with all needed dependencies on your development -machine! - - -Step 2 --- install Python packages in a virtualenv --------------------------------------------------- - -From now on you will need to work in a `virtualenv -`_ containing the Python -environment with all the Software Heritage modules and their dependencies. 
To -that end you can do (once):: - - python3 -m venv .venv - -Then, activate the virtualenv (do this every time you start working on Software -Heritage):: - - source .venv/bin/activate - -You can now install Software Heritage Python modules, their dependencies and -the testing-related dependencies using:: - - pip install $( bin/pip-swh-packages --with-testing ) - - -Step 3 --- set up storage -------------------------- - -Then you will need a local storage service that will archive and serve source -code artifacts via a REST API. The Software Heritage storage layer comes in two -parts: a content-addressable :term:`object storage` on your file system (for file -contents) and a Postgres database (for the graph structure of the archive). See -the :ref:`data-model` for more information. The storage layer is configured via -a YAML configuration file, located at -``~/.config/swh/storage/storage.yml``. Create it with a content like: - -.. code-block:: yaml - - storage: - cls: local - args: - db: "dbname=softwareheritage-dev" - objstorage: - cls: pathslicing - args: - root: /srv/softwareheritage/objects/ - slicing: 0:2/2:4 - -Make sure that the :term:`object storage` root exists on the filesystem and is writable -to your user, e.g.:: - - sudo mkdir -p /srv/softwareheritage/objects - sudo chown "${USER}:" /srv/softwareheritage/objects - -You are done with :term:`object storage` setup! 
Let's setup the database:: - - swh-db-init storage -d softwareheritage-dev - -``softwareheritage-dev`` is the name of the DB that will be created, it should -match the ``db`` line in ``storage.yml`` - -To check that you can successfully connect to the DB (you should not be asked -for a password):: - - psql softwareheritage-dev - -You can now run the storage server like this:: - - python3 -m swh.storage.api.server --host localhost --port 5002 ~/.config/swh/storage/storage.yml - - -Step 4 --- ingest repositories ------------------------------- - -You are now ready to ingest your first repository into your local Software -Heritage. For the sake of example, we will ingest a few Git repositories. The -module in charge of ingesting Git repositories is the *Git loader*, Python -module ``swh.loader.git``. Its configuration file is at -``~/.config/swh/loader/git.yml``. Create it with a content like: - -.. code-block:: yaml - - storage: - cls: remote - args: - url: http://localhost:5002 - -It just informs the Git loader to use the storage server running on your -machine. The ``url`` line should match the command line used to run the storage -server. - -You can now ingest Git repository on the command line using the command:: - - python3 -m swh.loader.git.loader --origin-url GIT_CLONE_URL - -For instance, you can try ingesting the following repositories, in increasing -size order (note that the last two might take a few hours to complete and will -occupy several GB on both the Postgres DB and the object storage):: - - python3 -m swh.loader.git.loader --origin-url https://github.com/SoftwareHeritage/swh-storage.git - python3 -m swh.loader.git.loader --origin-url https://github.com/hylang/hy.git - python3 -m swh.loader.git.loader --origin-url https://github.com/ocaml/ocaml.git - - # WARNING: next repo is big - python3 -m swh.loader.git.loader --origin-url https://github.com/torvalds/linux.git - -Congratulations, you have just archived your first source code repositories! 
- -To re-archive the same repositories later on you can rerun the same commands: -only *new* objects added since the previous visit will be archived upon the -next one. - - -Step 5 --- browse the archive ------------------------------ - -You can now setup a local web app to browse what you have locally archived. The -web app uses the configuration file ``~/.config/swh/web/web.yml``. Create it -and fill it with something like: - -.. code-block:: yaml - - storage: - cls: remote - args: - url: http://localhost:5002 - -Nothing new here, the configuration just references the local storage server, -which have been used before for repository ingestion. - -You can now run the web app, and browse your local archive:: - - make run-django-webpack-devserver - xdg-open http://localhost:5004 - -Note that the ``make`` target will first compile a `webpack -`_ with various web assets and then launch the web app; -for webpack compilation you will need the Node.js dependencies discussed above. - -As an initial tour of the web app, try searching for one of the repositories -you have ingested (e.g., entering the ``hylang`` or ``ocaml`` keywords in the -search bar). Clicking on the repository name you will be brought back in time, -and you will be able to browse the source code and development history you have -archived. - -Enjoy!