diff --git a/docs/run_a_new_lister.rst b/docs/run_a_new_lister.rst
new file mode 100644
index 0000000..f2bbc8b
--- /dev/null
+++ b/docs/run_a_new_lister.rst
@@ -0,0 +1,90 @@
+
+:orphan:
+
+.. _run-lister-tutorial:
+
+Tutorial: run a lister within docker-dev in just a few steps
+=====================================================================
+
+It is good practice to run your new lister in docker-dev, which provides an almost
+production-like environment. Testing the lister in docker-dev prior to deployment
+reduces the chances of encountering errors when running it in production.
+Here are the steps you need to follow to run a lister within your local environment.
+
+
+1. Edit the docker-compose override file (``docker-compose.override.yml``),
+   following the sample provided::
+
+        version: '2'
+
+        services:
+          swh-lister:
+            volumes:
+              - "$SWH_ENVIRONMENT_HOME/swh-lister:/src/swh-lister"
+
+   The file named ``docker-compose.override.yml`` will automatically be loaded by
+   ``docker-compose``. Having an override makes it possible to run a docker container
+   with some swh packages installed from sources instead of using the latest
+   published packages from PyPI. For more details, you may refer to the README.md
+   present in ``swh-docker-dev``.
+2. Follow the instructions under the headings **Preparation steps** and
+   **Configuration file sample** in the README.md of swh-lister.
+3. Add the new ``task_modules`` and ``task_queues`` entries for your new lister
+   to the lister configuration. You need to amend the ``conf/lister.yml`` file to
+   add the entries. Here is an example for the GNU lister::
+
+    celery:
+      task_broker: amqp://guest:guest@amqp//
+      task_modules:
+        ...
+        - swh.lister.gnu.tasks
+      task_queues:
+        ...
+        - swh.lister.gnu.tasks.GNUListerTask
+
+4. Make sure the ``storage`` (port 5002) and ``scheduler`` (port 5008) services
+   are running locally. You may use the following command to start them in docker::
+
+    ~/swh-environment/swh-docker-dev$ docker-compose up -d
+
+5. Add the lister task type to the scheduler. For example, to add the
+   GNU lister task type::
+
+    ~/swh-environment$ swh scheduler task-type add list-gnu-full \
+        "swh.lister.gnu.tasks.GNUListerTask" "Full GNU lister" \
+        --default-interval '1 day' --backoff-factor 1
+
+   You can list all the registered task types with::
+
+    ~/swh-environment$ swh scheduler task-type list
+    Known task types:
+    list-bitbucket-incremental:
+      Incrementally list BitBucket
+    list-cran:
+      Full CRAN Lister
+    list-debian-distribution:
+      List a Debian distribution
+    list-github-full:
+      Full update of GitHub repos list
+    list-github-incremental:
+    ...
+
+   If your lister creates loading tasks of a type that is not yet registered,
+   you need to register that task type as well.
+
+6. Run your lister through the scheduler CLI by adding a task for it in
+   the scheduler. For example, execute this command
+   to run the GNU lister::
+
+     ~/swh-environment$ swh scheduler --url http://localhost:5008/ task add \
+      list-gnu-full --policy oneshot
+
+Once the lister has finished running, you can see the loading tasks it created::
+
+    ~/swh-environment/swh-lister$ swh scheduler task list
+
+You can also check the repositories found by the lister in the database
+where the lister output is stored. To connect to the database::
+
+    ~/swh-environment/swh-docker-dev$ docker-compose exec swh-lister bash -c \
+       'psql swh-listers'
diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index 6c656f2..8d91e86 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -1,425 +1,367 @@
 :orphan:
 
 .. _lister-tutorial:
 
 Tutorial: list the content of your favorite forge in just a few steps
 =====================================================================
 
 (the `original version
 <https://www.softwareheritage.org/2017/03/24/list-the-content-of-your-favorite-forge-in-just-a-few-steps/>`_
 of this article appeared on the Software Heritage blog)
 
 Back in November 2016, Nicolas Dandrimont wrote about structural code changes
 `leading to a massive (+15 million!) upswing in the number of repositories
 archived by Software Heritage
 <https://www.softwareheritage.org/2016/11/09/listing-47-million-repositories-refactoring-our-github-lister/>`_
 through a combination of automatic linkage between the listing and loading
 scheduler, new understanding of how to deal with extremely large repository
 hosts like `GitHub <https://github.com/>`_, and activating a new set of
 repositories that had previously been skipped over.
 
 In the post, Nicolas outlined the three major phases of work in Software
 Heritage's preservation process (listing, scheduling updates, loading) and
 highlighted that the ability to preserve the world's free software heritage
 depends on our ability to find and list the repositories.
 
 At the time, Software Heritage was only able to list projects on
 GitHub. Focusing early on GitHub, one of the largest and most active forges in
 the world, allowed for a big value-to-effort ratio and a rapid launch for the
 archive. As the old Italian proverb goes, "Il meglio è nemico del bene," or in
 modern English parlance, "Perfect is the enemy of good," right? Right. So the
 plan from the beginning was to implement a lister for GitHub, then maybe
 implement another one, and then take a few giant steps backward and squint our
 eyes.
 
 Why? Because source code hosting services don't behave according to a unified
 standard. Each new service requires dedicated development time to implement a
 new scraping client for the non-transferable requirements and intricacies of
 that service's API. At the time, doing it in an extensible and adaptable way
 required a level of exposure to the myriad differences between these services
 that we just didn't think we had yet.
 
 Nicolas' post closed by saying "We haven't carved out a stable API yet that
 allows you to just fill in the blanks, as we only have the GitHub lister
 currently, and a proven API will emerge organically only once we have some
 diversity."
 
 That has since changed. As of March 6, 2017, the Software Heritage **lister
 code has been aggressively restructured, abstracted, and commented** to make
 creating new listers significantly easier. There may yet be a few kinks to iron
 out, but **now making a new lister is practically like filling in the blanks**.
 
 Fundamentally, a basic lister must follow these steps:
 
 1. Issue a network request for a service endpoint.
 2. Convert the response into a canonical format.
 3. Populate a work queue for fetching and ingesting source repositories.
 
 Steps 1 and 3 are generic problems, so they can get generic solutions hidden
 away in the base code, most of which never needs to change. That leaves us to
 implement step 2, which can be trivially done now for services with clean web
 APIs.
 
 In the new code, we've tried to hide away as much generic functionality as
 possible, turning it into set-and-forget plumbing between a few simple
 customized elements. Different hosting services might use different network
 protocols, rate-limit messages, or pagination schemes, but, as long as there is
 some way to get a list of the hosted repositories, we think that the new base
 code will make getting those repositories much easier.
 
 First, let me give you the 30,000 foot view…
 
 The old GitHub-specific lister code looked like this (265 lines of Python):
 
 .. figure:: images/old_github_lister.png
 
 By contrast, the new GitHub-specific code looks like this (34 lines of Python):
 
 .. figure:: images/new_github_lister.png
 
 And the new BitBucket-specific code is even shorter and looks like this (24 lines of Python):
 
 .. figure:: images/new_bitbucket_lister.png
 
 And now this is common shared code in a few abstract base classes, with some new features and loads of docstring comments (in red):
 
 .. figure:: images/new_base.png
 
 So how does the lister code work now, and **how might a contributing developer
 go about making a new one**?
 
 The first thing to know is that we now have a generic lister base class and ORM
 model. A subclass of the lister base should already be able to do almost
 everything needed to complete a listing task for a single service
 request/response cycle with the following implementation requirements:
 
 1. A member variable must be declared called ``MODEL``, which is equal to a
    subclass (Note: type, not instance) of the base ORM model. The reason for
    using a subclass is mostly just because different services use different
    incompatible primary identifiers for their repositories. The model
    subclasses are typically only one or two additional variable declarations.
 
 2. A method called ``transport_request`` must be implemented, which takes the
    complete target identifier (e.g., a URL) and tries to request it one time
    using whatever transport protocol is required for interacting with the
    service. It should not attempt to retry on timeouts or do anything else with
    the response (that is already done for you). It should just either return
    the response or raise a ``FetchError`` exception.
 
 3. A method called ``transport_response_to_string`` must be implemented, which
    takes the entire response of the request in (2) and converts it to a string
    for logging purposes.
 
 4. A method called ``transport_quota_check`` must be implemented, which takes
    the entire response of the request in (2) and checks to see if the process
    has run afoul of any query quotas or rate limits. If the service says to
    wait before making more requests, the method should return ``True`` and also
    the number of seconds to wait, otherwise it returns ``False``.
 
 5. A method called ``transport_response_simplified`` must be implemented, which
    also takes the entire response of the request in (2) and converts it to a
    Python list of dicts (one dict for each repository) with keys given
    according to the aforementioned ``MODEL`` class members.
 
 Because 2, 3, and 4 are basically dependent only on the chosen network
 protocol, we also have an HTTP mix-in module, which supplements the lister base
 and provides default implementations for those methods along with optional
 request header injection using the Python Requests library. The
 ``transport_quota_check`` method as provided follows the IETF standard for
 communicating rate limits with `HTTP code 429
 <https://tools.ietf.org/html/rfc6585#section-4>`_ which some hosting services
 have chosen not to follow, so it's possible that a specific lister will need to
 override it.
 
 On top of all of that, we also provide another layer over the base lister class
 which adds support for sequentially looping over indices. What are indices?
 Well, some services (`BitBucket <https://bitbucket.org/>`_ and GitHub for
 example) don't send you the entire list of all of their repositories at once,
 because that server response would be unwieldy. Instead they paginate their
 results, and they also allow you to query their APIs like this:
 ``https://server_address.tld/query_type?start_listing_from_id=foo``. Changing
 the value of 'foo' lets you fetch a set of repositories starting from there. We
 call 'foo' an index, and we call a service that works this way an indexing
 service. GitHub uses the repository unique identifier and BitBucket uses the
 repository creation time, but a service can really use anything as long as the
 values monotonically increase with new repositories. A good indexing service
 also includes the URL of the next page with a later 'foo' in its responses. For
 these indexing services we provide another intermediate lister called the
 indexing lister. Instead of inheriting from :class:`SWHListerBase
 <swh.lister.core.lister_base.SWHListerBase>`, the lister class would inherit
 from :class:`SWHIndexingLister
 <swh.lister.core.indexing_lister.SWHIndexingLister>`. Along with the
 requirements of the lister base, the indexing lister base adds one extra
 requirement:
 
 1. A method called ``get_next_target_from_response`` must be defined, which
    takes a complete request response and returns the index ('foo' above) of the
    next page.
 
 So those are all the basic requirements. There are, of course, a few other
 little bits and pieces (covered for now in the code's docstring comments), but
 for the most part that's it. It sounds like a lot of information to absorb and
 implement, but remember that most of the implementation requirements mentioned
 above are already provided for 99% of services by the HTTP mix-in module. It
 looks much simpler when we look at the actual implementations of the two
 new-style indexing listers we currently have…
 
-An important aspect for making a new lister is its testing. To register the 
-celery tasks of your new lister, you need to add your lister in the main
-conftest.py (swh/lister/core/tests/conftest.py)
-
-After testing, it is suggested to run your new lister in docker as it provides
-good, almost-production like test. Here are the steps you need to follow to run
-a new lister in docker.
-
-1. You must write a docker-compose override file (`docker-compose.override.yml`).
-   An example is given in the `docker-compose.override.yml.example` file ::
-
-        version: '2'
-
-        services:
-        swh-lister:
-            volumes:
-            - "$SWH_ENVIRONMENT_HOME/swh-lister:/src/swh-lister"
-
-   The file named `docker-compose.override.yml` will automatically be loaded by
-   `docker-compose`. For more details, you may refer to README.md present in
-   swh-docker-dev.
-2. Follow the instruction mentioned under heading Preparation steps and 
-   Configuration file sample in README.md of swh-lister.  
-3. Make sure to run storage (5002) and scheduler (5008) services locally.
-   You can run them by the following command::
-
-    ~/swh-environment/swh-docker-dev$ docker-compose up -d swh-scheduler-api \
-      swh-storage
-4. Add the lister task-type in the scheduler.  For example, if you want to
-   add pypi lister task-type ::
-
-    ~/swh-environment$swh-scheduler task-type add list-pypi recurring \
-     "Full pypi lister"
-
-  You can check all the task-type by::
-
-    ~/swh-environment$swh scheduler task-type list
-    Known task types:
-    list-bitbucket-incremental:
-      Incrementally list BitBucket
-    list-cran:
-      Full CRAN Lister
-    list-debian-distribution:
-      List a Debian distribution
-    list-github-full:
-      Full update of GitHub repos list
-    list-github-incremental:
-    ...
-
-  If your lister is creating new loading task not yet registered, you need
-  to register that task type as well. Like for GNU lister::
-
-     ~/swh-environment$swh scheduler task-type add load-gnu-full recurring \
-      "GNU Loader"
-
-5. Run your lister with the help of scheduler cli.You need to add the task in 
-   the schedular using its cli. For example you need to execute this command
-   to run gnu lister ::
- 
-     ~/swh-environment$swh scheduler --url http://localhost:5008/ task add \
-      list-gnu-full --policy oneshot  
-
-After the execution of lister is complete you can see the loading task created.
-    ~/swh-environment/swh-lister$swh scheduler task list
+When developing a new lister, it's important to test it. To do so, add tests
+(see ``swh/lister/*/tests/`` for examples) and register the celery tasks in the main
+conftest.py (``swh/lister/core/tests/conftest.py``).
+
+Another important step is to actually run the lister within
+docker-dev (:ref:`run-lister-tutorial`).
 
 This is the entire source code for the BitBucket repository lister::
 
     # Copyright (C) 2017 the Software Heritage developers
     # License: GNU General Public License version 3 or later
     # See top-level LICENSE file for more information
 
     from urllib import parse
     from swh.lister.bitbucket.models import BitBucketModel
     from swh.lister.core.indexing_lister import SWHIndexingHttpLister
 
     class BitBucketLister(SWHIndexingHttpLister):
         PATH_TEMPLATE = '/repositories?after=%s'
         MODEL = BitBucketModel
 
         def get_model_from_repo(self, repo):
             return {'uid': repo['uuid'],
                     'indexable': repo['created_on'],
                     'name': repo['name'],
                     'full_name': repo['full_name'],
                     'html_url': repo['links']['html']['href'],
                     'origin_url': repo['links']['clone'][0]['href'],
                     'origin_type': repo['scm'],
                     'description': repo['description']}
 
         def get_next_target_from_response(self, response):
             body = response.json()
             if 'next' in body:
                 return parse.unquote(body['next'].split('after=')[1])
             else:
                 return None
 
         def transport_response_simplified(self, response):
             repos = response.json()['values']
             return [self.get_model_from_repo(repo) for repo in repos]
 
 And this is the entire source code for the GitHub repository lister::
 
     # Copyright (C) 2017 the Software Heritage developers
     # License: GNU General Public License version 3 or later
     # See top-level LICENSE file for more information
 
     import time
     from swh.lister.core.indexing_lister import SWHIndexingHttpLister
     from swh.lister.github.models import GitHubModel
 
     class GitHubLister(SWHIndexingHttpLister):
         PATH_TEMPLATE = '/repositories?since=%d'
         MODEL = GitHubModel
 
         def get_model_from_repo(self, repo):
             return {'uid': repo['id'],
                     'indexable': repo['id'],
                     'name': repo['name'],
                     'full_name': repo['full_name'],
                     'html_url': repo['html_url'],
                     'origin_url': repo['html_url'],
                     'origin_type': 'git',
                     'description': repo['description']}
 
         def get_next_target_from_response(self, response):
             if 'next' in response.links:
                 next_url = response.links['next']['url']
                 return int(next_url.split('since=')[1])
             else:
                 return None
 
         def transport_response_simplified(self, response):
             repos = response.json()
             return [self.get_model_from_repo(repo) for repo in repos]
 
         def request_headers(self):
             return {'Accept': 'application/vnd.github.v3+json'}
 
         def transport_quota_check(self, response):
             remain = int(response.headers['X-RateLimit-Remaining'])
             if response.status_code == 403 and remain == 0:
                 reset_at = int(response.headers['X-RateLimit-Reset'])
                 delay = min(reset_at - time.time(), 3600)
                 return True, delay
             else:
                 return False, 0
 
 We can see that there are some common elements:
 
 * Both use the HTTP transport mixin (:class:`SWHIndexingHttpLister
   <swh.lister.core.indexing_lister.SWHIndexingHttpLister>` just combines
   :class:`SWHListerHttpTransport
   <swh.lister.core.lister_transports.SWHListerHttpTransport>` and
   :class:`SWHIndexingLister
   <swh.lister.core.indexing_lister.SWHIndexingLister>`) to get most of the
   network request functionality for free.
 
 * Both also define ``MODEL`` and ``PATH_TEMPLATE`` variables. It should be
   clear to developers that ``PATH_TEMPLATE``, when combined with the base
   service URL (e.g., ``https://some_service.com``) and passed a value (the
   'foo' index described earlier) results in a complete identifier for making
   API requests to these services. It is required by our HTTP module.
 
 * Both services respond using JSON, so both implementations of
   ``transport_response_simplified`` are similar and quite short.
 
 We can also see that there are a few differences:
 
 * GitHub sends the next URL as part of the response header, while BitBucket
   sends it in the response body.
 
 * GitHub differentiates API versions with a request header (our HTTP
   transport mix-in will automatically use any headers provided by an
   optional request_headers method that we implement here), while
   BitBucket has it as part of their base service URL.  BitBucket uses
   the IETF standard HTTP 429 response code for their rate limit
   notifications (the HTTP transport mix-in automatically handles
   that), while GitHub uses their own custom response headers that need
   special treatment.
 
 * But look at them! 58 lines of Python code, combined, to absorb all
   repositories from two of the largest and most influential source code hosting
   services.
 
 Ok, so what is going on behind the scenes?
 
 To trace the operation of the code, let's start with a sample instantiation and
 progress from there to see which methods get called when. What follows will be
 a series of extremely reductionist pseudocode methods. This is not what the
 code actually looks like (it's not even real code), but it does have the same
 basic flow. Bear with me while I try to lay out lister operation in a
 quasi-linear way…::
 
     # main task
 
     ghl = GitHubLister(lister_name='github.com',
                        api_baseurl='https://github.com')
     ghl.run()
 
 ⇓ (SWHIndexingLister.run)::
 
     # SWHIndexingLister.run
 
     identifier = None
     do
         response, repos = SWHListerBase.ingest_data(identifier)
         identifier = GitHubLister.get_next_target_from_response(response)
     while(identifier)
 
 ⇓ (SWHListerBase.ingest_data)::
 
     # SWHListerBase.ingest_data
 
     response = SWHListerBase.safely_issue_request(identifier)
     repos = GitHubLister.transport_response_simplified(response)
     injected = SWHListerBase.inject_repo_data_into_db(repos)
     return response, injected
 
 ⇓ (SWHListerBase.safely_issue_request)::
 
     # SWHListerBase.safely_issue_request
 
     repeat:
         resp = SWHListerHttpTransport.transport_request(identifier)
         retry, delay = SWHListerHttpTransport.transport_quota_check(resp)
         if retry:
             sleep(delay)
     until((not retry) or too_many_retries)
     return resp
 
 ⇓ (SWHListerHttpTransport.transport_request)::
 
     # SWHListerHttpTransport.transport_request
 
     path = SWHListerBase.api_baseurl
          + SWHListerHttpTransport.PATH_TEMPLATE % identifier
     headers = SWHListerHttpTransport.request_headers()
     return http.get(path, headers)
 
 (Oh look, there's our ``PATH_TEMPLATE``)
 
 ⇓ (SWHListerHttpTransport.request_headers)::
 
     # SWHListerHttpTransport.request_headers
 
     override → GitHubLister.request_headers
 
 ↑↑ (SWHListerBase.safely_issue_request)
 
 ⇓ (SWHListerHttpTransport.transport_quota_check)::
 
     # SWHListerHttpTransport.transport_quota_check
 
     override → GitHubLister.transport_quota_check
 
 And then we're done. From start to finish, I hope this helps you understand how
 the few customized pieces fit into the new shared plumbing.
 
 Now you can go and write up a lister for a code hosting site we don't have yet!