Page MenuHomeSoftware Heritage

Phabricator: Fix loading tasks creation
ClosedPublic

Authored by anlambert on Sep 13 2019, 12:06 AM.

Details

Summary

I thought having easily fix the issue reported in T1997 but while trying to implement a test
to validate the changes, I noticed there was a major issue in the current lister implementation.

Turns out except the first listed repository (whose index is retrieved through a bootstrap process)
all other listed ones were filtered out. Consequently, no entries in the listers database and no
scheduler loading tasks were created when listing a Phabricator instance.

I now better understand why the Phabricator instances we listed so far in production still do not
have any visit (see here for instance).

This critical issue is due to the max_index member overriding of the Phabricator lister
when it bootstraps to get the first repository index. Unfortunately, the tests were passing
as the conditions were still respected (but with only one repository ingested in listers database).

That diff fixes that nasty bug and also the issue observed in T1997 (less critical as it will
only appear when listing repositories when the max_bound parameter is provided to the run
method of the lister, the fact that we encountered it is due to the bug described above).
I also fix some typos to make my git pre-commit hook happy.

Test Plan

Tests implementation have been improved and new ones have been added to capture the observed issues.

Diff Detail

Repository
rDLS Listers
Branch
phabricator-lister-fix-missing-tasks
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 7794
Build 11206: tox-on-jenkinsJenkins
Build 11205: arc lint + arc unit

Event Timeline

anlambert edited the summary of this revision. (Show Details)

Update after more tests in the docker environment. While this changes clearly improve the effective creation of loading tasks for listed Phabricator repositories, there are still a couple that do not get loaded into the archive. For instance, when listing our forge, the following origins: https://forge.softwareheritage.org/source/puppet-environment.git and https://forge.softwareheritage.org/source/puppet-puppetlabs-puppetdb.git do not have any loading task executed. It seems they are deleted from the scheduler database by the [[ https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/core/indexing_lister.py$172-194 | disable_deleted_repo_tasks ]] method.

So planning changes on this as further investigation are needed to get a correct lister behavior.

ardumont edited the summary of this revision. (Show Details)

First, thanks for working on this ;)

... It seems they are deleted from the scheduler database by the disable_deleted_repo_tasks method.

If that's the case, it's because the loading tasks executed failed too many times. And thus are marked as disabled (probably by the scheduler-listener).
And thus are cleaned up.

See [1] for a potential reason.

So planning changes on this as further investigation are needed to get a correct lister behavior.

So, I'm not sure this is related to what you already fixed (T1997).
Thus, i think this could be merged (and open a new task with the current problem described there).
What do you think?


[1] I initially thought of a peculiar settings in the repositories (for example [2]) you mentioned.
I did not see anything strikingly wrong though (uris exist with at least read access...)

| URI                                                                      | I/O        | Display |
|--------------------------------------------------------------------------+------------+---------|
| https://forge.softwareheritage.org/source/puppet-environment.git         | Read Only  | Visible |
| https://forge.softwareheritage.org/source/puppet-puppetlabs-puppetdb.git | Read/Write | Visible |

Still, might be the clone uris selection policy [3] for that lister needs improvments (adding more checks if possibly for visibility/read-only access or something)?

[2] https://forge.softwareheritage.org/source/puppet-puppetlabs-puppetdb/manage/uris/

[3] https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/phabricator/lister.py$129-153

If that's the case, it's because the loading tasks executed failed too many times. And thus are marked as disabled (probably by the scheduler-listener).
And thus are cleaned up.

The issue comes in fact from the way the Phabricator diffusion API is listing repositories. Instead of listing repositories from a given index, the API returns repositories
whose indices are strictly greater than the index provided as parameter to the search endpoint.
Because of that behavior, the disable_deleted_repo_task method will filter out the last listed repository in each page returned by the Phabricator API
because the start index provided as parameter corresponds to the latest injected repository from the previous page returned by the API.

I also noticed that if you list two different Phabricator instances and the intersection between their repositories ids is not null, some tasks from the first listed forge
will be deleted when listing the second as the underlying database query does not filter on the instance column first.

So, I'm not sure this is related to what you already fixed (T1997).
Thus, i think this could be merged (and open a new task with the current problem described there).
What do you think?

Agree, I will create new tasks describing these issues (T1999, T2000) and one diff per task.

Update the diff to only address the issue described in T1999

This revision is now accepted and ready to land.Sep 16 2019, 1:24 PM

Update task reference in commit message

anlambert retitled this revision from Phabricator lister: Fix some critical issues to Phabricator lister: Fix loading tasks creation.Sep 16 2019, 1:39 PM
anlambert retitled this revision from Phabricator lister: Fix loading tasks creation to Phabricator: Fix loading tasks creation.Sep 16 2019, 8:29 PM
This revision was automatically updated to reflect the committed changes.