Flaky test in swh-graph
Closed, Migrated, Edits Locked

Description

The following swh-graph test is flaky and fails intermittently; see the build history of the swh-environment Jenkins job.

python3 -m pytest  .
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/jenkins/workspace/DENV/tests/swh-graph, configfile: pytest.ini
plugins: asyncio-0.16.0, hypothesis-6.34.1, dash-2.0.0, mock-3.6.1, django-test-migrations-1.2.0, xdist-2.5.0, django-4.5.2, postgresql-2.6.1, flask-1.2.0, requests-mock-1.9.3, redis-2.3.0, forked-1.4.0, swh.core-1.0.1.dev1+g4ff374f, swh.journal-0.9.2.dev2+g4e5e009
collected 43 items

swh/graph/tests/test_api_client.py .........sF...s.........s....s        [ 69%]
swh/graph/tests/test_cli.py .                                            [ 72%]
swh/graph/tests/test_swhid.py ............                               [100%]

=================================== FAILURES ===================================
_____________________ test_random_walk_dst_is_type[remote] _____________________

graph_client = <RemoteGraphClient url=http://127.0.0.1:39123/graph/>

    def test_random_walk_dst_is_type(graph_client):
        """as the walk is random, we test a visit from a cnt node to the only
        origin in the dataset, and only check the final node of the path
        (i.e., the origin)
        """
        args = ("swh:1:cnt:0000000000000000000000000000000000000001", "ori")
        kwargs = {"direction": "backward"}
        expected_root = "swh:1:ori:0000000000000000000000000000000000000021"
    
        actual = list(graph_client.random_walk(*args, **kwargs))
        assert len(actual) > 1  # no origin directly links to a content
        assert actual[0] == args[0]
        assert actual[-1] == expected_root
    
        kwargs2 = kwargs.copy()
        kwargs2["limit"] = -1
        actual = list(graph_client.random_walk(*args, **kwargs2))
        assert actual == [expected_root]
    
        kwargs2["limit"] = -2
        actual = list(graph_client.random_walk(*args, **kwargs2))
        assert len(actual) == 2
        assert actual[-1] == expected_root
    
        kwargs2["limit"] = 3
        actual = list(graph_client.random_walk(*args, **kwargs2))
>       assert len(actual) == 3
E       AssertionError: assert 1 == 3
E        +  where 1 = len([''])

swh/graph/tests/test_api_client.py:281: AssertionError
=========================== short test summary info ============================
FAILED swh/graph/tests/test_api_client.py::test_random_walk_dst_is_type[remote]
=================== 1 failed, 38 passed, 4 skipped in 8.69s ====================
make: *** [../Makefile.python:20: test] Error 1

Event Timeline

anlambert triaged this task as Normal priority. Jan 4 2022, 1:54 PM
anlambert created this task.

I made a temporary fix in D6893. It doesn't solve the underlying issue, but it greatly decreases the probability of it happening. I'm not quite sure what a proper test for this endpoint would be, but this is at least enough to fix this issue in particular.

seirl changed the task status from Open to Work in Progress. Jan 7 2022, 4:37 PM
seirl moved this task from Backlog to In progress on the Compressed graph service board.
In T3831#76627, @seirl wrote:

I made a temporary fix in D6893. It doesn't solve the underlying issue, but it greatly decreases the probability of it happening. I'm not quite sure what a proper test for this endpoint would be, but this is at least enough to fix this issue in particular.

Wouldn't it be simpler to check that the length of the returned node list in the test is less than or equal to the value of the limit parameter?

No, we want to check that random_walk can reach its actual destination.
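
To illustrate the trade-off: the failing run above returned `['']`, and `len(['']) <= 3` holds, so a length-only check would pass even though the walk never reached the origin. Below is a minimal sketch of a test that still checks the destination while tolerating an unlucky draw. It assumes a toy graph and two hypothetical helpers (`toy_random_walk`, `walk_with_retries`); it does not use the swh-graph API and is not necessarily what D6893 does.

```python
import random

# Hypothetical toy graph (not the swh-graph test dataset): edges point
# backward, from a content node towards the origin. "dead" has no outgoing
# edge, so a walk that picks it never reaches the origin.
TOY_GRAPH = {
    "cnt": ["dir", "dead"],
    "dir": ["rev"],
    "rev": ["ori"],
    "dead": [],
    "ori": [],
}


def toy_random_walk(graph, src, dst, rng):
    """Walk randomly from src; return the path if dst is reached, else []."""
    path = [src]
    node = src
    while node != dst:
        successors = graph[node]
        if not successors:
            return []  # dead end: this walk failed
        node = rng.choice(successors)
        path.append(node)
    return path


def walk_with_retries(graph, src, dst, rng, attempts=20):
    """Retry the walk so that one unlucky draw does not fail the test."""
    for _ in range(attempts):
        path = toy_random_walk(graph, src, dst, rng)
        if path:
            return path
    return []


rng = random.Random(0)
path = walk_with_retries(TOY_GRAPH, "cnt", "ori", rng)
limit = 3
truncated = path[:limit]

# Check the destination on the full path and the limit on the truncated one.
assert path[0] == "cnt" and path[-1] == "ori"
assert len(truncated) <= limit
```

Retrying only lowers the failure probability (here to one half per attempt, raised to the number of attempts); it does not remove the randomness, which matches the "temporary fix" framing above.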