Unstuck swh.storage debian build
Closed, Migrated, Edits Locked

Description

The build currently flip-flops between a failing cassandra test and hanging journal tests.

With the cassandra tests deactivated, the build is consistently fine though.

Find the actual problem and fix the build.

Event Timeline

ardumont triaged this task as Normal priority. Feb 16 2021, 11:49 AM
ardumont created this task.

Trying to activate more logs fails, as the path computed in the fixture does not seem to be correct.
(well, I did not see any logs in the build runs...)

From within the chroot, trying to start cassandra fails:

JMX_PORT=1099
CASS_PATH=/tmp/pytest-of-tony/pytest-current/cassandra_conf0
CASS_BIN=/usr/sbin/cassandra
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
$CASS_BIN -Dcassandra.config=file://$CASS_PATH/cassandra.yaml \
          -Dcassandra.logdir=$CASS_PATH/cassandra_log \
          -Dcassandra.jmx.local.port=$JMX_PORT \
          -Dcassandra-foreground=yes

The first discrepancy noticed is that the free binary (from the procps package) is not found.

(sid-amd64-sbuild)root@yavin4:/build/swh-storage-Ov81G6$ which free
(sid-amd64-sbuild)root@yavin4:/build/swh-storage-Ov81G6#

This is required by the script which starts cassandra:

(sid-amd64-sbuild)root@yavin4:/build/swh-storage-Ov81G6# grep free /etc/cassandra/cassandra-env.sh
            system_memory_in_mb=`free -m | awk '/:/ {print $2;exit}'`

(This is a red herring in the end, see below)
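
A quick way to spot this kind of missing tool in a minimal chroot is to check the commands a script relies on before running it. A minimal sketch; the helper name missing_commands is made up for illustration and is not part of cassandra or swh.storage:

```shell
# Hypothetical helper: print every command from the argument list that is
# not found on PATH. Returns 0 when all are found, 1 otherwise.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Example (not run here): missing_commands free awk java
```

In the chroot above, this would presumably have listed free.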

The second discrepancy noticed is that both the openjdk 11 and 17 JREs are installed.

(sid-amd64-sbuild)root@yavin4:/build/swh-storage-Ov81G6# dpkg -l | grep jdk
ii  openjdk-11-jre:amd64             11.0.10+9-1                  amd64        OpenJDK Java runtime, using Hotspot JIT
ii  openjdk-11-jre-headless:amd64    11.0.10+9-1                  amd64        OpenJDK Java runtime, using Hotspot JIT (headless)
ii  openjdk-17-jre:amd64             17~9-1                       amd64        OpenJDK Java runtime, using Hotspot JIT
ii  openjdk-17-jre-headless:amd64    17~9-1                       amd64        OpenJDK Java runtime, using Hotspot JIT (headless)

Started with java 17, cassandra refuses to run. Forced to use java 11, it
starts.

So forcing JAVA_HOME to /usr/lib/jvm/java-11-openjdk-amd64 would possibly help.
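
To avoid typing the architecture-specific path by hand, the Java 11 home can be looked up under the JVM directory. A sketch only, assuming the Debian layout seen above (/usr/lib/jvm/java-11-openjdk-<arch>); find_java11_home is a hypothetical name:

```shell
# Hypothetical helper: print the first Java 11 home found under a JVM root
# directory (default: /usr/lib/jvm). Returns 1 when none is found.
find_java11_home() {
    jvm_root="${1:-/usr/lib/jvm}"
    for dir in "$jvm_root"/java-11-openjdk-*; do
        # When the glob matches nothing, $dir is the literal pattern and
        # the test below simply fails.
        if [ -x "$dir/bin/java" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Example (not run here):
#   JAVA_HOME=$(find_java11_home) && export JAVA_HOME
```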

Indeed, running pytest without any JAVA_HOME configured fails:

(sid-amd64-sbuild)tony@yavin4:/build/swh-storage-Ov81G6/swh-storage-0.23.0$ python3 -m pytest -x -s swh/storage/tests/test_cassandra.py
============================================================================================================= test session starts =============================================================================================================
platform linux -- Python 3.9.1+, pytest-6.0.2, py-1.10.0, pluggy-0.13.0
rootdir: /build/swh-storage-Ov81G6/swh-storage-0.23.0, configfile: pytest.ini
plugins: hypothesis-5.43.3, swh.core-0.11.0, swh.journal-0.7.1, postgresql-2.2.0, mock-1.10.4
collected 149 items

swh/storage/tests/test_cassandra.py E

=================================================================================================================== ERRORS ====================================================================================================================
______________________________________________________________________________________________ ERROR at setup of TestCassandraStorage.test_types ______________________________________________________________________________________________

tmpdir_factory = TempdirFactory(_tmppath_factory=TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x7fe2eb3dd9d0>, _basetemp=PosixPath('/tmp/pytest-of-tony/pytest-1')))
...
        if not running:
>           raise Exception("cassandra process stopped unexpectedly.")
E           Exception: cassandra process stopped unexpectedly.

swh/storage/tests/test_cassandra.py:139: Exception

Trying again from within the chroot, this time with JAVA_HOME set:

(sid-amd64-sbuild)tony@yavin4:/build/swh-storage-Ov81G6/swh-storage-0.23.0$ export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
(sid-amd64-sbuild)tony@yavin4:/build/swh-storage-Ov81G6/swh-storage-0.23.0$ python3 -m pytest -x -s swh/storage/tests/test_cassandra.py
============================================================================================================= test session starts =============================================================================================================
platform linux -- Python 3.9.1+, pytest-6.0.2, py-1.10.0, pluggy-0.13.0
rootdir: /build/swh-storage-Ov81G6/swh-storage-0.23.0, configfile: pytest.ini
plugins: hypothesis-5.43.3, swh.core-0.11.0, swh.journal-0.7.1, postgresql-2.2.0, mock-1.10.4
collected 149 items

swh/storage/tests/test_cassandra.py .........................

[1] SWH_CASSANDRA_LOG

ardumont changed the task status from Open to Work in Progress. Feb 16 2021, 12:09 PM
ardumont moved this task from Backlog to in-progress on the System administration board.

What changed between the last successful unstable build [1] and the new failing one [2]:
the jdk versions pulled for the build.

[1]

ok:

12:15:59  Selecting previously unselected package openjdk-8-jre-headless:amd64.
12:15:59  Preparing to unpack .../034-openjdk-8-jre-headless_8u275-b01-1_amd64.deb ...
12:15:59  Unpacking openjdk-8-jre-headless:amd64 (8u275-b01-1) ...
...
12:16:01  Selecting previously unselected package openjdk-11-jre-headless:amd64.
12:16:01  Preparing to unpack .../040-openjdk-11-jre-headless_11.0.10+9-1_amd64.deb ...
12:16:01  Unpacking openjdk-11-jre-headless:amd64 (11.0.10+9-1) ...

[2] https://jenkins.softwareheritage.org/job/debian/job/packages/job/DSTO/job/gbp-buildpackage/330/consoleFull

Failing:

11:04:04  Selecting previously unselected package openjdk-11-jre:amd64.
11:04:04  Preparing to unpack .../129-openjdk-11-jre_11.0.10+9-1_amd64.deb ...
11:04:04  Unpacking openjdk-11-jre:amd64 (11.0.10+9-1) ...
11:04:04  Selecting previously unselected package openjdk-17-jre:amd64.
11:04:04  Preparing to unpack .../130-openjdk-17-jre_17~9-1_amd64.deb ...
11:04:04  Unpacking openjdk-17-jre:amd64 (17~9-1) ...
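
The comparison above can be done mechanically by extracting the unpacked openjdk packages from each console log. A rough sketch, assuming log lines shaped like the excerpts above; unpacked_jdks and the log file names are made up:

```shell
# Hypothetical helper: read a build console log on stdin and print the
# openjdk packages it unpacked, one "name version" pair per line.
unpacked_jdks() {
    sed -n 's/.*Unpacking \(openjdk-[^: ]*\)[^ ]* (\([^)]*\)).*/\1 \2/p'
}

# Example (not run here), with hypothetical file names:
#   unpacked_jdks < ok-build.log      > ok-jdks.txt
#   unpacked_jdks < failing-build.log > failing-jdks.txt
#   diff ok-jdks.txt failing-jdks.txt
```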

Forcing JAVA_HOME to java 11 in debian/rules:

head -3 debian/rules
#!/usr/bin/make -f

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

The build is now ok locally (through gbp):

...
+------------------------------------------------------------------------------+
| Summary                                                                      |
+------------------------------------------------------------------------------+

Build Architecture: amd64
Build Type: full
Build-Space: 6800
Build-Time: 405
Distribution: unstable-swh
Host Architecture: amd64
Install-Time: 122
Job: /home/tony/debian/build-area/swh-storage_0.23.0-1~swh1.dsc
Machine Architecture: amd64
Package: swh-storage
Package-Time: 534
Source-Version: 0.23.0-1~swh1
Space: 6800
Status: successful
Version: 0.23.0-1~swh1
--------------------------------------------------------------------------------
Finished at 2021-02-16T11:14:25Z
Build needed 00:08:54, 6800k disk space

Another solution (to avoid hard-coding JAVA_HOME) is to invert the dependency order
currently defined in debian/rules.

That is, listing openjdk-11-jre before cassandra...

As inferred from the tryouts [1], the order impacts the build.

With the cassandra dependency listed first (or alone), openjdk-17-jre gets pulled in and,
once installed, becomes the default java used to start cassandra (which then fails).

[1] P951#6331

Another suggestion which sounds more standard, debian build wise:

13:05 <+olasd> I think you can just add a `Build-Conflicts: openjdk-17-jre-headless`
13:06 <+olasd> which should make the sbuild dependency resolver avoid it altogether
13:10 <+ardumont> ack, i'll try

Tryout in progress

It failed though, a java 16 got pulled in instead:

(sid-amd64-sbuild)root@yavin4:/build/swh-storage-WEfUH0# which java
/usr/bin/java
(sid-amd64-sbuild)root@yavin4:/build/swh-storage-WEfUH0# java -version
openjdk version "16-ea" 2021-03-16
OpenJDK Runtime Environment (build 16-ea+35-Debian-1)
OpenJDK 64-Bit Server VM (build 16-ea+35-Debian-1, mixed mode, sharing)
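
This check can be scripted by extracting the major version from the `java -version` output. A sketch; java_major_version is a hypothetical helper, and the accepted major version (11 here) is taken from the findings above:

```shell
# Hypothetical helper: read `java -version` output on stdin and print the
# major version (e.g. 16 for "16-ea", 11 for "11.0.10", 8 for "1.8.0_275").
java_major_version() {
    version=$(sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
    case "$version" in
        "")  return 1 ;;                     # no version line found
        1.*) version=${version#1.} ;;        # legacy scheme: 1.8.0 -> 8.0
    esac
    # Keep only the leading digits.
    echo "${version%%[!0-9]*}"
}

# Example (not run here); java prints its version on stderr:
#   [ "$(java -version 2>&1 | java_major_version)" = "11" ] || echo "unexpected java"
```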

For the previous tryout to work, we need to exclude all the unwanted JREs, not only the 17 one.

With debian/control:

Build-Conflicts: openjdk-17-jre-headless,
                 openjdk-16-jre-headless,
                 openjdk-15-jre-headless,

Build is fine:

+------------------------------------------------------------------------------+
| Summary                                                                      |
+------------------------------------------------------------------------------+

Build Architecture: amd64
Build Type: full
Build-Space: 6812
Build-Time: 423
Distribution: unstable-swh
Host Architecture: amd64
Install-Time: 89
Job: /home/tony/debian/build-area/swh-storage_0.23.0-1~swh1.dsc
Machine Architecture: amd64
Package: swh-storage
Package-Time: 517
Source-Version: 0.23.0-1~swh1
Space: 6812
Status: successful
Version: 0.23.0-1~swh1
--------------------------------------------------------------------------------
Finished at 2021-02-16T13:28:36Z
Build needed 00:08:37, 6812k disk space

Status: the unstable build is now ok.

Now the stable build is stuck on the storage tests, the ones using a journal.
Reproduced locally... but nowhere near having a chroot to debug in yet... waiting for the timeout...

[1] https://jenkins.softwareheritage.org/job/debian/job/packages/job/DSTO/job/gbp-buildpackage/333/consoleFull

D5085 for the "master" branch fix.

Package swh.storage 0.23.1 built both for stable and unstable.

ardumont claimed this task.
ardumont moved this task from in-progress to done on the System administration board.