
cvsclient: Handle error in fetch_rlog when path does not exist
Closed, Public

Authored by anlambert on Oct 13 2022, 3:40 PM.

Details

Summary

When attempting to fetch the rlog for a path that does not exist in
the repository, the CVS server will respond with the following lines:

E cvs rlog: could not read RCS file for <path>
ok

fetch_rlog did not handle that error case; it now returns None when it
encounters it.
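The handling described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the function name read_rlog_response, the RLOG_MISSING constant, and the choice to return the collected output as bytes are all assumptions made for the example. It relies only on the CVS client/server protocol conventions that "M " lines carry normal output, "E " lines carry error text, and "ok" ends a successful response.

```python
from typing import Iterable, Optional

# Prefix of the error line quoted in the summary (the path follows it).
RLOG_MISSING = b"E cvs rlog: could not read RCS file for "


def read_rlog_response(lines: Iterable[bytes]) -> Optional[bytes]:
    """Collect rlog output from server response lines (hypothetical helper).

    Returns None when the server reports that the path has no RCS file,
    otherwise the concatenated text of the "M " lines.
    """
    output = []
    for raw in lines:
        line = raw.rstrip(b"\n")
        if line.startswith(RLOG_MISSING):
            # The path does not exist in the repository.
            return None
        if line == b"ok":
            # End of a successful response.
            break
        if line.startswith(b"M "):
            # Regular output line: strip the protocol prefix.
            output.append(line[2:])
    return b"\n".join(output)
```

With such a helper, a caller can treat a None result as "nothing to fetch for this path" and move on instead of failing.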

The issue was spotted when the loader attempted to fetch more rlog data from
Attic directories. The paths of these Attic directories are computed from
those of the files in the repository, but in some cases those directories
do not exist.
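To illustrate why a computed Attic path may not exist, here is a rough sketch of deriving candidate Attic directories from file paths. It is not the loader's actual code: the function name candidate_attic_dirs and the sample file name are hypothetical. It only assumes the CVS convention that removed files live in an Attic subdirectory next to the live RCS files.

```python
import posixpath
from typing import Iterable, Set


def candidate_attic_dirs(file_paths: Iterable[bytes]) -> Set[bytes]:
    """Return the Attic directory that would sit beside each file (hypothetical helper)."""
    dirs = set()
    for path in file_paths:
        parent = posixpath.dirname(path)
        if posixpath.basename(parent) == b"Attic":
            # The file is already stored in an Attic directory.
            dirs.add(parent)
        else:
            # Candidate only: the server may have no such directory.
            dirs.add(posixpath.join(parent, b"Attic"))
    return dirs
```

Each candidate is a guess built from a path string, so a directory that never had a removed file has no Attic on the server, and an rlog request for it gets the error response described in the summary.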

This fixes the loading of the following repository: ssh://anoncvs@anoncvs.NetBSD.org/cvsroot/htdocs.
The code below triggers the issue without this patch.

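from urllib.parse import urlparse
from swh.loader.cvs.cvsclient import CVSClient

# Create a client for the NetBSD htdocs repository (anonymous access over SSH).
client = CVSClient(urlparse("ssh://anoncvs@anoncvs.NetBSD.org/cvsroot/htdocs"))
# Request the rlog of an Attic path that is not present in the repository.
client.fetch_rlog(b"htdocs/docs/Hardware/Busses/Attic")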

Diff Detail

Repository
rDLDCVS CVS Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8675 (id=31331)

Rebasing onto 83fae9c7b3...

Current branch diff-target is up to date.
Changes applied before test
commit 6902b6321a27e3a6a9a54582104d9593fc1d3d97
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Oct 13 15:21:04 2022 +0200

    cvsclient: Handle error in fetch_rlog when path does not exist
    
    When attempting to fetch the rlog for a path that does not exist in
    the repository, the CVS server will respond with the following lines:
    
    E cvs rlog: could not read RCS file for <path>
    ok
    
    That error case was not handled in fetch_rlog so ensure it returns None
    when encountering it.
    
    The issue was spotted when the loader attempts to fetch more rlog data from
    Attic directories. The paths of these Attic directories are computed from
    those of the files in the repositories but it exist cases where those
    directories do not exist.

See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/126/ for more details.

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/loader/cvs/cvsclient.py, line 342
This revision is now accepted and ready to land. Oct 13 2022, 6:39 PM

Build is green

Patch application report for D8675 (id=31350)

Rebasing onto 83fae9c7b3...

Current branch diff-target is up to date.
Changes applied before test
commit 3d019fed7bf255d2b84a254c83339f405f32b05f
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Oct 13 15:21:04 2022 +0200

See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/128/ for more details.

Remove unused test archive

Build is green

Patch application report for D8675 (id=31351)

Rebasing onto 83fae9c7b3...

Current branch diff-target is up to date.
Changes applied before test
commit 356dfa27f71db105bb6b64e08814f97860aeee54
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Oct 13 15:21:04 2022 +0200

See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/129/ for more details.