
cvsclient: Fix pserver error: "protocol error: <path> is not absolute"
Closed · Public

Authored by anlambert on Jun 17 2022, 5:02 PM.

Details

Summary

Some CVS servers (SourceForge and OSDN for instance) return an error if
the path sent with the "Directory" pserver request is not absolute.

Fix that issue so that such CVS repositories can be loaded.
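
As a rough illustration of the request involved (a sketch, not the code from this diff): in the CVS client/server protocol, a "Directory" request carries the local directory on one line and the repository directory on the next. The helper name `directory_request` and its parameters are hypothetical. The point is only that the second line is anchored at the CVS root so that it is always absolute.

```python
import posixpath


def directory_request(cvsroot: str, repo_dir: str, local_dir: str = ".") -> bytes:
    """Build a pserver "Directory" request with an absolute repository path.

    Hypothetical helper for illustration; names are not taken from the loader.
    """
    if not repo_dir.startswith("/"):
        # Relative paths such as "help" are rejected by some servers,
        # so anchor them at the CVS root.
        repo_dir = posixpath.join(cvsroot, repo_dir)
    # Protocol layout: "Directory <local-dir>\n<repository-dir>\n"
    return f"Directory {local_dir}\n{repo_dir}\n".encode("ascii")
```

With the SourceForge origin shown in the logs on this page, `directory_request("/cvsroot/yazoo", "help")` would send `/cvsroot/yazoo/help` rather than the bare `help` that the server rejects.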

Before this change:

(swh) anlambert@carnavalet:/tmp/cvs_test$ swh loader -C ~/.config/swh/loader/cvs.yml run cvs pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help
INFO:swh.loader.cvs.loader.CvsLoader:Load origin 'pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help' with type 'cvs'
ERROR:swh.loader.cvs.loader.CvsLoader:Exception in fetch_data:
Traceback (most recent call last):
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 555, in fetch_data
    data = next(self.swh_revision_gen)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 278, in process_cvs_changesets
    self.checkout_file_with_cvsclient(k, f, self.cvsclient)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 232, in checkout_file_with_cvsclient
    fp = cvsclient.checkout(path, f.rev, dirname, expand_keywords=True)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/cvsclient.py", line 387, in checkout
    raise CVSProtocolError("Error from CVS server: %s" % response)
swh.loader.cvs.cvsclient.CVSProtocolError: Error from CVS server: b'E protocol error: help is not absolute\n'
{'status': 'failed'} for origin 'pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help'


(swh) anlambert@carnavalet:/tmp/cvs_test$ swh loader -C ~/.config/swh/loader/cvs.yml run cvs pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki
INFO:swh.loader.cvs.loader.CvsLoader:Load origin 'pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki' with type 'cvs'
ERROR:swh.loader.cvs.loader.CvsLoader:Exception in fetch_data:
Traceback (most recent call last):
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 555, in fetch_data
    data = next(self.swh_revision_gen)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 278, in process_cvs_changesets
    self.checkout_file_with_cvsclient(k, f, self.cvsclient)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/loader.py", line 232, in checkout_file_with_cvsclient
    fp = cvsclient.checkout(path, f.rev, dirname, expand_keywords=True)
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/swh/loader/cvs/cvsclient.py", line 387, in checkout
    raise CVSProtocolError("Error from CVS server: %s" % response)
swh.loader.cvs.cvsclient.CVSProtocolError: Error from CVS server: b'E protocol error: wiki is not absolute\n'
{'status': 'failed'} for origin 'pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki'

After this change:

(swh) anlambert@carnavalet:/tmp/cvs_test$ swh loader -C ~/.config/swh/loader/cvs.yml run cvs pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help
INFO:swh.loader.cvs.loader.CvsLoader:Load origin 'pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help' with type 'cvs'

{'status': 'eventful'} for origin 'pserver://anonymous@a.cvs.sourceforge.net/cvsroot/yazoo/help'


(swh) anlambert@carnavalet:/tmp/cvs_test$ swh loader -C ~/.config/swh/loader/cvs.yml run cvs pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki
INFO:swh.loader.cvs.loader.CvsLoader:Load origin 'pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki' with type 'cvs'

{'status': 'eventful'} for origin 'pserver://anonymous@cvs.osdn.net/cvsroot/phpgwjp/wiki'

Test Plan

This looks related to the CVS server version used; I do not have an easy way to reproduce the issue in a test.

Diff Detail

Repository
rDLDCVS CVS Loader
Branch
not-absolute-path-fix
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 29928
Build 46786: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 46785: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8005 (id=28841)

Could not rebase; Attempt merge onto d52686be91...

Updating d52686b..0380b3b
Fast-forward
 swh/loader/cvs/cvsclient.py         | 20 +++++++++++++++-----
 swh/loader/cvs/tests/test_loader.py | 23 ++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 6 deletions(-)
Changes applied before test
commit 0380b3b47681a76d774e411cac7b4403b1fb7643
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jun 17 16:55:38 2022 +0200

    cvsclient: Fix pserver error: "protocol error: <path> is not absolute"
    
    Some CVS servers (SourceForge and OSDN for instance) return an error if
    the path sent with the "Directory" pserver request is not absolute.
    
    So fix that issue to ensure loading of such CVS repositories.

commit eeb3cee1e1ca41078622f1fab168622876edbc9f
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jun 17 16:36:34 2022 +0200

    cvsclient: Allow to connect to a pserver URL without password
    
    The CVS client was raising an error when trying to connect to such pserver
    URL: pserver://anonymous@cvs.example.org/cvsroot/project/module
    
    But numerous CVS pserver URLs that can be found in the wild (notably on
    SourceForge and OSDN) are in that form.
    
    So add support for such URL form in the CVS client.

See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/113/ for more details.

Build is green

Patch application report for D8005 (id=28843)

Could not rebase; Attempt merge onto d52686be91...

Updating d52686b..089e2fd
Fast-forward
 mypy.ini                            |  3 ---
 swh/loader/cvs/cvsclient.py         | 50 +++++++++++++++++++------------------
 swh/loader/cvs/loader.py            |  9 ++++---
 swh/loader/cvs/tests/test_loader.py | 23 ++++++++++++++++-
 4 files changed, 53 insertions(+), 32 deletions(-)
Changes applied before test
commit 089e2fd045232caa46e2e2d26f9ff6ba19f9d8d8
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jun 17 16:55:38 2022 +0200

    cvsclient: Fix pserver error: "protocol error: <path> is not absolute"
    
    Some CVS servers (SourceForge and OSDN for instance) return an error if
    the path sent with the "Directory" pserver request is not absolute.
    
    So fix that issue to ensure loading of such CVS repositories.

commit e382aeb0526618ef5a7345551166ec46137fae93
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Jun 17 16:36:34 2022 +0200

    cvsclient: Allow to connect to a pserver URL without password
    
    The CVS client was raising an error when trying to connect to such pserver
    URL: pserver://anonymous@cvs.example.org/cvsroot/project/module
    
    But numerous CVS pserver URLs that can be found in the wild (notably on
    SourceForge and OSDN) are in that form.
    
    So add support for such URL form in the CVS client.
    
    Also remove use of external dependency urllib3.util.parse_url and prefer
    to use urllib.parse.urlparse from standard Python library.

See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/115/ for more details.
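
For readers unfamiliar with the URL form discussed in the second commit, here is a minimal sketch of parsing a password-less pserver URL with `urllib.parse.urlparse`. It is not the code from this diff: the function name `parse_pserver_url`, the returned dictionary layout, and the fallback to port 2401 (the conventional pserver port) are assumptions made for illustration.

```python
from urllib.parse import urlparse

DEFAULT_PSERVER_PORT = 2401  # conventional CVS pserver port (assumed default)


def parse_pserver_url(url: str) -> dict:
    """Split a pserver URL into connection parameters (illustrative only)."""
    parsed = urlparse(url)
    if parsed.scheme != "pserver":
        raise ValueError(f"not a pserver URL: {url}")
    return {
        "host": parsed.hostname,
        "port": parsed.port or DEFAULT_PSERVER_PORT,
        "user": parsed.username,
        # urlparse yields None when the URL carries no password;
        # treat that as an empty password instead of an error.
        "password": parsed.password or "",
        "path": parsed.path,
    }
```

Called on `pserver://anonymous@cvs.example.org/cvsroot/project/module`, this returns an empty password and the path `/cvsroot/project/module` instead of failing.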

Did you check this does not cause issues with other servers?

Yes, @stsp tested the pserver implementation of the CVS loader with repositories hosted on GNU Savannah and the loading of those still works as expected.

10:47 $ swh loader -C ~/.config/swh/loader/cvs.yml run cvs pserver://anonymous@cvs.savannah.nongnu.org/sources/cppsh/cppsh
INFO:swh.loader.cvs.loader.CvsLoader:Load origin 'pserver://anonymous@cvs.savannah.nongnu.org/sources/cppsh/cppsh' with type 'cvs'
Request: BEGIN AUTH REQUEST
/sources/cppsh
anonymous
A
END AUTH REQUEST


{'status': 'eventful'} for origin 'pserver://anonymous@cvs.savannah.nongnu.org/sources/cppsh/cppsh'
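
The authentication request printed in the log above follows the pserver handshake layout: the CVS root, the user name, and the scrambled password, framed by BEGIN/END markers. A minimal sketch of assembling it is below, assuming an empty password, whose scrambled form is the bare `A` marker seen in the log. The helper name `build_auth_request` is hypothetical and the real password scrambling table is deliberately left out.

```python
def build_auth_request(cvsroot: str, user: str, scrambled_password: str = "A") -> bytes:
    """Assemble a pserver authentication request (illustrative sketch).

    The default "A" corresponds to an empty password; scrambling a
    non-empty password is out of scope here.
    """
    lines = [
        "BEGIN AUTH REQUEST",
        cvsroot,             # absolute path of the CVS root on the server
        user,
        scrambled_password,
        "END AUTH REQUEST",
    ]
    # Each field is terminated by a newline, including the last one.
    return ("\n".join(lines) + "\n").encode("ascii")
```

For the Savannah origin above, `build_auth_request("/sources/cppsh", "anonymous")` produces the same five lines as the logged request.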

This revision is now accepted and ready to land.
Jun 20 2022, 3:08 PM