When parsing POM files, we are only interested in extracting a VCS URL
(git, hg, svn) in order to create the associated loading tasks.
The groupId and artifactId are not used by the lister for that purpose,
so their extraction is removed. This also prevents errors when that
information is missing from POM files.
See, for instance, this error when listing the JBoss Maven repository:
swh-lister_1 | [2022-04-29 09:02:04,598: INFO/ForkPoolWorker-1] Fetching URL https://repository.jboss.org/maven2/org/jboss/ejb3/jboss-ejb3-tutorial-enterprise_webapp/0.1.0/jboss-ejb3-tutorial-enterprise_webapp-0.1.0.pom with params {}
swh-lister_1 | [2022-04-29 09:02:04,748: ERROR/ForkPoolWorker-1] Task swh.lister.maven.tasks.FullMavenLister[45b54b16-ed7a-4b9c-80a3-b8adb25b8fe0] raised unexpected: KeyError('groupId')
swh-lister_1 | Traceback (most recent call last):
swh-lister_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 451, in trace_task
swh-lister_1 |     R = retval = fun(*args, **kwargs)
swh-lister_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/scheduler/task.py", line 61, in __call__
swh-lister_1 |     result = super().__call__(*args, **kwargs)
swh-lister_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 734, in __protected_call__
swh-lister_1 |     return self.run(*args, **kwargs)
swh-lister_1 |   File "/src/swh-lister/swh/lister/maven/tasks.py", line 16, in list_maven_full
swh-lister_1 |     return lister.run().dict()
swh-lister_1 |   File "/src/swh-lister/swh/lister/pattern.py", line 127, in run
swh-lister_1 |     for page in self.get_pages():
swh-lister_1 |   File "/src/swh-lister/swh/lister/maven/lister.py", line 256, in get_pages
swh-lister_1 |     gid = project_d["groupId"]
swh-lister_1 | KeyError: 'groupId'
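
For illustration only, here is a minimal sketch of extracting a VCS URL from a POM without touching groupId or artifactId. It is not the lister's actual code: the function name extract_scm, the use of the standard library's ElementTree, and the choice of reading the scm/connection element in "scm:<type>:<url>" form are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Namespace commonly declared by POM files; some older POMs omit it.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Assumed connection format: "scm:<type>:<url>", restricted to git, hg, svn.
SCM_RE = re.compile(r"^scm:(git|hg|svn):(.+)$")


def extract_scm(pom_text: str) -> Optional[Tuple[str, str]]:
    """Return (vcs_type, url) from a POM, or None if no usable SCM entry.

    Hypothetical helper: it never reads groupId or artifactId, so a POM
    lacking those elements cannot raise a KeyError here.
    """
    root = ET.fromstring(pom_text)
    # Try the namespaced path first, then fall back to a namespace-less POM.
    node = root.find("m:scm/m:connection", POM_NS)
    if node is None:
        node = root.find("scm/connection")
    if node is None or not node.text:
        return None
    match = SCM_RE.match(node.text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

With this shape, a POM that has no groupId but does declare an SCM connection still yields a result, and a POM without any SCM entry is simply skipped by returning None.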