When parsing pom files, we are only interested in extracting a VCS URL
(git, hg, svn) in order to create the associated loading tasks.
The groupId and artifactId are not used by the lister for that purpose,
so stop extracting them. This also prevents errors when that
information is missing from a pom file.
For instance, listing the JBoss maven repository currently fails with:
swh-lister_1 | [2022-04-29 09:02:04,598: INFO/ForkPoolWorker-1] Fetching URL https://repository.jboss.org/maven2/org/jboss/ejb3/jboss-ejb3-tutorial-enterprise_webapp/0.1.0/jboss-ejb3-tutorial-enterprise_webapp-0.1.0.pom with params {}
swh-lister_1 | [2022-04-29 09:02:04,748: ERROR/ForkPoolWorker-1] Task swh.lister.maven.tasks.FullMavenLister[45b54b16-ed7a-4b9c-80a3-b8adb25b8fe0] raised unexpected: KeyError('groupId')
swh-lister_1 | Traceback (most recent call last):
swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 451, in trace_task
swh-lister_1 | R = retval = fun(*args, **kwargs)
swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/scheduler/task.py", line 61, in __call__
swh-lister_1 | result = super().__call__(*args, **kwargs)
swh-lister_1 | File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 734, in __protected_call__
swh-lister_1 | return self.run(*args, **kwargs)
swh-lister_1 | File "/src/swh-lister/swh/lister/maven/tasks.py", line 16, in list_maven_full
swh-lister_1 | return lister.run().dict()
swh-lister_1 | File "/src/swh-lister/swh/lister/pattern.py", line 127, in run
swh-lister_1 | for page in self.get_pages():
swh-lister_1 | File "/src/swh-lister/swh/lister/maven/lister.py", line 256, in get_pages
swh-lister_1 | gid = project_d["groupId"]
swh-lister_1 | KeyError: 'groupId'
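
To illustrate the idea, here is a minimal sketch of reading only the SCM
connection from a pom file, so that a missing groupId or artifactId can no
longer raise a KeyError. It is not the lister's actual code: the helper names
(extract_scm, _child, _local_name) are hypothetical, and it assumes the
standard library's xml.etree.ElementTree and the usual Maven
"scm:<type>:<url>" connection format.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Assumed Maven SCM connection format: "scm:<type>:<url>" (":" or "|" delimiter).
SCM_RE = re.compile(r"^scm:(?P<type>[^:|]+)[:|](?P<url>.+)$")
SUPPORTED_TYPES = ("git", "hg", "svn")


def _local_name(tag: str) -> str:
    """Drop an optional "{namespace}" prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def _child(element: ET.Element, name: str) -> Optional[ET.Element]:
    """Return the first direct child with the given local name, or None."""
    for child in element:
        if _local_name(child.tag) == name:
            return child
    return None


def extract_scm(pom_text: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (vcs_type, url), or None if unavailable.

    groupId and artifactId are never read, so their absence is harmless.
    """
    root = ET.fromstring(pom_text)
    scm = _child(root, "scm")
    if scm is None:
        return None
    connection = _child(scm, "connection")
    if connection is None or not connection.text:
        return None
    match = SCM_RE.match(connection.text.strip())
    if match is None or match.group("type") not in SUPPORTED_TYPES:
        return None
    return match.group("type"), match.group("url")
```

With this approach, a pom file that has no <scm> element simply yields None
and can be skipped, instead of aborting the whole listing task.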