This is the implementation of the GNU lister. The lister downloads the tree.json.gz file from https://ftp.gnu.org/tree.json.gz, reads its JSON content, and returns the origins of the repositories by parsing the JSON data.
Related T1722
Differential D1482
GNU Lister
Authored by nahimilega on May 17 2019, 12:30 PM.
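A minimal sketch of the flow described in the summary, under stated assumptions: the decompressed file is assumed to follow the JSON layout produced by `tree -J` (nested `directory`/`file` entries with `name` and `contents` keys), and projects are assumed to be the directories directly under `gnu` and `old-gnu`, which matches the task arguments quoted later in this review. The function name `list_projects` and the sample data are hypothetical; this is not the code from the diff.

```python
import gzip
import json

BASE_URL = "https://ftp.gnu.org"  # host taken from the summary


def list_projects(raw_gz, roots=("gnu", "old-gnu")):
    """Return (name, url) pairs for project directories in a gzipped tree dump.

    Assumption: the JSON is a list whose first element is the root directory,
    each entry being a dict with "type", "name" and (for directories) "contents".
    """
    tree = json.loads(gzip.decompress(raw_gz).decode("utf-8"))
    projects = []
    for top in tree[0].get("contents", []):
        # Only descend into the assumed top-level roots.
        if top.get("type") != "directory" or top.get("name") not in roots:
            continue
        for entry in top.get("contents", []):
            if entry.get("type") == "directory":
                url = "%s/%s/%s/" % (BASE_URL, top["name"], entry["name"])
                projects.append((entry["name"], url))
    return projects


# Hypothetical, heavily reduced sample standing in for the real download.
sample = [
    {"type": "directory", "name": ".", "contents": [
        {"type": "directory", "name": "gnu", "contents": [
            {"type": "directory", "name": "apl", "contents": []},
            {"type": "file", "name": "README"},
        ]},
        {"type": "directory", "name": "old-gnu", "contents": [
            {"type": "directory", "name": "libiconv", "contents": []},
        ]},
        {"type": "directory", "name": "video", "contents": []},
    ]},
]
raw = gzip.compress(json.dumps(sample).encode("utf-8"))
projects = list_projects(raw)
```

Working on bytes already in memory keeps the parsing step separate from the download, so it can be exercised without network access.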
Event Timeline
If that's blocking, do not hesitate to ask questions on IRC about it. Others can help with that.
Build is green
For the docker setup, P414 should be enough to run the GNU lister (provided your repository is on the branch with the GNU lister code). Amend the conf/lister.yml file to add the entries:
    celery:
      task_broker: amqp://guest:guest@amqp//
      task_modules:
        ...
        - swh.lister.gnu.tasks
      task_queues:
        ...
        - swh.lister.gnu.tasks.GNUListerTask
Thanks @ardumont. This part is not present in any documentation. I guess we can add a section on how to run a lister in the docker environment under the lister tutorial section.
It's not. Indeed, adding a tutorial on how to add a new lister is a good idea. I'd:
Maybe you could update D1441 with such information, what do you think?
It would be a good idea.
To be clear, I'm fine with the diff now. Cheers,
Thanks @ardumont. As you mentioned, I followed those steps to run it in docker, but something went wrong with the docker container and I had to reinstall docker entirely on my PC, so I have not been able to test this lister yet. I think I will fix the docker issues on my PC by the end of the day, and then I can try to run this.
Build is green
Sure thing. For the problem mentioned on IRC, I replied to you there. Here is my take on this:
    20:05 <+ardumont> in general, don't only rely on documentation as this can go out of sync
    20:05 <+ardumont> take a look also at the code
    20:05 <+ardumont> for the docker-dev, that'd the docker-compose file
    20:05 <+ardumont> archit_agrawal[m: ^
    20:10 <archit_agrawal[m> ardumont: I will surely take a look at docker-compose file
    20:12 <archit_agrawal[m> ardumont: as you told, I amended conf/lister.yml with gnu lister, now how shall I proceed further to sucessfully run the lister
    20:28 <archit_agrawal[m> ardumont: Do I have to run the way mentioned in readme of swh-lister ?
    21:45 <archit_agrawal[m> I am trying to run gnu lister in docker . I am getting ModuleNotFoundError: No module named 'psycopg2.errors' error, can anyone please help me.
    21:45 <archit_agrawal[m> https://forge.softwareheritage.org/P419
    21:45 -- Notice(swhbot): P419 (author: nahimilega): request 400 from scheduler <https://forge.softwareheritage.org/P419>
    22:18 <kalpitk[m]> I think 'pip install psycopg2' inside virtual env will be enough
    22:34 <archit_agrawal[m> kalpitk: It is already installed in virtual env
    22:36 <+pinkieval> archit_agrawal[m: is the scheduler running in the venv?
    22:38 <archit_agrawal[m> pinkieval: yes
    22:39 <+pinkieval> can you paste its logs?
    22:40 <archit_agrawal[m> pinkieval: https://forge.softwareheritage.org/P420 docker-compose ps outpur
    22:40 -- Notice(swhbot): P420 (author: nahimilega): docker-compose ps output <https://forge.softwareheritage.org/P420>
    22:42 <+pinkieval> if it's running in docker, then it's not running in the venv
    22:42 <+pinkieval> and that's not its logs
    22:43 <+pinkieval> "docker-compose logs swh-scheduler-api"
    22:43 <archit_agrawal[m> pinkieval: https://forge.softwareheritage.org/P421
    22:43 -- Notice(swhbot): P421 (author: nahimilega): scheduler api logs <https://forge.softwareheritage.org/P421>
    22:44 <archit_agrawal[m> >and that's not its logs, I sent the previous message before I received this message
    22:47 <+pinkieval> hmm, it has no issue referring to psycopg2
    22:47 <archit_agrawal[m> pinkieval: >can you paste its logs? :I sent the previous message before I received this message
    22:47 <+pinkieval> so the error is coming from the unpickling
    22:48 <+pinkieval> python -c "import psycopg2.errors"
    22:48 <+pinkieval> does this work?
    22:49 <archit_agrawal[m> ModuleNotFoundError: No module named 'psycopg2.errors'
    22:49 <archit_agrawal[m> No
    ----
    09:38 <+ardumont> archit_agrawal[m: pinkieval: there might be 2 errors involved, one triggering the other
    09:39 <+ardumont> the first one being there is probably no scheduler task-type gnu-lister referenced in the scheduler
    09:39 <+ardumont> thus, when the lister asks for creating that kind of task, it's not happy about it
    09:39 <+ardumont> and then the error we see here about psycopg2.error module not found
    09:43 <+ardumont> archit_agrawal[m: prior to triggering your gnu lister task in your docker-env, you need to add the associated task-type
    09:43 <+ardumont> swh scheduler task-type add --help
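The check suggested in the log (`python -c "import psycopg2.errors"`) can also be written as a small script. The helper name `can_import` is hypothetical; the sketch only reports whether a dotted module name is importable by the interpreter that runs it, so it has to be run inside the container to say anything about the container. According to the psycopg2 documentation, the `psycopg2.errors` module was added in version 2.8, so an older installed version is one plausible cause of this ModuleNotFoundError.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


# Report on the package and on the submodule separately, since the package
# can be present while the submodule is missing.
for name in ("psycopg2", "psycopg2.errors"):
    print(name, "->", "ok" if can_import(name) else "missing")
```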
@ardumont Thanks for your help. I ran the lister in docker, and it created a scheduler task as follows:
    Task 51
      Next run: in 3 months (2019-09-05 09:46:26+00:00)
      Interval: 90 days, 0:00:00
      Type: load-gnu
      Policy: recurring
      Status: next_run_not_scheduled
      Priority:
      Args:
        'apl'
        'https://ftp.gnu.org/gnu/apl/'
      Keyword args:
        tarballs: None
I do not understand why the tarballs keyword argument is None. Is there some error in the code?
I got the error
Build is green
@ardumont I checked it in docker, now it is working fine.
    Task 765
      Next run: in 3 months (2019-09-05 11:18:21+00:00)
      Interval: 90 days, 0:00:00
      Type: load-gnu
      Policy: recurring
      Status: next_run_not_scheduled
      Priority:
      Args:
        'libiconv'
        'https://ftp.gnu.org/old-gnu/libiconv/'
      Keyword args:
        tarballs: [{'date': '985114279', 'archive': 'https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.6.1.tar.gz'},
                   {'date': '1054061763', 'archive': 'https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.1.bin.woe32.zip'},
                   {'date': '1053376580', 'archive': 'https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.bin.woe32.zip'},
                   {'date': '1053376846', 'archive': 'https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.tar.gz'}]
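The `date` values in the `tarballs` list look like Unix timestamps stored as strings; that reading is an assumption, not something stated in the review. Under that assumption, a short sketch for ordering the entries chronologically and rendering the dates (the helper name `sorted_with_dates` and the `iso_date` key are hypothetical):

```python
from datetime import datetime, timezone

# Entries copied from the Task 765 output in this review.
tarballs = [
    {"date": "985114279",
     "archive": "https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.6.1.tar.gz"},
    {"date": "1054061763",
     "archive": "https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.1.bin.woe32.zip"},
    {"date": "1053376580",
     "archive": "https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.bin.woe32.zip"},
    {"date": "1053376846",
     "archive": "https://ftp.gnu.org/old-gnu/libiconv/libiconv-1.9.tar.gz"},
]


def sorted_with_dates(entries):
    """Sort entries by their epoch 'date' and add an ISO 8601 UTC string."""
    result = []
    for entry in sorted(entries, key=lambda e: int(e["date"])):
        when = datetime.fromtimestamp(int(entry["date"]), tz=timezone.utc)
        result.append({**entry, "iso_date": when.isoformat()})
    return result


for item in sorted_with_dates(tarballs):
    print(item["iso_date"], item["archive"])
```

Sorting on `int(e["date"])` rather than on the raw string matters here, because the strings have different lengths and would not compare in chronological order.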
Nice work on making it work! Just a couple of questions, see before this comment.
Build is green
Related P422
Status so far: if 'tarballs' is removed from the model, this explodes. My take on this:
One way to avoid including tarballs in the model is to add a class attribute named tarballs (like LISTER_NAME or TREE_URL), which would contain all the tarballs of each package and could be accessed from the task_dict() function.
Yes, please go that way. I'm not so keen on that solution because I prefer the code to be as stateless as possible (in this context, that means passing state through method/function parameters instead of relying on state variables to do neat tricks).
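To illustrate the stateless preference stated above, here is a hypothetical sketch in which the tarballs travel through function parameters rather than through a class attribute. The names `build_task`, `origin_name` and `origin_url` are illustrative only and are not taken from the diff; the `load-gnu` type and `recurring` policy mirror the task output quoted in this review, and the tarball entry is made up.

```python
def build_task(origin_name, origin_url, tarballs):
    """Build a loader task description from explicit inputs only.

    Nothing is read from instance or class state, so the same arguments
    always give the same result and the function is easy to test.
    """
    return {
        "type": "load-gnu",
        "policy": "recurring",
        "args": [origin_name, origin_url],
        "kwargs": {"tarballs": list(tarballs)},  # copy the list, do not share it
    }


task = build_task(
    "apl",
    "https://ftp.gnu.org/gnu/apl/",
    # Made-up entry, for illustration only.
    [{"date": "0", "archive": "https://ftp.gnu.org/gnu/apl/example.tar.gz"}],
)
```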
Cheers,
Build is green
Build is green
I tested the lister with the new changes in the docker container, and it worked fine. Here is one of the loader tasks it created.
    Task 15940
      Next run: seconds ago (2019-06-08 18:02:07+00:00)
      Interval: 90 days, 0:00:00
      Type: load-gnu
      Policy: recurring
      Status: next_run_scheduled
      Priority:
      Args:
        'java2html'
        'https://ftp.gnu.org/old-gnu/java2html/'
      Keyword args:
        tarballs: [{'date': '944729610', 'archive': 'https://ftp.gnu.org/old-gnu/java2html/java2html-1.3.1.tar.gz'},
                   {'date': '947003574', 'archive': 'https://ftp.gnu.org/old-gnu/java2html/java2html-1.4.tar.gz'},
                   {'date': '953974733', 'archive': 'https://ftp.gnu.org/old-gnu/java2html/java2html-1.5.tar.gz'},
                   {'date': '977303005', 'archive': 'https://ftp.gnu.org/old-gnu/java2html/java2html-1.6.tar.gz'},
                   {'date': '979403803', 'archive': 'https://ftp.gnu.org/old-gnu/java2html/java2html-1.7.tar.gz'}]
Awesome. Almost there. Also, note that this is the full gnu lister.
Build is green
So I'm mostly good with this. It's really awesome that you made it work with the docker-env; I'm looking forward to the update on D1441 with what you had to do. Prior to merging this though, please try to clean up the test samples and keep them to a reasonable minimum (api_response.json, file_structure.json, etc.). There is no need to keep all the extra files (the ones which are filtered out in the end: .sig, .ogg, .ogv, ...). Cheers,
Build is green
If you do need to rebase, update the diff nonetheless (prior to pushing) so that Phabricator sees the commits and closes the diff itself.
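The filtering mentioned in the clean-up request (dropping `.sig`, `.ogg`, `.ogv` and similar files) could look roughly like the sketch below. The list of accepted suffixes, the helper name `is_archive` and the sample file names are assumptions made for illustration; the review does not spell out the actual list.

```python
# Assumed set of archive suffixes; adjust to whatever the lister really accepts.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tar.xz", ".tgz", ".zip")


def is_archive(filename):
    """Return True if filename ends with one of the assumed archive suffixes."""
    return filename.lower().endswith(ARCHIVE_SUFFIXES)


# Sample names: signatures and media files should be dropped.
names = [
    "libiconv-1.9.tar.gz",
    "libiconv-1.9.tar.gz.sig",
    "libiconv-1.9.bin.woe32.zip",
    "talk.ogg",
    "talk.ogv",
]
kept = [n for n in names if is_archive(n)]
```

Matching on the end of the name rather than on a substring is what excludes `libiconv-1.9.tar.gz.sig`, which contains `.tar.gz` but is a signature file.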