Origin visit ids restart from 1 even if there are previous visits
Closed, Duplicate (Public)

Description

  • before running a loader:
cqlsh:swh> select * from origin_visit where origin='https://github.com/slackhq/nebula';

 origin                            | visit | date                            | type
-----------------------------------+-------+---------------------------------+------
 https://github.com/slackhq/nebula |     1 | 2020-09-11 19:46:47.786000+0000 |  git
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:25:07.399000+0000 |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:31.343000+0000 |  git
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:37.292000+0000 |  git

(4 rows)
cqlsh:swh> select * from origin_visit_status where origin='https://github.com/slackhq/nebula';

 origin                            | visit | date                            | metadata | snapshot                                   | status  | type
-----------------------------------+-------+---------------------------------+----------+--------------------------------------------+---------+------
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:41.486000+0000 |     null | 0xe907333ef8c9aa35d8e365d4bbb307823978ba95 |    full |  git
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:37.292000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:41.042000+0000 |     null | 0xdc896dcd8aa78a37c5d682aab6bfb4e7698905a7 |    full |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:31.343000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:27:11.533000+0000 |     null | 0xcad454c1c45450eb1f1b7677ccf8a5d880b2ad2d |    full |  git
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:25:07.399000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     1 | 2020-09-11 19:55:13.627000+0000 |     null | 0xbd9f7679721afc1692d4f80890f9f71f600d26e2 |    full |  git
 https://github.com/slackhq/nebula |     1 | 2020-09-11 19:46:47.786000+0000 |     null |                                       null | created |  git

(8 rows)
  • launching the loader (the load failed, but that is a separate issue)
swh@6490bac3ba28:/$ time swh loader run git https://github.com/slackhq/nebula
INFO:swh.loader.git.loader.GitLoader:Load origin 'https://github.com/slackhq/nebula' with type 'git'
Enumerating objects: 3317, done.
Counting objects: 100% (1128/1128), done.
Compressing objects: 100% (508/508), done.
Total 3317 (delta 696), reused 915 (delta 601), pack-reused 2189
INFO:swh.loader.git.loader.GitLoader:Listed 293 refs for repo https://github.com/slackhq/nebula
...
  • origin visits and visit statuses after the run:
cqlsh:swh> select * from origin_visit where origin='https://github.com/slackhq/nebula';

 origin                            | visit | date                            | type
-----------------------------------+-------+---------------------------------+------
 https://github.com/slackhq/nebula |     1 | 2021-08-19 13:56:01.241000+0000 |  git  <-------- the date has been updated
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:25:07.399000+0000 |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:31.343000+0000 |  git
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:37.292000+0000 |  git

cqlsh:swh> select * from origin_visit_status where origin='https://github.com/slackhq/nebula';

 origin                            | visit | date                            | metadata | snapshot                                   | status  | type
-----------------------------------+-------+---------------------------------+----------+--------------------------------------------+---------+------
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:41.486000+0000 |     null | 0xe907333ef8c9aa35d8e365d4bbb307823978ba95 |    full |  git
 https://github.com/slackhq/nebula |     4 | 2021-08-15 20:36:37.292000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:41.042000+0000 |     null | 0xdc896dcd8aa78a37c5d682aab6bfb4e7698905a7 |    full |  git
 https://github.com/slackhq/nebula |     3 | 2021-07-21 17:01:31.343000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:27:11.533000+0000 |     null | 0xcad454c1c45450eb1f1b7677ccf8a5d880b2ad2d |    full |  git
 https://github.com/slackhq/nebula |     2 | 2021-06-20 15:25:07.399000+0000 |     null |                                       null | created |  git
 https://github.com/slackhq/nebula |     1 | 2021-08-19 13:59:17.433000+0000 |     null |                                       null |  failed |  git  <----------- these 2 lines are wrong
 https://github.com/slackhq/nebula |     1 | 2021-08-19 13:56:01.241000+0000 |     null |                                       null | created |  git  <-----------
 https://github.com/slackhq/nebula |     1 | 2020-09-11 19:55:13.627000+0000 |     null | 0xbd9f7679721afc1692d4f80890f9f71f600d26e2 |    full |  git
 https://github.com/slackhq/nebula |     1 | 2020-09-11 19:46:47.786000+0000 |     null |                                       null | created |  git

(10 rows)
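
To make the expectation explicit: with visits 1 to 4 already recorded, the new visit should have been stored as visit 5 rather than as a second visit 1. The snippet below is only a minimal Python sketch of that expectation and of a check that spots the anomaly. It is not the storage backend's actual allocation code; the function names are hypothetical, and the data is copied by hand from the tables above.

```python
from collections import Counter

# Visit ids already present for the origin before the loader ran
# (copied from the origin_visit rows in this report).
existing_visits = [1, 2, 3, 4]


def expected_next_visit_id(visit_ids):
    """Return the id a new visit should get: one past the highest existing id."""
    return max(visit_ids, default=0) + 1


# (visit, status) pairs copied from the origin_visit_status rows after the run.
statuses_after = [
    (4, "full"), (4, "created"),
    (3, "full"), (3, "created"),
    (2, "full"), (2, "created"),
    (1, "failed"), (1, "created"),
    (1, "full"), (1, "created"),
]


def visits_created_more_than_once(statuses):
    """Return the visit ids that carry more than one 'created' status, sorted."""
    created = Counter(visit for visit, status in statuses if status == "created")
    return sorted(visit for visit, count in created.items() if count > 1)


print(expected_next_visit_id(existing_visits))        # 5
print(visits_created_more_than_once(statuses_after))  # [1]
```

A visit that has two 'created' statuses, as visit 1 does here, indicates that an existing visit id was reused instead of a new one being allocated.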