
Remove columns 'description' and 'origin_id'.
Closed, Public

Authored by vlorentz on Jun 18 2019, 5:09 PM.

Details

Summary

These columns are never read.

origin_id is not even set by listers, and description is only set by half of them, so even if we had a use case for it, we would have to repopulate the table before using it.

And description belongs in the extrinsic metadata workflow anyway.

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

They are useless.

Yes, well, can you please elaborate a bit more?

origin_id: OK, because we are moving away from it ;)

description seems mostly useless, as it is not populated (although apparently it is populated for bitbucket).

I would have argued that we also have a few too many URL columns, but in github, origin_url and html_url are not populated from the same field, so they stay ;)


Also, can you add the migration SQL in a paste and attach it to this diff?

That way, when we migrate, the job is simplified ;)

Cheers,

For the sake of discussion ;)

This revision now requires changes to proceed. Jun 19 2019, 9:03 AM

They are useless.

Yes, well, can you please elaborate a bit more?

They are never read. origin_id is not even set by listers, and description is only set by half of them, so even if we had a use case for it, we would have to repopulate the table before using it.
And description belongs in the extrinsic metadata workflow anyway.

Also, can you add the migration SQL in a paste and attach it to this diff?

alter table gnu_repo drop column origin_id;
alter table gnu_repo drop column description;

alter table phabricator_repo drop column origin_id;
alter table phabricator_repo drop column description;

alter table pypi_repo drop column origin_id;
alter table pypi_repo drop column description;

alter table bitbucket_repo drop column origin_id;
alter table bitbucket_repo drop column description;

-- if we created that table already:
alter table cran_repo drop column origin_id;
alter table cran_repo drop column description;

alter table github_repo drop column origin_id;
alter table github_repo drop column description;

alter table npm_repo drop column origin_id;
alter table npm_repo drop column description;

alter table gitlab_repo drop column origin_id;
alter table gitlab_repo drop column description;
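
Since the statements above repeat the same two drops for every lister table, they could also be generated rather than typed by hand. The following is only a minimal sketch, not part of the diff: the helper name `drop_statements` is hypothetical, the table and column names are copied from the paste above, and the `if exists` guard on `cran_repo` assumes the lister database is PostgreSQL.

```python
# Hypothetical helper (not part of the lister code base): regenerate the
# migration statements from the list of lister tables in the paste above.
TABLES = (
    "gnu_repo",
    "phabricator_repo",
    "pypi_repo",
    "bitbucket_repo",
    "cran_repo",
    "github_repo",
    "npm_repo",
    "gitlab_repo",
)
COLUMNS = ("origin_id", "description")

# Tables that may not have been created yet on a given deployment.
MAYBE_MISSING = frozenset({"cran_repo"})


def drop_statements(tables=TABLES, columns=COLUMNS, maybe_missing=MAYBE_MISSING):
    """Return one 'alter table ... drop column ...' statement per pair."""
    statements = []
    for table in tables:
        # Assumption: PostgreSQL, which accepts 'alter table if exists'.
        guard = "if exists " if table in maybe_missing else ""
        for column in columns:
            statements.append(f"alter table {guard}{table} drop column {column};")
    return statements


if __name__ == "__main__":
    print("\n".join(drop_statements()))
```

The guard only skips a missing table; it does not change what happens on tables that exist, so the output is otherwise equivalent to the hand-written statements.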

They are never read. origin_id is not even set by listers, and description is only set by half of them, so even if we had a use case for it, we would have to repopulate the table before using it.
And description belongs in the extrinsic metadata workflow anyway.

Sold.

Next time, please state that explicitly in the diff description; that is what it is for (I did it this time ;).

Still, note that some of the data in the lister model is not there only for the code to use.
I think some of it is still relevant for reading purposes (without having to cross-reference information from multiple DBs ;).


Thanks for the SQL script.

Related: P440 (it will be simpler to target as a paste).

This revision is now accepted and ready to land. Jun 19 2019, 10:37 AM
This revision was automatically updated to reflect the committed changes.