Page MenuHomeSoftware Heritage

Add comments to tables dbversion, content, skipped_content and fetch_history
ClosedPublic

Authored by twitu on Jun 14 2019, 8:19 AM.

Diff Detail

Repository
rDSTO Storage manager
Branch
sql-table-column-comments
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 6239
Build 8625: tox-on-jenkinsJenkins
Build 8624: arc lint + arc unit

Event Timeline

This diff is in reference to T1527.

sha1_git is not "Git id of sha1 checksum", it's the hash of the content as a git object (ie. with the header used by git to identify blobs).

Could you also add comments on these tables, and use is instead of IS?

swh/storage/sql/30-swh-schema.sql
62

Not "file creation", but the first time we saw that content.

64

It's not an identifier of the "file content", just of the content

82–84

I don't think comments on these columns are relevant. Maybe just one of the type.

113

Same

140

This comment should go into more details

This revision now requires changes to proceed.Jun 14 2019, 3:28 PM
zack requested changes to this revision.Jun 14 2019, 3:28 PM
zack added a subscriber: zack.
zack added inline comments.
swh/storage/sql/30-swh-schema.sql
57–64

the style of these could be improved to be less awkward, e.g., "length of file content" -> "file content length"

also, please check our glossary, as it will give hints of how to make this both shorter and clearer, e.g., it's not "file content", it's just "content"

141–142

these are incorrect, you should just refer to the general notions of "standard output / standard error", the intended reader will know what they mean

swh/storage/sql/30-swh-schema.sql
57–64

I prefer using active voice for comments. But the comments will become repetitive like,

  • Content length
  • Content seen time
  • Content identifier
  • Added comments for all tables and columns
sql/upgrades/137.sql
87

The db fields do not give information to add comments. Can you give some information about what is being stored. Specially fields like date offset.

106

Not sure what to write here. Is is the complete revision history of a software origin?

134

What to add here for date offset.

152

Is there a better comment than "Primary key" for this field.

154

The dev comment says

identifier across which we're bucketing objects

Does this comment convey the meaning?

@twitu please use imperative form while writing commit messages. Also, do keep in mind the why-not-how when writing the body of the commit message.

  • Add comments to few columns in dbversion, task and task_run
  • Change double quoted comments to single quoted comments
  • Added comments to tables dbversion, content, skipped_content and fetch_history
  • Added comments for all tables and columns
  • Converted double quote comments to single quote comments

fyi, the error is:

---------------------------- Captured stderr setup -----------------------------
psql:/home/jenkins/workspace/DSTO/tox/swh/storage/sql/30-swh-schema.sql:110: NOTICE:  identifier "Content blobs observed but not ingested in
 Software heritage archive" will be truncated to "Content blobs observed but not ingested in
 Software heritage a"
psql:/home/jenkins/workspace/DSTO/tox/swh/storage/sql/30-swh-schema.sql:110: ERROR:  syntax error at or near ""Content blobs observed but not ingested in
 Software heritage archive""
LINE 1: comment on table skipped_content is "Content blobs observed ...
                                            ^

The comments are not complete. I will fix the syntax issue in the next commit along with the missing comments. I am not sure what comments to add for the revision table.

  • Add additional column comments
  • Double quote syntax error
ardumont retitled this revision from Added comments to tables dbversion, content, skipped_content and fetch_history to Add comments to tables dbversion, content, skipped_content and fetch_history.Jun 18 2019, 10:26 AM

Does this diff require any other changes?

sql/upgrades/137.sql
12

not release, but deployment

85

Should also mention that it's the "raw" name for most CVSs

swh/storage/sql/30-swh-schema.sql
14

not release, but deployment

twitu marked an inline comment as done.
  • Make suggested changes
twitu marked an inline comment as done.
  • List all options
twitu added inline comments.
swh/storage/sql/30-swh-schema.sql
82–84

No point deleting them now.

zack requested changes to this revision.Jun 19 2019, 3:52 PM

Almost there !
(and thanks a lot for your persistence on this one)

sql/upgrades/137.sql
11

SQL (uppercase)

there are a few other instances of this below, please fix all of them

swh/storage/sql/30-swh-schema.sql
84

URL (uppercase)

109

We can leave "software heritage" implicit, given the context. How about "Content blobs observed, but not ingested in the archive" here?

143

I don't think my previous comment on these fields have actually been addressed. My proposal is something like "standard output of the fetch operation" / "standard error of the fetch operation".

This revision now requires changes to proceed.Jun 19 2019, 3:52 PM
  • Make changes as per review

Add comments to tables and columns

This revision is now accepted and ready to land.Jun 20 2019, 10:09 AM
This revision was automatically updated to reflect the committed changes.