Page MenuHomeSoftware Heritage

Add comments to tables dbversion, content, skipped_content and fetch_history
ClosedPublic

Authored by twitu on Jun 14 2019, 8:19 AM.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

twitu created this revision.Jun 14 2019, 8:19 AM
twitu added a comment.EditedJun 14 2019, 8:20 AM

This diff is in reference to T1527.

sha1_git is not "Git id of sha1 checksum", it's the hash of the content as a git object (ie. with the header used by git to identify blobs).

vlorentz requested changes to this revision.Jun 14 2019, 3:28 PM

Could you also add comments on these tables, and use is instead of IS?

swh/storage/sql/30-swh-schema.sql
63

Not "file creation", but the first time we saw that content.

65

It's not an identifier of the "file content", just of the content

83–85

I don't think comments on these columns are relevant. Maybe just one of the type.

114

Same

140

This comment should go into more details

This revision now requires changes to proceed.Jun 14 2019, 3:28 PM
zack requested changes to this revision.Jun 14 2019, 3:28 PM
zack added a subscriber: zack.
zack added inline comments.
swh/storage/sql/30-swh-schema.sql
58–65

the style of these could be improved to be less awkward, e.g., "length of file content" -> "file content length"

also, please check our glossary, as it will give hints of how to make this both shorter and clearer, e.g., it's not "file content", it's just "content"

141–142

these are incorrect, you should just refer to the general notions of "standard output / standard error", the intended reader will know what they mean

twitu added inline comments.Jun 14 2019, 5:30 PM
swh/storage/sql/30-swh-schema.sql
58–65

I prefer using active voice for comments. But the comments will become repetitive like,

  • Content length
  • Content seen time
  • Content identifier
ardumont edited the summary of this revision. (Show Details)Jun 14 2019, 5:43 PM
twitu updated this revision to Diff 5256.Jun 14 2019, 9:17 PM
  • Added comments for all tables and columns
twitu added inline comments.Jun 14 2019, 9:29 PM
sql/upgrades/137.sql
88

The db fields do not give information to add comments. Can you give some information about what is being stored. Specially fields like date offset.

107

Not sure what to write here. Is is the complete revision history of a software origin?

135

What to add here for date offset.

153

Is there a better comment than "Primary key" for this field.

155

The dev comment says

identifier across which we're bucketing objects

Does this comment convey the meaning?

@twitu please use imperative form while writing commit messages. Also, do keep in mind the why-not-how when writing the body of the commit message.

twitu updated this revision to Diff 5260.Jun 15 2019, 7:15 PM
  • Add comments to few columns in dbversion, task and task_run
  • Change double quoted comments to single quoted comments
twitu updated this revision to Diff 5263.Jun 15 2019, 7:26 PM
  • Added comments to tables dbversion, content, skipped_content and fetch_history
  • Added comments for all tables and columns
  • Converted double quote comments to single quote comments

fyi, the error is:

---------------------------- Captured stderr setup -----------------------------
psql:/home/jenkins/workspace/DSTO/tox/swh/storage/sql/30-swh-schema.sql:110: NOTICE:  identifier "Content blobs observed but not ingested in
 Software heritage archive" will be truncated to "Content blobs observed but not ingested in
 Software heritage a"
psql:/home/jenkins/workspace/DSTO/tox/swh/storage/sql/30-swh-schema.sql:110: ERROR:  syntax error at or near ""Content blobs observed but not ingested in
 Software heritage archive""
LINE 1: comment on table skipped_content is "Content blobs observed ...
                                            ^
twitu added a comment.Jun 17 2019, 3:20 PM

The comments are not complete. I will fix the syntax issue in the next commit along with the missing comments. I am not sure what comments to add for the revision table.

twitu updated this revision to Diff 5285.Jun 17 2019, 6:00 PM
  • Add additional column comments
twitu updated this revision to Diff 5287.Jun 17 2019, 6:49 PM
  • Double quote syntax error
ardumont retitled this revision from Added comments to tables dbversion, content, skipped_content and fetch_history to Add comments to tables dbversion, content, skipped_content and fetch_history.Jun 18 2019, 10:26 AM
twitu added a comment.Jun 18 2019, 1:52 PM

Does this diff require any other changes?

vlorentz added inline comments.Jun 18 2019, 2:07 PM
sql/upgrades/137.sql
13

not release, but deployment

86

Should also mention that it's the "raw" name for most CVSs

swh/storage/sql/30-swh-schema.sql
15

not release, but deployment

twitu updated this revision to Diff 5315.Jun 18 2019, 4:28 PM
twitu marked an inline comment as done.
  • Make suggested changes
twitu updated this revision to Diff 5316.Jun 18 2019, 4:36 PM
twitu marked an inline comment as done.
  • List all options
twitu marked 7 inline comments as done.Jun 18 2019, 4:39 PM
twitu added inline comments.
swh/storage/sql/30-swh-schema.sql
83–85

No point deleting them now.

zack requested changes to this revision.Jun 19 2019, 3:52 PM

Almost there !
(and thanks a lot for your persistence on this one)

sql/upgrades/137.sql
12

SQL (uppercase)

there are a few other instances of this below, please fix all of them

swh/storage/sql/30-swh-schema.sql
85

URL (uppercase)

110

We can leave "software heritage" implicit, given the context. How about "Content blobs observed, but not ingested in the archive" here?

143

I don't think my previous comment on these fields have actually been addressed. My proposal is something like "standard output of the fetch operation" / "standard error of the fetch operation".

This revision now requires changes to proceed.Jun 19 2019, 3:52 PM
twitu updated this revision to Diff 5373.Jun 19 2019, 5:02 PM
  • Make changes as per review
twitu marked 5 inline comments as done.Jun 19 2019, 5:02 PM
zack accepted this revision.Jun 19 2019, 5:55 PM
twitu updated this revision to Diff 5381.Jun 20 2019, 9:20 AM

Add comments to tables and columns

vlorentz accepted this revision.Jun 20 2019, 10:09 AM
This revision is now accepted and ready to land.Jun 20 2019, 10:09 AM
This revision was automatically updated to reflect the committed changes.