
Jul 2 2020

douardda requested changes to D3394: Improve test coverage and type coverage for copy_to.

Looks fine overall to me, but I have a few comments/requests.

Jul 2 2020, 2:17 PM
swh-public-ci added a comment to D3394: Improve test coverage and type coverage for copy_to.

Build is green

Jul 2 2020, 1:34 PM
olasd created D3394: Improve test coverage and type coverage for copy_to.
Jul 2 2020, 1:32 PM
ardumont added a comment to P709 ERROR swh/core/api/tests/test_async.py - ValueError: option names {'--aiohttp-fast'} already added.

tox run is fine

Jul 2 2020, 12:40 PM
douardda created P710 (An Untitled Masterwork).
Jul 2 2020, 12:31 PM
ardumont created P709 ERROR swh/core/api/tests/test_async.py - ValueError: option names {'--aiohttp-fast'} already added.
Jul 2 2020, 12:25 PM
ardumont updated the task description for T2310: Make origin visits immutable.
Jul 2 2020, 12:23 PM · Storage manager, Data Model
ardumont retitled D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function from Factorize common loader behavior within from_disk.iter_directory to Factorize common loader behavior within swh.model.from_disk.iter_directory function.
Jul 2 2020, 12:23 PM
ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.
Jul 2 2020, 12:12 PM
ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.
Jul 2 2020, 12:12 PM
Harbormaster failed remote builds in B13256: Diff 12036 for D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function!
Jul 2 2020, 12:09 PM
swh-public-ci added a comment to D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function.

Build has FAILED

Jul 2 2020, 12:09 PM
ardumont created D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function.
Jul 2 2020, 12:08 PM
zack added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@civodul I wanted to raise the topic of storing container metadata (in the style of what tools like pristine-tar do) here too, so thanks for giving me the chance :-)
I agree it might be a technical solution, *but*, I'm not sure I see the point.
Didn't you agree that having a "lookup service" from tarball/container checksums to SWHIDs (the Software Heritage identifiers, which can then be used to look up stuff in the archive) would be enough to satisfy distro needs?
If yes, then "archiving container metadata" could be replaced by simply having a way to add entries to the lookup table. And allowing distros to do so is an option that we can explore. (Once the service exists, of course.)
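
A minimal sketch of the lookup-table idea described above, purely for illustration. The class name `ContainerLookup`, its methods, the choice of SHA-256 as the container checksum, and the placeholder SWHID are all assumptions; none of this is an existing Software Heritage service or API.

```python
import hashlib
from typing import Dict, Optional


class ContainerLookup:
    """Hypothetical in-memory table: container checksum -> SWHID."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def add(self, container_bytes: bytes, swhid: str) -> str:
        # Assumption: containers are keyed by the SHA-256 of their raw bytes.
        checksum = hashlib.sha256(container_bytes).hexdigest()
        self._table[checksum] = swhid
        return checksum

    def get(self, checksum: str) -> Optional[str]:
        # Returns None when no entry was registered for this checksum.
        return self._table.get(checksum)


# Usage: a distro registers a tarball, then anyone can resolve its checksum.
lookup = ContainerLookup()
placeholder_swhid = "swh:1:dir:" + "0" * 40  # placeholder, not a real object
tarball_checksum = lookup.add(b"fake tarball contents", placeholder_swhid)
resolved = lookup.get(tarball_checksum)
```

In this sketch the table stores only the mapping, not the tarball itself, which is the point of the proposal: the archive keeps the deduplicated contents, and the table ties container checksums to them.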

Jul 2 2020, 12:07 PM · Data Model
Harbormaster failed remote builds in B13255: Diff 12035 for D3392: loader-core: Reuse swh.model.from_disk.iter_directory function!
Jul 2 2020, 12:06 PM
swh-public-ci added a comment to D3392: loader-core: Reuse swh.model.from_disk.iter_directory function.

Build has FAILED

Jul 2 2020, 12:06 PM
ardumont created D3392: loader-core: Reuse swh.model.from_disk.iter_directory function.
Jul 2 2020, 12:05 PM
swh-public-ci added a comment to D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Build is green

Jul 2 2020, 12:04 PM
ardumont updated the diff for D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Make the returned results a tuple of lists

Jul 2 2020, 12:02 PM
civodul added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Do I get it right that the primary reason why tarballs aren't systematically archived is that doing so would be too expensive storage-wise (no deduplication)?

Jul 2 2020, 12:00 PM · Data Model
swh-public-ci added a comment to D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Build is green

Jul 2 2020, 12:00 PM
ardumont created D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.
Jul 2 2020, 11:58 AM
anlambert committed rDWAPPS76d9162807e0: origin_visits/get_origin_visit: Improve default visit picking strategy (authored by anlambert).
origin_visits/get_origin_visit: Improve default visit picking strategy
Jul 2 2020, 11:39 AM
anlambert closed D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.
Jul 2 2020, 11:39 AM
swh-public-ci added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Build is green

Jul 2 2020, 11:35 AM
ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.
Jul 2 2020, 11:24 AM
anlambert updated the diff for D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Update: Improve tests implementation

Jul 2 2020, 11:24 AM
ardumont added a comment to D3383: Implement {directory,revision,release,snapshot}_metadata_{add,get}..

I'll rewrite this using swh-model and only two endpoints for all types

Jul 2 2020, 11:22 AM
ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.
Jul 2 2020, 11:17 AM
vlorentz abandoned D3383: Implement {directory,revision,release,snapshot}_metadata_{add,get}..

I'll rewrite this using swh-model and only two endpoints for all types

Jul 2 2020, 11:11 AM
swh-public-ci added a comment to D3390: Unify object_type some more within the merkle and from_disk modules.

Build is green

Jul 2 2020, 11:10 AM
ardumont created D3390: Unify object_type some more within the merkle and from_disk modules.
Jul 2 2020, 11:09 AM
vlorentz closed D3382: Move tests of content_metadata_* next to origin_metadata_*.

Landed as 248c277445adbae5813ba80ce0618858d8126634.

Jul 2 2020, 11:05 AM
vlorentz committed rDSTO248c277445ad: Move tests of content_metadata_* next to origin_metadata_* (authored by vlorentz).
Move tests of content_metadata_* next to origin_metadata_*
Jul 2 2020, 11:04 AM
ardumont updated the task description for T2310: Make origin visits immutable.
Jul 2 2020, 10:32 AM · Storage manager, Data Model
vlorentz abandoned D3247: [WIP] Add content_metadata_{add,get}..

Replaced by linked diffs

Jul 2 2020, 10:12 AM

Jul 1 2020

ardumont accepted D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

nice ;)

Jul 1 2020, 8:21 PM
swh-public-ci added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Build is green

Jul 1 2020, 6:26 PM
anlambert updated the diff for D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.
  • Use new swh.storage.algos.origin.origin_get_latest_visit_status utility function in revised get_origin_visit implementation
Jul 1 2020, 6:15 PM
swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build is green

Jul 1 2020, 6:03 PM
douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

more mypy vs. attrs-strict fighting

Jul 1 2020, 6:02 PM
Harbormaster failed remote builds in B13246: Diff 12027 for D3389: Extract the extra_headers from metadata on the Revision model class!
Jul 1 2020, 5:41 PM
swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 5:41 PM
douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

make mypy happy (hopefully)

Jul 1 2020, 5:39 PM
moranegg triaged T2472: Indexing intrinsic metadata in a deposit using a sub-folder for the content as Normal priority.
Jul 1 2020, 5:35 PM · Intrinsic metadata, Indexer, SWORD deposit
Harbormaster failed remote builds in B13245: Diff 12026 for D3389: Extract the extra_headers from metadata on the Revision model class!
Jul 1 2020, 5:23 PM
swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 5:23 PM
douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

restrict extra_headers to (bytes, bytes) only

Jul 1 2020, 5:23 PM
olasd edited P708 non bytes extra_headers.
Jul 1 2020, 5:16 PM
olasd updated the language for P708 non bytes extra_headers from autodetect to remarkup.
Jul 1 2020, 5:12 PM
olasd created P708 non bytes extra_headers.
Jul 1 2020, 5:12 PM
ardumont updated the task description for T2310: Make origin visits immutable.
Jul 1 2020, 4:22 PM · Storage manager, Data Model
Harbormaster failed remote builds in B13244: Diff 12025 for D3389: Extract the extra_headers from metadata on the Revision model class!
Jul 1 2020, 4:05 PM
swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 4:05 PM
douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

improve bw-compat support, tests and hypothesis strategies

Jul 1 2020, 4:04 PM
rdicosmo added a comment to T2344: Build a connector for software deposit via Zenodo/InvenioRDM.

Great news!!

Does this mean we need to be SWORD 3 compatible?

Jul 1 2020, 4:03 PM · meta-task, Roadmap 2022, Roadmap 2020, SWORD deposit, Scientific Community Building
ardumont updated the task description for T2310: Make origin visits immutable.
Jul 1 2020, 4:01 PM · Storage manager, Data Model
ardumont committed rDJNLf6ab436d3934: requirements-swh: Add missing version requirement on swh.model (authored by ardumont).
requirements-swh: Add missing version requirement on swh.model
Jul 1 2020, 4:01 PM
olasd added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

I'm not sure about popping the extra_headers off of the incoming metadata field right now. This feels like something we want to do long term, but if you do that right now, it means you're putting upgrades of swh.model, swh.storage (with support for the new field) and all loaders in lockstep with one another. If you upgrade swh.model now, you'll lose the extra_headers until swh.storage is able to store them.
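
One way to picture the concern, as a hedged sketch only: copy the headers into a separate value while leaving the metadata dict untouched, so that nothing is lost before swh.storage can store the new field. The helper `split_extra_headers` and the `ExtraHeaders` alias are hypothetical names; this is not the code in D3389 or in swh.model.

```python
from typing import Any, Dict, Optional, Tuple

# Assumption: extra headers are modelled as an immutable tuple of (bytes, bytes) pairs.
ExtraHeaders = Tuple[Tuple[bytes, bytes], ...]


def split_extra_headers(
    metadata: Optional[Dict[str, Any]],
) -> Tuple[Optional[Dict[str, Any]], ExtraHeaders]:
    """Return (metadata, extra_headers) without removing anything from metadata."""
    if not metadata:
        return metadata, ()
    # Copy rather than pop: the original "extra_headers" key stays in place,
    # so older consumers that only read metadata still see the headers.
    headers = tuple((key, value) for key, value in metadata.get("extra_headers", ()))
    return metadata, headers


incoming = {"extra_headers": [[b"gpgsig", b"signature bytes"]], "other": 1}
kept_metadata, extracted = split_extra_headers(incoming)
```

The trade-off is temporary duplication of the headers in both places, in exchange for being able to upgrade the components independently.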

Jul 1 2020, 4:00 PM
ardumont abandoned D3376: journal_data: Drop obsolete origin_visit fields.
Jul 1 2020, 3:57 PM
ardumont added a comment to D3376: journal_data: Drop obsolete origin_visit fields.

Landed in eda756252f2151ba602ae47646e35fab4a83ebc1

Jul 1 2020, 3:57 PM
ardumont committed rDJNLeda756252f21: journal_data: Drop obsolete origin_visit fields (authored by ardumont).
journal_data: Drop obsolete origin_visit fields
Jul 1 2020, 3:56 PM
swh-public-ci added a comment to D3376: journal_data: Drop obsolete origin_visit fields.

Build is green

Jul 1 2020, 3:55 PM
ardumont updated the test plan for D3376: journal_data: Drop obsolete origin_visit fields.
Jul 1 2020, 3:53 PM
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T2202: Collect extrinsic metadata, from Open to Work in Progress.
Jul 1 2020, 3:50 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T2074: Publish extrinsic metadata to swh-journal/Kafka, from Open to Work in Progress.
Jul 1 2020, 3:50 PM · Storage manager, Journal, Metadata workflow
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T2311: Review the deposit of CodeMeta metadata in xml (following SWORD V2 specs), from Open to Work in Progress.
Jul 1 2020, 3:50 PM · SWORD deposit, Metadata workflow, Roadmap 2020
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T1732: Extend metadata for portals depositing software through SWORD, from Open to Work in Progress.
Jul 1 2020, 3:50 PM · Scientific Community Building, SWORD deposit
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T2328: Collect metadata about software from ScanR, from Open to Work in Progress.
Jul 1 2020, 3:50 PM · SWORD deposit, Metadata workflow, Scientific Community Building
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive from Open to Work in Progress.

Heads-up on status: the storage-related endpoints developed by vlorentz are deployed in storage 0.9.0:

  • origin-metadata (we already had it, but it got abstracted)
  • content-metadata (new)
Jul 1 2020, 3:50 PM · Metadata workflow, Roadmap 2020
ardumont changed the status of T2306: Generic storage for extrinsic, qualified metadata related to any node of the swh archive, a subtask of T2344: Build a connector for software deposit via Zenodo/InvenioRDM, from Open to Work in Progress.
Jul 1 2020, 3:50 PM · meta-task, Roadmap 2022, Roadmap 2020, SWORD deposit, Scientific Community Building
ardumont committed rDMOD40a40f508b92: model.OriginVisit: Drop obsolete fields (authored by ardumont).
model.OriginVisit: Drop obsolete fields
Jul 1 2020, 3:45 PM
ardumont closed D3337: model.OriginVisit: Drop obsolete fields.
Jul 1 2020, 3:45 PM
ardumont committed rDSTOf2619b69b797: Rework 157 migration to ease replication setup (authored by ardumont).
Rework 157 migration to ease replication setup
Jul 1 2020, 3:41 PM
ardumont updated the task description for T2310: Make origin visits immutable.
Jul 1 2020, 3:38 PM · Storage manager, Data Model
swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build is green

Jul 1 2020, 3:19 PM
douardda added a revision to T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object: D3389: Extract the extra_headers from metadata on the Revision model class.
Jul 1 2020, 3:18 PM · Data Model
douardda created D3389: Extract the extra_headers from metadata on the Revision model class.
Jul 1 2020, 3:18 PM
anlambert planned changes to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

This needs more work in order to reuse utility functions from storage and rewrite tests using hypothesis.

Jul 1 2020, 2:59 PM
ardumont edited P707 157-bis.sql.
Jul 1 2020, 2:57 PM
ardumont edited P707 157-bis.sql.
Jul 1 2020, 2:50 PM
ardumont created P707 157-bis.sql.
Jul 1 2020, 2:46 PM
ardumont added inline comments to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.
Jul 1 2020, 2:34 PM
anlambert committed rDWAPPSb8eb7741843b: Makefile.local: Improve headless cypress tests execution time (authored by anlambert).
Makefile.local: Improve headless cypress tests execution time
Jul 1 2020, 2:34 PM
anlambert closed D3388: Makefile.local: Improve headless cypress tests execution time.
Jul 1 2020, 2:34 PM
anlambert committed rCJSWHb2c90fcea783: jobs/cypress-tests: Improve stability and execution time (authored by anlambert).
jobs/cypress-tests: Improve stability and execution time
Jul 1 2020, 2:33 PM
anlambert closed D3387: jobs/cypress-tests: Improve stability and execution time.
Jul 1 2020, 2:33 PM
ardumont added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Looks good, some remarks above though (proposal to improve coverage).

Jul 1 2020, 2:23 PM
swh-public-ci added a comment to D3388: Makefile.local: Improve headless cypress tests execution time.

Build is green

Jul 1 2020, 2:07 PM
ardumont accepted D3388: Makefile.local: Improve headless cypress tests execution time.
Jul 1 2020, 2:05 PM
ardumont accepted D3387: jobs/cypress-tests: Improve stability and execution time.
Jul 1 2020, 2:05 PM
anlambert created D3388: Makefile.local: Improve headless cypress tests execution time.
Jul 1 2020, 1:56 PM
anlambert created D3387: jobs/cypress-tests: Improve stability and execution time.
Jul 1 2020, 1:48 PM
ardumont retitled D3386: Test to reproduce issue from wip: Test to reproduce issue to Test to reproduce issue.
Jul 1 2020, 12:14 PM
ardumont abandoned D3386: Test to reproduce issue.

Agreed on IRC that the priority of the task can be decreased, because:

  • the metadata field is kinda deprecated
  • it will soon get migrated to the metadata storage
  • the issue does not impact that many repositories
Jul 1 2020, 12:14 PM
ardumont updated the summary of D3386: Test to reproduce issue.
Jul 1 2020, 12:08 PM
swh-public-ci added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Build is green

Jul 1 2020, 11:47 AM
anlambert updated the diff for D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Rebase

Jul 1 2020, 11:35 AM
anlambert committed rDWAPPSc2726ac1249c: service/snapshot_count_branches: Improve empty snapshot detection (authored by anlambert).
service/snapshot_count_branches: Improve empty snapshot detection
Jul 1 2020, 11:31 AM
anlambert closed D3384: service/snapshot_count_branches: Improve empty snapshot detection.
Jul 1 2020, 11:31 AM