Apr 27 2020

zack closed T2379: SWHID: expand spec to allow IRI characters, a subtask of T2262: Deal with IRIs, as Resolved.
Apr 27 2020, 3:33 PM · Storage manager, Data Model
zack closed T2379: SWHID: expand spec to allow IRI characters as Resolved by committing rDMOD3ef4843c8955: SWHID spec: add support for IRI.
Apr 27 2020, 3:33 PM · Storage manager, Data Model
moranegg renamed T2075: Implement metadata authority specification from Implement metadata provider specification to Implement metadata authority specification.
Apr 27 2020, 11:35 AM · Storage manager, Metadata workflow
moranegg added a comment to T2075: Implement metadata authority specification.

Changing provider in title to authority for terminology consistency.

Apr 27 2020, 11:35 AM · Storage manager, Metadata workflow

Apr 26 2020

zack added a comment to T2379: SWHID: expand spec to allow IRI characters.

Upon (admittedly quick) review, I don't think that anything more than D3068 is needed to address this.
Double-checking/feedback welcome!

Apr 26 2020, 4:46 PM · Storage manager, Data Model
zack added a revision to T2379: SWHID: expand spec to allow IRI characters: D3068: SWHID spec: add support for IRI.
Apr 26 2020, 4:45 PM · Storage manager, Data Model

Apr 24 2020

anlambert added a comment to T2262: Deal with IRIs.

I wrote that little script to check the number of origin IRIs and URIs in the archive
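
The script itself is not shown in this feed. As a rough illustration only (not the author's script), one way to tell the two apart is to treat an origin URL made solely of ASCII characters as a candidate URI and anything containing non-ASCII characters as an IRI. The function name and the sample URLs below are hypothetical.

```python
from typing import Iterable, Tuple


def count_iris_and_uris(origin_urls: Iterable[str]) -> Tuple[int, int]:
    """Return (iri_count, uri_count) for the given origin URLs.

    Assumption: a URL with at least one non-ASCII character is counted
    as an IRI; an ASCII-only URL is counted as a URI. No further syntax
    validation is performed.
    """
    iris = 0
    uris = 0
    for url in origin_urls:
        if url.isascii():
            uris += 1
        else:
            iris += 1
    return iris, uris


# Hypothetical sample data, not taken from the archive.
sample = ["https://example.org/repo", "https://example.org/dépôt"]
counts = count_iris_and_uris(sample)
```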

Apr 24 2020, 7:23 PM · Storage manager, Data Model
douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3062: Move the content of swh/objstorage/__init__.py in swh/objstorage/factory.py.
Apr 24 2020, 3:54 PM · Object storage, Storage manager, Journal
zack triaged T2379: SWHID: expand spec to allow IRI characters as Normal priority.
Apr 24 2020, 3:32 PM · Storage manager, Data Model
vlorentz updated the task description for T2262: Deal with IRIs.
Apr 24 2020, 3:18 PM · Storage manager, Data Model
vlorentz renamed T2262: Deal with IRIs from SWHID: deal with IRIs to Deal with IRIs.
Apr 24 2020, 1:29 PM · Storage manager, Data Model
douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3056: Deprecate the `config-path` argument of the `swh storage rpc-serve` command.
Apr 24 2020, 11:29 AM · Object storage, Storage manager, Journal
zack renamed T2262: Deal with IRIs from Dealing with IRIs to SWHID: deal with IRIs.
Apr 24 2020, 10:28 AM · Storage manager, Data Model

Apr 23 2020

douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3058: Adapt journal client loading to swh.journal 0.0.31.
Apr 23 2020, 4:58 PM · Object storage, Storage manager, Journal

Apr 22 2020

douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3044: Move get_journal_client function to swh.journal.client.
Apr 22 2020, 4:50 PM · Object storage, Storage manager, Journal
ardumont renamed T2355: Make swh-journal independent from swh-storage or swh-objstorage from Make swh-journal independant from swh-storage or swh-objstorage to Make swh-journal independent from swh-storage or swh-objstorage.
Apr 22 2020, 3:50 PM · Object storage, Storage manager, Journal
douardda renamed T2355: Make swh-journal independent from swh-storage or swh-objstorage from Merge parts of swh-journal in swh-storage to Make swh-journal independant from swh-storage or swh-objstorage.
Apr 22 2020, 3:41 PM · Object storage, Storage manager, Journal
douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3043: Extract kafka-related pytest fixtures in a pytest plugin module.
Apr 22 2020, 3:38 PM · Object storage, Storage manager, Journal
ardumont triaged T2372: origin visit reaper: new janitorial process in charge of updating lingering origin visit in 'ongoing' state as Normal priority.
Apr 22 2020, 2:59 PM · Storage manager

Apr 21 2020

ardumont updated the task description for T2310: Make origin visits immutable.
Apr 21 2020, 2:22 PM · Storage manager, Data Model

Apr 20 2020

ardumont added a parent task for T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: T1991: Implement a Guix/Nix loader.
Apr 20 2020, 9:43 AM · Package Loader, Storage manager

Apr 17 2020

ardumont closed T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error as Resolved.
Apr 17 2020, 11:33 AM · Package Loader, Storage manager
ardumont added a comment to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.

and it finished alright \m/

Apr 17 2020, 11:33 AM · Package Loader, Storage manager

Apr 15 2020

ardumont added a comment to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.

while not finished, run is still happy so far

Apr 15 2020, 10:34 AM · Package Loader, Storage manager

Apr 14 2020

ardumont changed the status of T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error from Open to Work in Progress.
Apr 14 2020, 6:15 PM · Package Loader, Storage manager
ardumont added a comment to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.

Deployed.

Apr 14 2020, 6:15 PM · Package Loader, Storage manager
ardumont edited projects for T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error, added: Package Loader; removed Core Loader.
Apr 14 2020, 2:32 PM · Package Loader, Storage manager
ardumont added a comment to T2332: Analyze hash collisions.

Remains open because there remain decisions to be made
about the few real ones (3) we have so far [1]

Apr 14 2020, 2:08 PM · Object storage, Storage manager
ardumont closed T2019: race condition during concurrent loading of the same objects from multiple origins as Resolved.

This can be closed now thanks to D2977.

Apr 14 2020, 2:07 PM · Storage manager
ardumont added a comment to T2332: Analyze hash collisions.

So our high number of false hash collisions is now fixed thanks to D2977 \m/.

Apr 14 2020, 2:06 PM · Object storage, Storage manager
douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3010: Copy the graph replayer component from swh-journal.
Apr 14 2020, 11:14 AM · Object storage, Storage manager, Journal
douardda added a revision to T2355: Make swh-journal independent from swh-storage or swh-objstorage: D3008: Copy the backfiller component from swh-journal.
Apr 14 2020, 11:14 AM · Object storage, Storage manager, Journal
ardumont added a revision to T2310: Make origin visits immutable: D2939: cassandra storage: Adapt internal implementations to use origin visit status model representation.
Apr 14 2020, 10:47 AM · Storage manager, Data Model
ardumont added a revision to T2310: Make origin visits immutable: D2938: pg-storage: Adapt internal implementations to use origin visit status model representation.
Apr 14 2020, 10:47 AM · Storage manager, Data Model
ardumont added a revision to T2310: Make origin visits immutable: D2937: in_memory storage: Adapt internal implementations to use origin visit status model representation.
Apr 14 2020, 10:47 AM · Storage manager, Data Model
ardumont added a revision to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: D3012: retry: Make content_add endpoints maximize content writes to storage.
Apr 14 2020, 10:28 AM · Package Loader, Storage manager
ardumont added a revision to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: D2966: storage.buffer: Add a new clear_buffers operation for the buffer proxy.
Apr 14 2020, 10:25 AM · Package Loader, Storage manager
ardumont added a revision to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: D2973: package.loader: Insert consistently data to storage, clear buffer proxy state in case of error when skipping artifact.
Apr 14 2020, 10:25 AM · Package Loader, Storage manager
ardumont added a revision to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: D3014: package.loader: Flush and clear regularly internal proxy storage states.
Apr 14 2020, 10:21 AM · Package Loader, Storage manager

Apr 11 2020

ardumont renamed T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error from nixguix: Fails to finish to nixguix: Fails to finish as it's stuck in a loop up to memory error.
Apr 11 2020, 11:51 AM · Package Loader, Storage manager

Apr 9 2020

ardumont updated the task description for T2355: Make swh-journal independent from swh-storage or swh-objstorage.
Apr 9 2020, 4:35 PM · Object storage, Storage manager, Journal
ardumont added a comment to T2310: Make origin visits immutable.

We have a somewhat common (but fairly infrequent) pattern of visits crashing hard, and lingering forever in an ongoing state.

Apr 9 2020, 11:06 AM · Storage manager, Data Model
ardumont added a comment to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.

In the meantime (pending reviews), a new load was triggered without the proxy storage and all went fine.

Apr 9 2020, 9:39 AM · Package Loader, Storage manager

Apr 8 2020

ardumont updated the task description for T2332: Analyze hash collisions.
Apr 8 2020, 4:15 PM · Object storage, Storage manager
ardumont added a comment to T2332: Analyze hash collisions.

An interesting experiment: disabling the proxy buffer storage in the nixguix loader configuration.
The number of hash collisions dropped to 0 (no new event for that loader since yesterday around 6pm our time).

Apr 8 2020, 10:58 AM · Object storage, Storage manager
vlorentz added a revision to T2332: Analyze hash collisions: D2977: Prevent erroneous HashCollisions by using the same ctime for all rows..
Apr 8 2020, 10:53 AM · Object storage, Storage manager

Apr 7 2020

ardumont added a comment to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.

Heading for 2. for now, as the solution for 3. is still a pending question [2]

Apr 7 2020, 3:05 PM · Package Loader, Storage manager
ardumont updated the task description for T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error.
Apr 7 2020, 2:55 PM · Package Loader, Storage manager
olasd added a comment to T2310: Make origin visits immutable.

I'll start with a general reasoning about origin visit vs. origin visit state objects in our "conceptual" data model, as it was sprinkled throughout my comment initially.

Apr 7 2020, 12:43 PM · Storage manager, Data Model
ardumont added projects to T2352: nixguix: Fails to finish as it's stuck in a loop up to memory error: Core Loader, Storage manager.
Apr 7 2020, 11:56 AM · Package Loader, Storage manager
ardumont added a comment to T2310: Make origin visits immutable.

It looks like you misread what I meant: I was talking about a new OriginVisitUpdate with a snapshot "inconsistent" with the previous snapshot reported by the previous OriginVisitUpdate for the same visit.

Apr 7 2020, 11:38 AM · Storage manager, Data Model
vlorentz added a comment to T2310: Make origin visits immutable.
  • do we allow an OriginVisitUpdate(status='ongoing', snapshot=yyy) with the snapshot yyy not a superset of a previous update?

It doesn't make sense to have this, but I'm not sure we should care.

I think this is a rather simple check to implement, so I don't see why we shouldn't do it. Intrinsic robustness is always (if not overly complex) a good thing to add.
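
A minimal sketch of what such a check could look like, assuming (hypothetically) that each update's snapshot is available as a set of branch names, or `None` when no snapshot has been recorded yet. The function name and this representation are illustrative and are not taken from swh-storage.

```python
from typing import FrozenSet, Optional

# Hypothetical representation: a snapshot is reduced to its branch names.
Branches = Optional[FrozenSet[str]]


def is_consistent_update(previous: Branches, new: Branches) -> bool:
    """Return True if `new` may follow `previous` within the same visit.

    Assumed rule: once an update has reported a snapshot, every later
    update must report a snapshot that is a superset of it.
    """
    if previous is None:
        # Nothing was reported before, so any new value is acceptable.
        return True
    if new is None:
        # A snapshot was already reported; dropping it is inconsistent.
        return False
    return new >= previous


ok = is_consistent_update(frozenset({"master"}), frozenset({"master", "dev"}))
bad = is_consistent_update(frozenset({"master", "dev"}), frozenset({"master"}))
```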

Apr 7 2020, 11:25 AM · Storage manager, Data Model
douardda added a comment to T2310: Make origin visits immutable.

Thanks for the questions. I'm unsure about some questions and I replied as best
I could.

do we allow an OriginVisitUpdate(status='ongoing', snapshot=None)? what would
be the meaning of this?

Yes. It means "loading started, so no snapshot yet".
That sounds sensible ;)

Apr 7 2020, 10:56 AM · Storage manager, Data Model
douardda added a comment to T2310: Make origin visits immutable.

We currently don't have "created" (so no "start" either), but it would make sense to create it.

Regarding this model, a few questions come to my mind:

  • do we allow an OriginVisitUpdate(status='ongoing', snapshot=None)? what would be the meaning of this? or do we enforce one just after the created step to model the start transition?

This could mean these things:

  1. on a first update, to mean the visit was created (but we don't need it if we have a "created" state)
Apr 7 2020, 10:47 AM · Storage manager, Data Model

Apr 6 2020

ardumont added a comment to T2310: Make origin visits immutable.

(also, I agree with @vlorentz's faster first reply ;)

Apr 6 2020, 4:24 PM · Storage manager, Data Model
vlorentz added a comment to T2310: Make origin visits immutable.

The rest is only in origin_visit_update.

Apr 6 2020, 1:26 PM · Storage manager, Data Model
ardumont added a comment to T2310: Make origin visits immutable.

Thanks for the questions. I'm unsure about some questions and I replied as best
I could.

Apr 6 2020, 12:13 PM · Storage manager, Data Model
vlorentz added a comment to T2310: Make origin visits immutable.

We currently don't have "created" (so no "start" either), but it would make sense to create it.

Apr 6 2020, 12:04 PM · Storage manager, Data Model
douardda added a comment to T2310: Make origin visits immutable.

As I understand this, an origin visit, consisting of one OriginVisit object plus a list of OriginVisitUpdate objects, represents the process of visiting an origin to load its content into the archive.
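
To make that reading concrete, here is a small sketch under stated assumptions: the class names come from the discussion, but every field name and the `latest_update` helper are hypothetical and do not reflect the actual swh-model definitions. Both classes are frozen, so a visit's state only changes by appending new updates.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class OriginVisit:
    origin: str     # hypothetical field: URL of the visited origin
    visit: int      # hypothetical field: visit number for that origin
    date: datetime  # hypothetical field: when the visit was created


@dataclass(frozen=True)
class OriginVisitUpdate:
    origin: str
    visit: int
    date: datetime                  # when this update was recorded
    status: str                     # e.g. 'created' or 'ongoing'
    snapshot: Optional[str] = None  # None means "no snapshot yet"


def latest_update(updates: List[OriginVisitUpdate]) -> Optional[OriginVisitUpdate]:
    """Return the most recent update, or None if there is none."""
    return max(updates, key=lambda u: u.date, default=None)


url = "https://example.org/repo"  # hypothetical origin
visit = OriginVisit(url, 1, datetime(2020, 4, 6, 11, 0, tzinfo=timezone.utc))
updates = [
    OriginVisitUpdate(url, 1, datetime(2020, 4, 6, 11, 0, tzinfo=timezone.utc), "created"),
    OriginVisitUpdate(url, 1, datetime(2020, 4, 6, 11, 5, tzinfo=timezone.utc), "ongoing"),
    OriginVisitUpdate(url, 1, datetime(2020, 4, 6, 11, 9, tzinfo=timezone.utc), "ongoing", "yyy"),
]
current = latest_update(updates)
```

In this sketch the second update carries `status='ongoing'` with no snapshot, matching the "loading started, so no snapshot yet" interpretation discussed in the thread.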

Apr 6 2020, 11:00 AM · Storage manager, Data Model

Apr 3 2020

ardumont added a comment to T2332: Analyze hash collisions.

All in all, this task serves the purpose of being sure those exist.

Apr 3 2020, 7:22 PM · Object storage, Storage manager

Apr 2 2020

vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 3:05 PM · Storage manager, Data Model
vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 3:04 PM · Storage manager, Data Model
vlorentz added a comment to T2346: Decide on the semantics of origin-visit status(es).

@olasd So we agree to go with #1, right?

Apr 2 2020, 2:32 PM · Storage manager, Data Model
vlorentz reopened T2346: Decide on the semantics of origin-visit status(es) as "Open".
Apr 2 2020, 2:31 PM · Storage manager, Data Model
vlorentz reopened T2346: Decide on the semantics of origin-visit status(es), a subtask of T2310: Make origin visits immutable, as Open.
Apr 2 2020, 2:31 PM · Storage manager, Data Model
vlorentz closed T2346: Decide on the semantics of origin-visit status(es), a subtask of T2310: Make origin visits immutable, as Resolved.
Apr 2 2020, 2:31 PM · Storage manager, Data Model
vlorentz closed T2346: Decide on the semantics of origin-visit status(es) as Resolved.
Apr 2 2020, 2:31 PM · Storage manager, Data Model
vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 11:40 AM · Storage manager, Data Model
vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 11:38 AM · Storage manager, Data Model
vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 11:27 AM · Storage manager, Data Model
olasd added a comment to T2346: Decide on the semantics of origin-visit status(es).

I'd say, let's keep the metadata field for now, just to avoid migrating back and forth.

And if we want to pack it with lots of data, we can switch from semantic 1 to semantic 2 later, which shouldn't be too much trouble.

Apr 2 2020, 11:17 AM · Storage manager, Data Model
vlorentz added a comment to T2346: Decide on the semantics of origin-visit status(es).
In T2346#43055, @olasd wrote:

The only concern I have about removing the metadata field, is that at some point I'd like the "size" of the visit to enter into consideration in the feedback loop of the scheduler (T2345). A metadata field in the visit with the count of objects added (or even just a "visit score") could be a way of recording that info. It would also help the web frontend show the activity for a given repository.

Apr 2 2020, 11:05 AM · Storage manager, Data Model
olasd added a comment to T2346: Decide on the semantics of origin-visit status(es).

Thanks for recording this.

Apr 2 2020, 11:00 AM · Storage manager, Data Model
vlorentz updated the task description for T2346: Decide on the semantics of origin-visit status(es).
Apr 2 2020, 10:57 AM · Storage manager, Data Model
vlorentz renamed T2346: Decide on the semantics of origin-visit status(es) from Semantics of origin-visit updates to Decide on the semantics of origin-visit updates.
Apr 2 2020, 10:40 AM · Storage manager, Data Model
vlorentz triaged T2346: Decide on the semantics of origin-visit status(es) as Normal priority.
Apr 2 2020, 10:39 AM · Storage manager, Data Model
vlorentz renamed T2310: Make origin visits immutable from Mutability of origin visits to Make origin visits immutable.
Apr 2 2020, 10:32 AM · Storage manager, Data Model
ardumont updated the task description for T2310: Make origin visits immutable.
Apr 2 2020, 10:23 AM · Storage manager, Data Model

Apr 1 2020

ardumont updated the task description for T2310: Make origin visits immutable.
Apr 1 2020, 4:40 PM · Storage manager, Data Model
vlorentz updated the task description for T2310: Make origin visits immutable.
Apr 1 2020, 4:37 PM · Storage manager, Data Model
vlorentz closed T2343: Decide where/how to store extrinsinc metadata, a subtask of T2075: Implement metadata authority specification, as Resolved.
Apr 1 2020, 2:29 PM · Storage manager, Metadata workflow
vlorentz closed T2343: Decide where/how to store extrinsinc metadata as Resolved.
Apr 1 2020, 2:29 PM · System administration, Storage manager
vlorentz added a comment to T2343: Decide where/how to store extrinsinc metadata.

After discussion on IRC, we're opting for option 2.

Apr 1 2020, 2:29 PM · System administration, Storage manager
vlorentz updated the task description for T2343: Decide where/how to store extrinsinc metadata.
Apr 1 2020, 2:29 PM · System administration, Storage manager
ardumont abandoned D2879: storage: Refactor internal implementations to use "origin visit update" model representation.

Closed in favor of D2937, D2938, D2939

Apr 1 2020, 12:18 PM · Storage manager
ardumont updated the summary of D2879: storage: Refactor internal implementations to use "origin visit update" model representation.
Apr 1 2020, 12:12 PM · Storage manager
ardumont updated the summary of D2879: storage: Refactor internal implementations to use "origin visit update" model representation.
Apr 1 2020, 12:12 PM · Storage manager
swh-public-ci added a comment to D2879: storage: Refactor internal implementations to use "origin visit update" model representation.

Build is green

Apr 1 2020, 12:11 PM · Storage manager
ardumont updated the diff for D2879: storage: Refactor internal implementations to use "origin visit update" model representation.

Rebase on latest master

Apr 1 2020, 12:05 PM · Storage manager
ardumont updated the summary of D2879: storage: Refactor internal implementations to use "origin visit update" model representation.
Apr 1 2020, 12:01 PM · Storage manager
ardumont updated the summary of D2879: storage: Refactor internal implementations to use "origin visit update" model representation.
Apr 1 2020, 12:00 PM · Storage manager
ardumont updated the summary of D2879: storage: Refactor internal implementations to use "origin visit update" model representation.
Apr 1 2020, 11:33 AM · Storage manager

Mar 31 2020

vlorentz edited projects for T2343: Decide where/how to store extrinsinc metadata, added: System administration; removed Metadata workflow.
Mar 31 2020, 4:45 PM · System administration, Storage manager
vlorentz raised the priority of T2343: Decide where/how to store extrinsinc metadata from Normal to High.
Mar 31 2020, 4:45 PM · System administration, Storage manager
vlorentz updated the task description for T2343: Decide where/how to store extrinsinc metadata.
Mar 31 2020, 4:45 PM · System administration, Storage manager
vlorentz triaged T2343: Decide where/how to store extrinsinc metadata as Normal priority.
Mar 31 2020, 4:44 PM · System administration, Storage manager
vlorentz updated the task description for T2075: Implement metadata authority specification.
Mar 31 2020, 11:38 AM · Storage manager, Metadata workflow
zack updated the task description for T2075: Implement metadata authority specification.
Mar 31 2020, 11:33 AM · Storage manager, Metadata workflow

Mar 30 2020

vlorentz claimed T2075: Implement metadata authority specification.
Mar 30 2020, 1:13 PM · Storage manager, Metadata workflow

Mar 25 2020

vlorentz added a comment to T2310: Make origin visits immutable.

Current plan:

Mar 25 2020, 2:23 PM · Storage manager, Data Model