Origin-GitHub
Tag · Active · Public

Members

  • This project does not have any members.

Watchers

  • This project does not have any watchers.

Details

Description

Projects related to GitHub

Recent Activity

Jan 8 2023

gitlab-migration closed T4728: Add monitoring of API token usage as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:25 PM · Metrics/monitoring, Origin-GitHub
gitlab-migration closed T4344: Many NotFound repositories on GitHub since 2022-06-15 or 2022-06-16 as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:24 PM · Origin-GitHub
gitlab-migration closed T2207: Improve ingestion efficiency as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:22 PM · Origin-GitLab, Origin-GitHub, Roadmap 2020
gitlab-migration changed the status of T3542: Decide what metadata we want to / can collect from GitHub from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:02 PM · Origin-GitHub, Extrinsic metadata
gitlab-migration changed the status of T3273: Use "fork" relationships to speed-up initial load of large repositories from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:02 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
gitlab-migration changed the status of T1740: fetch extrinsic origin metadata from GitHub, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, from Resolved to Migrated.
Jan 8 2023, 9:59 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
gitlab-migration changed the status of T1740: fetch extrinsic origin metadata from GitHub from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 9:59 PM · Metadata workflow, Origin-GitHub
gitlab-migration changed the status of T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders, a subtask of T1740: fetch extrinsic origin metadata from GitHub, from Resolved to Migrated.
Jan 8 2023, 9:59 PM · Metadata workflow, Origin-GitHub
gitlab-migration changed the status of T1344: Write specs about metadata workflow, a subtask of T1740: fetch extrinsic origin metadata from GitHub, from Resolved to Migrated.
Jan 8 2023, 9:58 PM · Metadata workflow, Origin-GitHub
gitlab-migration changed the status of T382: stay up to date w.r.t. new GitHub repositories from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 9:56 PM · Restricted Project, General, Origin-GitHub
gitlab-migration changed the status of T66: clone and load fork GitHub repositories, a subtask of T382: stay up to date w.r.t. new GitHub repositories, from Resolved to Migrated.
Jan 8 2023, 9:55 PM · Restricted Project, General, Origin-GitHub
gitlab-migration closed T3655: loader git: enable global deduplication of head branches before fetching them, a subtask of T2207: Improve ingestion efficiency, as Migrated.
Jan 8 2023, 5:03 PM · Origin-GitLab, Origin-GitHub, Roadmap 2020
gitlab-migration closed T846: Some objects from the original GitHub import have never actually been imported., a subtask of T2207: Improve ingestion efficiency, as Migrated.
Jan 8 2023, 4:59 PM · Origin-GitLab, Origin-GitHub, Roadmap 2020
gitlab-migration changed the status of T4219: Investigate why GitHub fork detection did not bring a speed-up, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, from Resolved to Migrated.
Jan 8 2023, 4:36 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
gitlab-migration changed the status of T4219: Investigate why GitHub fork detection did not bring a speed-up from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 4:36 PM · Origin-GitHub, Git loader
gitlab-migration changed the status of T4186: Allow loaders to fetch extrinsic metadata, a subtask of T1740: fetch extrinsic origin metadata from GitHub, from Resolved to Migrated.
Jan 8 2023, 4:36 PM · Metadata workflow, Origin-GitHub
gitlab-migration changed the status of T3544: Deal with GitHub removing support for git:// URLs from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 4:35 PM · Origin-GitHub, Git loader
gitlab-migration changed the status of T3544: Deal with GitHub removing support for git:// URLs, a subtask of T2207: Improve ingestion efficiency, from Resolved to Migrated.
Jan 8 2023, 4:35 PM · Origin-GitLab, Origin-GitHub, Roadmap 2020
gitlab-migration changed the status of T313: Retrieve fork information for github repositories in swh.lister.github, a subtask of T382: stay up to date w.r.t. new GitHub repositories, from Wontfix to Migrated.
Jan 8 2023, 4:18 PM · Restricted Project, General, Origin-GitHub

Dec 15 2022

vlorentz added a revision to T4728: Add monitoring of API token usage: D8959: github: Export statsd metrics about API requests and token usage.
Dec 15 2022, 12:22 PM · Metrics/monitoring, Origin-GitHub
vlorentz triaged T4728: Add monitoring of API token usage as Normal priority.
Dec 15 2022, 12:18 PM · Metrics/monitoring, Origin-GitHub

Dec 1 2022

vlorentz closed T3273: Use "fork" relationships to speed-up initial load of large repositories as Resolved.
Dec 1 2022, 4:18 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz closed T4219: Investigate why GitHub fork detection did not bring a speed-up, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, as Resolved.
Dec 1 2022, 4:18 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz closed T4219: Investigate why GitHub fork detection did not bring a speed-up as Resolved.
Dec 1 2022, 4:18 PM · Origin-GitHub, Git loader

Nov 4 2022

olasd added a comment to T4219: Investigate why GitHub fork detection did not bring a speed-up.

swh.loader.git 2.1.0 has now been deployed on all workers.

Nov 4 2022, 9:25 PM · Origin-GitHub, Git loader

Nov 3 2022

olasd added a revision to T4219: Investigate why GitHub fork detection did not bring a speed-up: D8808: Eagerly populate the set of local heads in RepoRepresentation.__init__.
Nov 3 2022, 5:28 PM · Origin-GitHub, Git loader

Oct 19 2022

gitlab-migration changed the status of T4242: Deployed loader.git v1.8, a subtask of T4219: Investigate why GitHub fork detection did not bring a speed-up, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Origin-GitHub, Git loader
gitlab-migration changed the status of T4225: Deploy a more recent version of prometheus-statsd-exporter on all nodes, a subtask of T4219: Investigate why GitHub fork detection did not bring a speed-up, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Origin-GitHub, Git loader
gitlab-migration changed the status of T4206: prod: Deploy metadata loader v0.0.2, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
gitlab-migration changed the status of T4206: prod: Deploy metadata loader v0.0.2, a subtask of T1740: fetch extrinsic origin metadata from GitHub, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Metadata workflow, Origin-GitHub
gitlab-migration changed the status of T4193: staging: Deploy metadata loader, a subtask of T1740: fetch extrinsic origin metadata from GitHub, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Metadata workflow, Origin-GitHub

Jun 21 2022

vlorentz triaged T4344: Many NotFound repositories on GitHub since 2022-06-15 or 2022-06-16 as Normal priority.
Jun 21 2022, 10:08 AM · Origin-GitHub

May 30 2022

vlorentz added a parent task for T3273: Use "fork" relationships to speed-up initial load of large repositories: T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.
May 30 2022, 3:41 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

May 20 2022

vlorentz added revisions to T4219: Investigate why GitHub fork detection did not bring a speed-up: D7873: Add an unweighted average for filtered_objects + fix existing metric name, D7876: Log summary of filtered objects in store_data.
May 20 2022, 3:54 PM · Origin-GitHub, Git loader
vlorentz added a revision to T4219: Investigate why GitHub fork detection did not bring a speed-up: D7871: Add metrics in store_data on ratios of objects already stored.
May 20 2022, 1:48 PM · Origin-GitHub, Git loader
vlorentz added a comment to T4219: Investigate why GitHub fork detection did not bring a speed-up.

I did some profiling earlier this week and found that, when incrementally loading a Linux fork we had already visited:

May 20 2022, 10:55 AM · Origin-GitHub, Git loader

May 16 2022

vlorentz added a comment to T4219: Investigate why GitHub fork detection did not bring a speed-up.

This indicates we should load incrementally from the last snapshot of the origin AND the last snapshot of its parent, so we would capture these new commits without reloading half of the parent's history. As @olasd puts it, "that's a (very) lightweight way of doing global deduplication".

May 16 2022, 3:33 PM · Origin-GitHub, Git loader
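
The idea in the comment above (treating the heads of both the origin's last snapshot and its parent's last snapshot as already known) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Snapshot` type, `known_heads`, and this simplified `determine_wants` are hypothetical stand-ins and not the actual swh.loader.git code, and the object ids are made-up placeholders.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical, simplified stand-in for a snapshot: branch name -> target object id.
Snapshot = Dict[bytes, bytes]


def known_heads(base_snapshots: Iterable[Snapshot]) -> Set[bytes]:
    """Collect the branch targets of every base snapshot into one set."""
    heads: Set[bytes] = set()
    for snapshot in base_snapshots:
        heads.update(snapshot.values())
    return heads


def determine_wants(
    remote_refs: Dict[bytes, bytes], base_snapshots: Iterable[Snapshot]
) -> List[bytes]:
    """Return the remote targets that are not a head of any base snapshot."""
    local = known_heads(base_snapshots)
    return sorted({target for target in remote_refs.values() if target not in local})


# Made-up example: a fork whose last visit predates most of its parent's history.
origin_last = {b"refs/heads/main": b"a1"}
parent_last = {b"refs/heads/main": b"b2", b"refs/heads/dev": b"c3"}
remote = {
    b"refs/heads/main": b"b2",
    b"refs/heads/dev": b"c3",
    b"refs/heads/feature": b"d4",
}

wants_origin_only = determine_wants(remote, [origin_last])
wants_with_parent = determine_wants(remote, [origin_last, parent_last])
```

In this toy data, using only the origin's snapshot leaves three targets to request, while adding the parent's snapshot leaves just the fork-specific one. The sketch compares head ids only; it does not model the object graph behind each head.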

May 13 2022

ardumont added a subtask for T4219: Investigate why GitHub fork detection did not bring a speed-up: T4242: Deployed loader.git v1.8.
May 13 2022, 6:01 PM · Origin-GitHub, Git loader
olasd closed T4225: Deploy a more recent version of prometheus-statsd-exporter on all nodes, a subtask of T4219: Investigate why GitHub fork detection did not bring a speed-up, as Resolved.
May 13 2022, 4:20 PM · Origin-GitHub, Git loader
vlorentz added a revision to T3273: Use "fork" relationships to speed-up initial load of large repositories: D7831: Use all base snapshots in determine_wants().
May 13 2022, 3:23 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a revision to T4219: Investigate why GitHub fork detection did not bring a speed-up: D7831: Use all base snapshots in determine_wants().
May 13 2022, 3:23 PM · Origin-GitHub, Git loader
vlorentz updated subscribers of T4219: Investigate why GitHub fork detection did not bring a speed-up.

https://grafana.softwareheritage.org/d/FqGC4zu7z/vlorentz-loader-metrics?orgId=1&var-environment=production&var-interval=1h&var-visit_type=git&var-has_parent_origins=True shows we spend a considerable amount of time loading data from git repositories with an existing visit + a parent:

May 13 2022, 3:21 PM · Origin-GitHub, Git loader

May 10 2022

vlorentz added a comment to T3273: Use "fork" relationships to speed-up initial load of large repositories.

Currently can't do it on GitLab while logged out: https://gitlab.com/gitlab-org/gitlab/-/issues/361952

May 10 2022, 4:13 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

May 6 2022

olasd changed the status of T4225: Deploy a more recent version of prometheus-statsd-exporter on all nodes, a subtask of T4219: Investigate why GitHub fork detection did not bring a speed-up, from Open to Work in Progress.
May 6 2022, 5:00 PM · Origin-GitHub, Git loader

May 3 2022

vlorentz removed a subtask for T3273: Use "fork" relationships to speed-up initial load of large repositories: T2202: Collect extrinsic metadata.
May 3 2022, 11:16 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a parent task for T1740: fetch extrinsic origin metadata from GitHub: T3273: Use "fork" relationships to speed-up initial load of large repositories.
May 3 2022, 11:16 AM · Metadata workflow, Origin-GitHub
vlorentz added subtasks for T3273: Use "fork" relationships to speed-up initial load of large repositories: T1740: fetch extrinsic origin metadata from GitHub, T2202: Collect extrinsic metadata.
May 3 2022, 11:16 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a subtask for T3273: Use "fork" relationships to speed-up initial load of large repositories: T4219: Investigate why GitHub fork detection did not bring a speed-up.
May 3 2022, 11:15 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a parent task for T4219: Investigate why GitHub fork detection did not bring a speed-up: T3273: Use "fork" relationships to speed-up initial load of large repositories.
May 3 2022, 11:15 AM · Origin-GitHub, Git loader
vlorentz closed T1740: fetch extrinsic origin metadata from GitHub as Resolved.
May 3 2022, 11:08 AM · Metadata workflow, Origin-GitHub