
Define (and implement) scheduler performance metrics
Closed, Migrated (Edits Locked)

Description

To be able to compare the relative performance of different scheduler implementations and policies, among one another as well as over time, we need to actually define, and implement, a set of performance metrics for the scheduler(s).

Event Timeline

olasd changed the task status from Open to Work in Progress. Jan 18 2021, 2:17 PM
olasd triaged this task as High priority.
olasd created this task.
olasd moved this task from Backlog to in-progress on the Sprint 2021 01 board.

Some potentially interesting and "easy" metrics:

  • "origin coverage": Number of origins that have never been visited (lower is better)
  • "visit usefulness": Number of uneventful visits per unit of time (lower is better)
  • "origin lag": (average?) Time between last origin activity and loading time (lower is better)
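
A minimal sketch of how these three metrics could be computed, assuming simple in-memory records. The `Origin` and `Visit` classes, their fields, and the choice of "per hour" as the unit of time are illustrative assumptions, not the scheduler's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass
class Visit:  # hypothetical record, for illustration only
    origin: str
    date: datetime
    eventful: bool  # True if the visit loaded something new


@dataclass
class Origin:  # hypothetical record, for illustration only
    url: str
    last_activity: Optional[datetime] = None  # last known upstream change
    last_visit: Optional[datetime] = None  # None if never visited


def origin_coverage(origins: Sequence[Origin]) -> int:
    """Number of origins that have never been visited (lower is better)."""
    return sum(1 for o in origins if o.last_visit is None)


def visit_usefulness(visits: Sequence[Visit], start: datetime, end: datetime) -> float:
    """Uneventful visits per hour within [start, end) (lower is better)."""
    hours = (end - start).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("end must be after start")
    uneventful = sum(1 for v in visits if not v.eventful and start <= v.date < end)
    return uneventful / hours


def origin_lag(origins: Sequence[Origin]) -> Optional[timedelta]:
    """Mean time between last activity and the visit that loaded it.

    Assumption: only origins visited at or after their last known activity
    contribute; the others have not been loaded yet.
    """
    lags = [
        o.last_visit - o.last_activity
        for o in origins
        if o.last_visit is not None
        and o.last_activity is not None
        and o.last_visit >= o.last_activity
    ]
    if not lags:
        return None
    return sum(lags, timedelta()) / len(lags)
```

Taking the mean in `origin_lag` is only one answer to the "(average?)" question above; a median or a high percentile would be less sensitive to a few very stale origins.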

More speculative metrics:

  • optimizing for "visit regularity/smoothness/smallness": sum of the squares (or of any other superlinear function) of the duration of each visit, per unit of time. In combination with the "visit usefulness" metric, this could give us a sense of how far we're lagging behind any given origin: a shorter (eventful) visit means that the archive is lagging less behind that origin.
  • "origins with pending changes": Number of origins where last_visit < last_activity (lower is better)
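
A rough sketch of both speculative metrics, under stated assumptions: origins are passed as hypothetical `(last_visit, last_activity)` pairs, a never-visited origin with known activity is counted as pending, and the function names and the `power` parameter are illustrative only:

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical input shape: (last_visit, last_activity) for each origin.
OriginTimes = Tuple[Optional[datetime], Optional[datetime]]


def origins_with_pending_changes(origins: Iterable[OriginTimes]) -> int:
    """Number of origins where last_visit < last_activity (lower is better).

    Assumption: an origin that was never visited but has known activity
    is counted as pending.
    """
    return sum(
        1
        for last_visit, last_activity in origins
        if last_activity is not None
        and (last_visit is None or last_visit < last_activity)
    )


def visit_smoothness(
    durations: Iterable[timedelta], window: timedelta, power: float = 2.0
) -> float:
    """Sum of duration**power (in seconds) per second of the observation window.

    With power > 1 the function is superlinear, so a few long visits cost
    more than many short visits of the same total duration.
    """
    window_seconds = window.total_seconds()
    if window_seconds <= 0:
        raise ValueError("window must be positive")
    return sum(d.total_seconds() ** power for d in durations) / window_seconds
```

For the same total visit time, two 20-second visits score lower than one 10-second and one 30-second visit, which is the "smoothness" the metric is meant to reward.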

Thanks, this looks like a good starting point.

  • "'outdatedest' origin": excluding disabled origins and origins visited after their last_activity (if any), the max(current_time - last_visit) (lower is better)
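
One possible reading of this metric as code, interpreting "outdatedest" as the remaining origin with the largest current_time - last_visit. The `SchedulerOrigin` class and its `enabled` field are hypothetical, and skipping never-visited origins (for which the difference is undefined) is an assumption:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass
class SchedulerOrigin:  # hypothetical record, for illustration only
    url: str
    enabled: bool = True
    last_visit: Optional[datetime] = None
    last_activity: Optional[datetime] = None


def outdatedest_origin_age(
    origins: Iterable[SchedulerOrigin], now: datetime
) -> Optional[timedelta]:
    """Largest (now - last_visit) among the origins kept (lower is better).

    Excluded: disabled origins, and origins visited after their
    last_activity when that activity is known.
    Assumption: never-visited origins are skipped as well.
    """
    ages = [
        now - o.last_visit
        for o in origins
        if o.enabled
        and o.last_visit is not None
        and not (o.last_activity is not None and o.last_visit > o.last_activity)
    ]
    return max(ages) if ages else None
```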