The GitHub API lets us check when a repo was last modified; see the updated_at/pushed_at fields in this example.
Given how significant GitHub is in our archive coverage, it makes sense to add a forge-specific optimization that skips loading repos for which those timestamps are older than our last visit of the corresponding origins.
(Note: I'm not exactly sure what the difference between the two fields is; I'm assuming pushed_at is for git push and updated_at for metadata changes. But I think even the most conservative approach, i.e., skipping only if both fields are older than our last visit, would be a good start.)
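To make the conservative variant concrete, here is a minimal sketch in Python. It is not existing loader code: the function names (parse_github_timestamp, should_skip_load, fetch_repo_info) are made up for illustration, and I'm assuming the last visit date is available as a timezone-aware datetime. The only GitHub-specific parts are the repos/{owner}/{repo} endpoint and the two timestamp fields mentioned above.

```python
import json
import urllib.request
from datetime import datetime, timezone


def parse_github_timestamp(value: str) -> datetime:
    # GitHub returns UTC timestamps such as "2011-01-26T19:14:43Z".
    # Replace the trailing "Z" so fromisoformat() accepts it on older Pythons.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def should_skip_load(repo_info: dict, last_visit: datetime) -> bool:
    """Conservative check: skip only if BOTH timestamps predate last_visit.

    last_visit must be timezone-aware; comparing it with the parsed
    (aware) timestamps would otherwise raise TypeError.
    """
    timestamps = [repo_info.get("pushed_at"), repo_info.get("updated_at")]
    if any(ts is None for ts in timestamps):
        return False  # missing data: play it safe and load the repo
    return all(parse_github_timestamp(ts) < last_visit for ts in timestamps)


def fetch_repo_info(owner: str, repo: str) -> dict:
    # Hypothetical helper: one unauthenticated call to the repository endpoint.
    # A real loader would presumably authenticate to avoid rate limiting.
    url = f"https://api.github.com/repos/{owner}/{repo}"
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

The decision function is kept separate from the HTTP call so it can be tested without network access. Something like should_skip_load(fetch_repo_info(owner, repo), last_visit) would then gate the actual load.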
Assuming that doing an API call at the loader level is faster than actually trying to load the repo (which seems obvious to me, but it's not like I have actually benchmarked it *g*), this optimization should help a lot in clearing our backlog of repos to re-visit, for all GitHub repos that haven't changed.
I'm not sure where this forge-specific optimization belongs, but if it works it's something we can extend in the future to, e.g., GitLab.