buffer: add a threshold for the number of directory entries in one batch

Description

The size of individual directories is essentially unbounded. This means
that, when the buffer storage is used as a way of limiting memory use
for an ingestion process, it is still possible to go beyond the expected
memory use when adding a batch of (very) large directories.

The duration of the database operation for directory_add is also
commensurate with the number of directory entries added in a batch, so
using the buffer proxy to limit the time that individual database
operations take was not effective.

Adding a threshold on the cumulative number of directory entries per
batch makes this overuse of memory and of database transaction time much
less likely.
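
As an illustration only, the idea can be sketched as a buffer that
flushes when either the number of buffered directories or the cumulative
number of their entries reaches a threshold. This is not the actual
buffer proxy code: the class names, the parameter names
(max_directories, max_entries), their default values and the flush
callback are all hypothetical.

```python
from typing import Callable, List, Sequence


class Directory:
    """Hypothetical stand-in for a directory object; only its entries matter here."""

    def __init__(self, entries: Sequence[str]) -> None:
        self.entries = list(entries)


class DirectoryBuffer:
    """Illustrative buffer that flushes on either of two thresholds."""

    def __init__(
        self,
        flush: Callable[[List[Directory]], None],  # e.g. a wrapper around directory_add
        max_directories: int = 1000,  # assumed limit on directories per batch
        max_entries: int = 10000,  # assumed limit on cumulative entries per batch
    ) -> None:
        self._flush = flush
        self.max_directories = max_directories
        self.max_entries = max_entries
        self._pending: List[Directory] = []
        self._pending_entries = 0

    def add(self, directory: Directory) -> None:
        """Buffer one directory, flushing if either threshold is reached."""
        self._pending.append(directory)
        self._pending_entries += len(directory.entries)
        if (
            len(self._pending) >= self.max_directories
            or self._pending_entries >= self.max_entries
        ):
            self.flush_pending()

    def flush_pending(self) -> None:
        """Send the buffered directories as one batch and reset the counters."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._pending_entries = 0
        self._flush(batch)


# Usage: with a low entry threshold, two small directories trigger a flush
# long before the directory-count threshold is reached.
batches: List[List[Directory]] = []
buf = DirectoryBuffer(flush=batches.append, max_directories=10, max_entries=5)
buf.add(Directory(["a", "b"]))
buf.add(Directory(["c", "d", "e"]))
```

In this sketch a single directory with more entries than max_entries is
still sent, in a batch that ends with it, so the threshold only bounds
how far entries accumulate across directories. That matches the "much
less likely" wording above rather than a hard guarantee.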