We currently use an incrementing integer to uniquely identify origins.
This does not work well with a distributed database (e.g. Cassandra), and, unlike most identifiers in the archive, it is not an intrinsic identifier.
So we should define a new identifier for origins. Current options:
- A 2-tuple: (type, url). Pros: useful information can be derived from that identifier without an API request.
- A hash of the type and url. Pros: fixed-size and compact.
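
To make the two options concrete, here is a minimal sketch of what each identifier could look like. The names `OriginKey` and `origin_hash`, the choice of SHA-1, and the NUL separator between type and url are all assumptions made for illustration; none of them is decided here.

```python
import hashlib
from typing import NamedTuple


class OriginKey(NamedTuple):
    """Option 1 (hypothetical name): the identifier is the (type, url) pair itself."""
    type: str
    url: str


def origin_hash(type_: str, url: str) -> str:
    """Option 2 (hypothetical name): a fixed-size hash of the type and url.

    Assumptions: SHA-1 as the hash function, UTF-8 encoding, and a NUL byte
    separating the two fields so that ("a", "bc") and ("ab", "c") do not
    produce the same input bytes.
    """
    data = type_.encode("utf-8") + b"\x00" + url.encode("utf-8")
    return hashlib.sha1(data).hexdigest()


# Example usage with a made-up origin.
key = OriginKey(type="git", url="https://example.org/repo.git")
digest = origin_hash(key.type, key.url)
```

With the tuple, the type and url can be read directly from the identifier. With the hash, the identifier always has the same length (40 hex characters under the SHA-1 assumption), but the type and url have to be looked up separately.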