Skip to content

Cleanup

After an ingest has been successfully recorded, the cleanup worker does the following:

  • Deletes all of the interim ingest data from Redis
  • Deletes all of the temporary ingest files from the staging bucket
  • Deletes the bag (the tar file) from the depositor's receiving bucket
  • Marks the ingest WorkItem as complete in Registry

This is the only ingest work with no "next" queue. When it's done, it does not push the ingest WorkItem ID into another NSQ topic. It simply marks it as complete in the ingest09_cleanup topic, and that's the last NSQ hears of it.

Note

Cleanup does not run on ingests that fail due to transient errors. Those ingests can and should be requeued through the Registry when transient errors pass. (They are usually network errors or "service unavailable" errors.)

Resources

This worker uses little CPU and memory. It may issue a lot of deletion requests to the staging bucket, but even with those, it uses little network I/O.

External Services

Service Function
S3 Staging Bucket Worker deletes all temp files related to this ingest from the staging bucket.
S3 Receiving Buckets Worker deletes the tar file uploaded by the depositor from the receiving bucket.
Redis Worker deletes all metadata related to this ingest from Redis.
Registry Source of WorkItem record describing work to be done. Worker marks the ingest WorkItem complete.
NSQ Worker marks the NSQ task complete.

Source Files

Worker Service Files Definition
Cleanup Ingest Task
Worker
App
Cleans up all of the temporary resources created during the ingest process and deletes ingested bags from receiving buckets.