Code Structure
The preservation services code repository includes the following directories and files:
Name | Description |
---|---|
.github | Contains config files for GitHub services like Dependabot. |
apps | Contains Go source files with a main() function that can be compiled into executable binaries. There's one directory and one file for each app. |
audit | Contains code to run a quick, lightweight audit of files in preservation storage. |
bagit | Contains code used to build, parse, and validate BagIt packages. |
bin | Contains Mac and Linux binaries for Minio, NSQ, and Redis. These are used in integration and end-to-end (e2e) tests. |
cfn | Contains a template (cfn-preserv-cluster.tmpl) and a YAML file (cfn-preserv-cluster.yml) for deploying preservation services to AWS's Fargate/ECS service. |
constants | Contains constants used throughout the codebase. |
deletion | Contains code used by the deletion worker to permanently remove files from preservation storage. |
docker | Probably no longer used. Consider deleting this. |
e2e | Contains end-to-end tests. See End to End Tests. |
fixity | Contains code used by the fixity worker to run fixity checks. |
ingest | Contains code used by ingest workers to ingest bags. |
models | Contains a number of data models. See the following three entries for details. |
models/common | Contains utility models used throughout preservation services. The most important of these are Config, which provides access to configuration settings, and Context, which provides access to essential services such as logging, Registry, Redis, NSQ, S3, Glacier, and Wasabi. |
models/registry | Contains models that match core Registry models (intellectual object, generic file, work item, checksum, etc.). These models include extra fields for metadata that is used during the ingest process but that Registry itself doesn't care about. Data in these models is stored in Redis, so it can be shared among workers. |
models/service | Contains models specific to preservation services' internal processing. Data in these models is stored in Redis, so it can be shared among workers. |
network | Contains code to access external network services, including NSQ, Redis, Registry and Glacier. |
platform | Includes platform-specific code for POSIX and Windows operating systems. (This code is currently unnecessary, but may become necessary if we decide to share bagging and validation code with Windows in the future.) |
profiles | Contains BagIt profiles supported by Preserv's ingest process. Currently, that's limited to the latest versions of the APTrust and BTR (Beyond the Repository) profiles. The profiles are in DART format, which is richer and more specific than the standard format. DART can convert these back to standard format using its export function. This directory also contains the default.sig file used by Siegfried, which is the core of the format identification worker. |
restoration | Contains code used by the Glacier, file, and object restoration workers. |
scripts | Contains two Ruby scripts. build.rb builds the entire suite of Preserv apps, writing the executables into bin/go-bin. test.rb runs unit, integration, and end-to-end tests, which require spinning up a number of locally running external services. See the testing documentation for details. |
testdata | Contains a number of files used in unit/integration/e2e tests. These include files to be bagged, bags to be validated, and JSON files in e2e_results to be matched against the expected outcomes of end-to-end tests. |
util | Contains utility functions used throughout the codebase. |
workers | Contains code to harness the contents of the deletion, fixity, ingest, and restoration directories into usable, NSQ-connected workers. These workers are then loaded by the apps in the apps directory. See Anatomy of a Worker for details on how these pieces fit together. |
.env | These are settings files for different environments (dev, test, integration, etc.). See Settings or the comments in the files themselves for info about which settings are available and what they mean. |
Dockerfile.build | This file contains instructions for building Docker containers in which to run preservation services. Use make to build the containers, as described on the Docker page. |
Makefile | Includes commands to build and publish the Docker containers, and to update the CloudFormation template. See Docker. |
docker-compose.yml | This is a historical artifact used in early, proof-of-concept Docker builds. We may use it as a reference if we move to Kubernetes. |
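
The layering described above (domain code in directories such as fixity or ingest, wrapped by a harness in workers, then started by a main() in apps) can be pictured with a minimal Go sketch. Every name here (Processor, FixityChecker, Worker, HandleMessage, and the fixity_check topic) is hypothetical and chosen only for illustration. None of it is taken from the actual codebase, and the real workers receive their messages from NSQ rather than from a direct function call.

```go
package main

import (
	"fmt"
	"strconv"
)

// Processor is a hypothetical stand-in for the domain logic that lives in
// directories such as deletion, fixity, ingest, or restoration.
type Processor interface {
	Process(workItemID int64) (string, error)
}

// FixityChecker is an illustrative Processor. It performs no real fixity
// check; it only reports which work item it was asked to handle.
type FixityChecker struct{}

func (FixityChecker) Process(workItemID int64) (string, error) {
	return fmt.Sprintf("fixity check complete for work item %d", workItemID), nil
}

// Worker is a hypothetical harness of the kind the workers directory
// provides: it ties a topic name to a Processor.
type Worker struct {
	Topic     string
	Processor Processor
}

// HandleMessage parses a message body that is assumed to contain a work
// item ID and passes that ID to the Processor.
func (w *Worker) HandleMessage(body []byte) (string, error) {
	id, err := strconv.ParseInt(string(body), 10, 64)
	if err != nil {
		return "", fmt.Errorf("topic %s: invalid work item id %q: %w", w.Topic, body, err)
	}
	return w.Processor.Process(id)
}

// main plays the role of a file in the apps directory: it constructs one
// worker and hands it a single message.
func main() {
	worker := &Worker{Topic: "fixity_check", Processor: FixityChecker{}}
	result, err := worker.HandleMessage([]byte("42"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

Keeping the domain logic behind a small interface means the same harness can wrap deletion, fixity, ingest, or restoration code, and the domain code can be unit tested without a running message queue.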