Project Description

SIPs in a Presidential Timeline – NARA DATR

Impact

  • Considerable cost savings
  • Clean, intact data
  • Compliance with NARA and federal standards
  • Excellent security

Overview

NARA needed to fully validate the archival data from President Barack Obama’s (44) administration before transferring it to its new cloud-based home, the Executive Office of the President (EOP) 44. Softek answered this critical need by providing detailed validation of all elements and submission ingest packages (SIPs) using a gateway, the Data Transport and Control (DATR) environment.

Challenge

NARA needed to transition the Obama Administration’s paper and digital archives to a newly modernized, more secure cloud-based platform. The Obama Library worked diligently to digitize 30 million paper textual records (DTR). Each DTR had to be formatted into submission ingest packages (SIPs) for compliance with NARA’s governance model. To achieve this, NARA needed help to:

  • Define a SIP specification,
  • Coordinate transfer and acceptance of SIPs,
  • Validate SIPs against the SIP specification, and
  • Make any required changes to ensure the accessibility of each DTR in the EOP.

These activities take place within the DATR environment – a safe space for data validation, inventory, and virus control prior to entering Dev/Test. Ensuring transmission and ingestion with 100 percent integrity was essential.

Solution

Softek designed and deployed the DATR dev/test environment to closely mirror that of the EOP 44 cloud dev/test and allow for isolation and validation of incoming SIPs. Validation included:

  • Initial compliance,
  • Anti-virus screening,
  • Fixity and inventory,
  • Archival arrangement,
  • Data format and structure validation, and
  • Image attribute validation.
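
As an illustration of the fixity and inventory step listed above, here is a minimal sketch in Python. It is not Softek's implementation: the manifest layout (relative path mapped to a SHA-256 hex digest), the choice of SHA-256, and the function names are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity_and_inventory(sip_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Compare a SIP directory against its manifest (hypothetical format).

    Returns a list of human-readable issues; an empty list means every
    listed file is present and unchanged, and no unlisted file exists.
    """
    issues = []
    found = {
        p.relative_to(sip_dir).as_posix()
        for p in sip_dir.rglob("*")
        if p.is_file()
    }
    # Inventory and fixity: each listed file must exist with a matching checksum.
    for rel_path, expected in sorted(manifest.items()):
        if rel_path not in found:
            issues.append(f"missing: {rel_path}")
        elif sha256_of(sip_dir / rel_path) != expected:
            issues.append(f"checksum mismatch: {rel_path}")
    # Inventory: files on disk that the manifest does not list.
    for rel_path in sorted(found - set(manifest)):
        issues.append(f"unexpected: {rel_path}")
    return issues
```

Returning a list of issues rather than a single pass/fail flag lets the same check feed an issue summary for the package's sender.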

Softek placed DATR in an optimized AWS GovCloud environment between NARA and the Obama Foundation, and included development and test environments, tools, scripts, and services for conducting automated transfer, acceptance, and quality controls. We worked with the Obama Foundation library to create an optimal SIP design, including content and controls, to meet NARA’s EOP SIP specifications.

Softek automated the specification checks in AWS Lambda, using triggers to move SIPs through each validation gate while generating real-time progress metrics. Dashboards visualized these metrics, including the number of SIPs expected, received, and validated. When a SIP failed validation, for example on fixity, inventory, or image quality, an issue summary was generated automatically to guide the fix. Softek also:

  • Optimized access management activities with AWS IAM,
  • Provided data encryption at rest and in transit with AWS KMS and AWS ACM, and
  • Enabled auditing with Amazon CloudWatch.
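
The gate-by-gate flow described above, with progress metrics and an automatic issue summary, can be sketched in plain Python. This is an assumption-laden illustration rather than the deployed Lambda code: `run_gates`, the gate functions, the SIP fields (`id`, `checksum_ok`, `dpi`), and the 300 dpi threshold are all hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Optional

# A gate inspects one SIP record and returns None on success,
# or a short description of the problem on failure.
Gate = Callable[[dict], Optional[str]]


def fixity_gate(sip: dict) -> Optional[str]:
    # Hypothetical field: set upstream by a checksum comparison.
    return None if sip["checksum_ok"] else "checksum mismatch"


def image_gate(sip: dict) -> Optional[str]:
    # Hypothetical threshold, chosen only for the example.
    return None if sip["dpi"] >= 300 else "resolution below 300 dpi"


def run_gates(sip: dict, gates: list[tuple[str, Gate]], metrics: Counter) -> dict:
    """Pass one SIP through the gates in order, stopping at the first failure.

    Updates the shared metrics counter and returns a summary record that
    names the failed gate and the issue, if any.
    """
    metrics["received"] += 1
    for name, gate in gates:
        problem = gate(sip)
        if problem is not None:
            metrics["invalid"] += 1
            return {"sip_id": sip["id"], "status": "invalid",
                    "failed_gate": name, "issue": problem}
    metrics["validated"] += 1
    return {"sip_id": sip["id"], "status": "validated",
            "failed_gate": None, "issue": None}
```

In a serverless setup, a function like `run_gates` could be called from an event handler each time a SIP arrives, with the counter values published to a dashboard; that wiring is omitted here.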

Results

Softek leveraged AWS GovCloud by taking advantage of its capacity-based pricing model and flexibility. This created cost savings because NARA’s ingest loads were sporadic but high-capacity, and the service incurred no cost when not in use.

Our cloud expertise allowed us to build systems with tightened security, so that data was transferred without virus intrusion or security breach.

We also provided outstanding validation of SIPs and ingestion, resulting in zero loss of data and no loss of integrity.
