Part 1 – A new approach to Archiving

Posted on May 14, 2024
By Mark Turner
The 2030 Vision – The Evolution of Media Creation included two principles dedicated to the future of archives, namely:
MovieLabs 2030 Vision Principle 4: Archives are deep libraries with access policies matching speed, availability and security to the economics of the cloud
MovieLabs 2030 Vision Principle 5: Preservation of digital assets includes the future means to access and edit them

In this blog series we will dig deeper into these principles, describe how we see the evolution of today’s archives into a new “2030 Archive”, discuss how the other MovieLabs 2030 Vision principles apply, and we’ll introduce new 2030 Archive Principles for those looking to build systems and processes in this new way. We’re expecting big changes in long term storage from the cloud, new security approaches, and software defined workflows, so let’s look at how these will come together to transform the media archive and media archiving workflows to enable easier management, preservation, optimization and monetization of deep library assets.

Background

We recognize that any new approach to archiving critical studio assets needs to demonstrate that it can be trusted to provide certainty of preserved assets over a massively long time period. The way to provide this trust is to enable full visibility over all assets in the archive and answer questions like:

  • What assets do we have?
  • How are they being managed?
  • What is their current status and location?
  • Are they being maintained in accordance with our policies?
  • How do I retrieve them?
  • How can I be protected from failures in cloud infrastructures or networks?

We believe new and emerging technologies can provide this trust and certainty now (if designed and configured appropriately) and enable a much more flexible, accessible and efficient archive into the future.

The 2030 Vision is based on the premise that all assets go straight to the cloud[1] as they are created, applications come to them (instead of the other way around), and workflow tasks and access are controlled by timed and secure permissions within a software defined workflow. Applying these principles enables a different approach to archive – one that can match the longevity we have had using film as an archival medium[2] and also enable new levels of flexibility that have not been possible to date.
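
To make "timed and secure permissions" a little more concrete, here is a minimal Python sketch of a time-bound access grant. It is only an illustration under our own assumptions: the `Grant` class, the `is_allowed` function, and the user, asset and action names are hypothetical and are not taken from any MovieLabs specification or product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical time-bound permission for one user, asset and action."""
    user: str
    asset_id: str
    action: str          # e.g. "read" or "edit" (illustrative values)
    not_before: datetime
    expires: datetime


def is_allowed(grants, user, asset_id, action, now):
    """Return True only if a matching grant is active at time `now`."""
    return any(
        g.user == user
        and g.asset_id == asset_id
        and g.action == action
        and g.not_before <= now < g.expires
        for g in grants
    )


# A workflow task gives a colorist read access for one eight-hour window.
start = datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc)
grants = [Grant("colorist", "asset-001", "read", start, start + timedelta(hours=8))]

print(is_allowed(grants, "colorist", "asset-001", "read", start + timedelta(hours=1)))  # True
print(is_allowed(grants, "colorist", "asset-001", "read", start + timedelta(hours=9)))  # False
```

The point of the sketch is that access expires by itself when the task window closes, rather than relying on someone remembering to revoke it.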

Archiving Use Cases

In this future archive we can protect assets, metadata and even workflows themselves for use cases that require extreme longevity and stability. Archival Use Cases can be aggregated into four core types:

  1. Preservation – Safeguarding financial and historical/cultural value by maintaining the integrity of the original assets that are retained in perpetuity without degradation.
  2. Reference – Accessing media essence (or its proxy) and metadata to inform or help enable a new creative expression.
  3. Restoration – Bringing back what was stored, correcting any defects that may be present and re-storing.
  4. Repurpose – Taking the original essence or metadata and applying changes for a different release; either as a new expression or due to a format change (e.g., a Director’s cut, extended version, or HDR remaster), or by reusing some of the original in a new title (e.g., a flashback scene).

Each content owner decides how to organize its growing digital asset repositories, its policies with regard to what is considered “archival”, and how those policies should be applied. Today, assets are often duplicated (intentionally or unintentionally) across different storage media, using different asset management or metadata systems, and by different teams within the organization. This approach spreads risk but results in archives and servicing teams that are optimized for certain use cases – for example a “servicing archive” of final distribution assets, a “digital backlot” with key 3D assets, a “preservation archive” typically of the highest quality original assets, etc.

While this approach has some advantages, it also creates complexities including unnecessary duplication of assets (and associated costs), version control issues, mismatched metadata, inconsistent policies for managing assets, duplicative work, and additional processes needed to manage and communicate between individual archiving teams.

Benefits of the new 2030 Archive Approach

As we have synthesized various studio opinions on the future of archiving, we see multiple advantages in a new approach:

  1. Visibility – The new 2030 Archive can provide a singular view across both assets and metadata, regardless of where they are stored. We should stress here that a single source of truth doesn’t mean a single copy of assets or metadata – in fact a common preservation policy system would enable a Service Level Agreement (SLA) for assets stored in multiple locations, across multiple clouds, with managed redundancy of assets and metadata. With a singular approach, all duplicates can be addressed in a single way, as if they were all physically present in the same cloud, even though they are intentionally distributed for resiliency (and that can and often will include a physical offline copy for redundancy against cloud or network hacks). This ‘logical’ approach to multi-cloud and disparate copies will allow a single system to “resolve” where all copies of files are, and will provide ease of access and cost flexibility.
  2. Accessibility – With an archive copy fully online in the cloud(s), workflow teams can much more easily use the archive like a reference library and, if allocated appropriate permissions, access the appropriate assets for Reuse, Restoration and Repurpose workflows. Creative teams would be able to directly search and find the assets they need, with self-service web portals[3] or even directly within the creative tools they are using.
  3. Enabling Automation – By agreeing on common cloud data stores and interoperable software defined workflows, the 2030 Archive will be able to automate currently mundane and repetitive tasks such as manually ingesting assets into the archive, moving and duplicating assets to ensure compliance with preservation policies, testing for fixity, finding and issuing assets for Reference or Reuse use cases, and more. Automation systems that understand preservation policies will be able to handle the work to ensure compliance with policies, allowing archivists to focus on curation and defining policies, rather than on library management. By using broadly accepted preservation policies to control access levels for different users and use cases, the archive systems can be dramatically simplified and yet still allow subtle nuance and control on a per-asset level (or record level for metadata).
  4. Ease of Ingest – This policy-driven approach will also simplify the initial ingestion of new production assets into the archive, currently a slow process involving production teams that are winding down and have dwindling resources. In the 2030 Archive, assets can be tagged as archival and archival policies automatically applied as the assets are created (from the first concepts to the final masters), reducing the crush as production wraps to find and source all of the assets the studio decides it wants to archive, and not requiring human intervention in the process. Assets stored in legacy archive systems will need to be ingested into a cloud to enable these future scenarios, which can be an expensive and lengthy process but is also a rare opportunity to evaluate, restore and augment assets with enhanced metadata and security.
  5. Economic Efficiency – All archival use cases in the 2030 Archive model will be supportable using a common shared infrastructure, reducing costs and simplifying systems. The different organizations that run discrete functions (distribution servicing, remastering, marketing, etc.) will still have nuanced control and discrete workflows, tools and applications to enable their access scenarios but with the cost benefits of shared storage and databases, versus duplicating all of those for their specific use cases.
  6. Consistent Security & Resiliency – Archival assets could be seamlessly managed across multi-cloud architectures for resiliency and protection with a consistent policy system, which makes it easier to match performance to Service Level Agreements (SLAs). In addition, if studios are using the MovieLabs Common Security Architecture for Production (CSAP), they could potentially use the same cloud security system for new productions as well as for managing archival assets.
  7. Improved Responsiveness – Studio licensing, marketing and distribution teams will be able to see what they have across the archive, the associated rights and, potentially, even enable self-serve automation of media preparation for distribution. Having the archival media (or at least one copy of it) stored in the cloud also allows other opportunities to rapidly take advantage of cloud technologies, such as AI tagging and other deep learning, without having to ingest assets to the cloud just to take advantage of each new innovation[4].
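
As a rough illustration of how the Visibility and Automation ideas above could fit together, the following Python sketch keeps one logical record per asset, lists every known copy, runs a fixity check against the checksum recorded at ingest, and tests the result against a simple preservation policy. Everything here is an assumption made for illustration: the `Asset`, `Copy` and `Policy` classes, the location labels, and the choice of SHA-256 are ours and do not describe any particular archive system.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Copy:
    location: str   # hypothetical label, e.g. "cloud-a" or "offline-vault"
    data: bytes     # stand-in for the stored bytes of this copy


@dataclass(frozen=True)
class Policy:
    min_copies: int      # intact copies required
    min_locations: int   # distinct locations that must hold an intact copy


@dataclass
class Asset:
    asset_id: str
    sha256: str                                   # checksum recorded at ingest
    copies: list = field(default_factory=list)    # every known Copy of this asset


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def intact_copies(asset: Asset) -> list:
    """Fixity check: keep only copies whose bytes still match the ingest checksum."""
    return [c for c in asset.copies if checksum(c.data) == asset.sha256]


def is_compliant(asset: Asset, policy: Policy) -> bool:
    """Compare the intact copies against the policy's redundancy requirements."""
    good = intact_copies(asset)
    locations = {c.location for c in good}
    return len(good) >= policy.min_copies and len(locations) >= policy.min_locations


original = b"frame data"
asset = Asset(
    asset_id="asset-001",
    sha256=checksum(original),
    copies=[
        Copy("cloud-a", original),
        Copy("cloud-b", original),
        Copy("offline-vault", b"frame dat\x00"),  # simulated corruption
    ],
)
policy = Policy(min_copies=2, min_locations=2)

print([c.location for c in intact_copies(asset)])  # ['cloud-a', 'cloud-b']
print(is_compliant(asset, policy))                 # True
```

In a real archive the bytes would live in object storage rather than in memory, and a failed fixity check would trigger a repair from a good copy, but the shape of the check stays the same: one logical record, many physical copies, and a policy that automation can evaluate without a person in the loop.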

Where do we go next?

We believe the principles behind the 2030 Vision can help establish a new type of long-term archive that can enable all use cases and provide more resiliency, security and accountability, while reducing the time to access media and making archive management more efficient and less prone to error.

In subsequent parts of this blog series we’ll introduce key components in our approach to the 2030 Archive and then 2030 Archive Principles for those that want to actually create systems and processes to support this new approach.

Stay Tuned…

[1] We define cloud as any internet connected compute/storage infrastructure, which can include public cloud, private cloud, or a hybrid of them both. In fact, we expect, and design for, scenarios where assets are distributed amongst multiple clouds even though they can present logically as one “cloud.”

[2] Film has, after all, been used for the first 100 years of cinema and will continue to be used as a physical archive into the future. Because it can be read back by simply applying light, film is also a great example of a simple and interoperable physical format. Even though an individual copy will degrade over time, it can be duplicated and restored, and many films have been remastered many times as new technologies have emerged (e.g., 4K HDR). Initially film was just a distribution medium and was discarded, but as studios built vaults for it and processes around it, film became this archival medium. We need to take the same care now to create this interoperable ecosystem around cloud based archives.

[3] See the MovieLabs Showcase, Marvel Studios Cinematic Universe Editorial Library, for a great example of how self-serve access to the archive enables creatives to access content that used to take days or weeks to source.

[4] MovieLabs preliminary research is indicating that a robust ontology would be useful for AI based learning models to extract and tag more precisely.
