Section 3.3

SOFTWARE-DEFINED WORKFLOWS

Building on the first eight principles, which provide a common core of cloud infrastructure and a consistent system to track assets and assign projects to real people, all in a secure environment, we can now move on to iterations in the actual workflows that power our industry. (Since releasing the paper, MovieLabs has updated its definition of cloud to "an internet-connected collaborative workspace, which could be provided on public hyperscaled cloud services, on-premises or private clouds in datacenters, or hybrids of them all.") The next two principles pertain to software-defined workflows, i.e., dynamic and modifiable connected services that operate on off-the-shelf hardware instead of specialized, custom devices. These applications are delivered in modern companies today as microservices (although 2030 software systems may have evolved still further). Regardless of how they are delivered, our objectives are to encourage new creative tools and workflows that allow more creative expression and faster innovation.

EXTENDED INSIGHTS: Exclusively Online

Building the Future of Media Production

Watch Jim Helman introduce the concepts behind Software-Defined Workflows

PRINCIPLE 9: MEDIA WORKFLOWS ARE NON-DESTRUCTIVE AND DYNAMICALLY CREATED USING COMMON INTERFACES, UNDERLYING DATA FORMATS AND METADATA

OVERVIEW

As technology advances at a blistering pace, the demands to constantly redesign production pipelines to accommodate new technologies are becoming untenable. Furthermore, dependence on an agglomeration of legacy tools results in a fragile environment, susceptible to failures that can ripple throughout an entire production process. This principle mitigates these issues by establishing standardized building blocks for workflow processes with common data file types, descriptive metadata and interfaces through which applications connect to those systems. By adopting a modular methodology for production pipelines, creatives can quickly construct and adapt workflows from these building blocks. The blocks will each have their own defined minimum data, metadata, input formats and output formats and will easily communicate with each other using consistent underlying data systems.

We also include the notion of non-destructive workflows; that is, whenever possible, the original asset is maintained in its original state. Production processes layer modifications that are described in metadata files. In that way, the original assets can always be retained and any changes or enhancements regressed back by peeling away layers of modifications.
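The layered, non-destructive approach above can be sketched in a few lines. This is a minimal illustration, not any specific product's implementation; the class and field names (`Asset`, `Modification`, `NonDestructiveEdit`) are hypothetical, and the byte-level "modifications" stand in for real operations such as color grades or composites.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Asset:
    """An original asset; never mutated by the workflow."""
    name: str
    data: bytes

@dataclass
class Modification:
    """A change described as metadata, not baked into the asset."""
    description: str
    apply: Callable[[bytes], bytes]

class NonDestructiveEdit:
    """Stacks modification layers over an immutable original asset."""
    def __init__(self, original: Asset):
        self.original = original
        self.layers: list[Modification] = []

    def add_layer(self, mod: Modification) -> None:
        self.layers.append(mod)

    def undo_layer(self) -> None:
        # "Peeling away" a layer never touches the original bytes.
        self.layers.pop()

    def render(self) -> bytes:
        data = self.original.data
        for mod in self.layers:
            data = mod.apply(data)
        return data

# Usage: the original survives unchanged no matter how many layers are applied.
clip = Asset("scene_001", b"raw")
edit = NonDestructiveEdit(clip)
edit.add_layer(Modification("uppercase grade", lambda d: d.upper()))
edit.add_layer(Modification("append slate", lambda d: d + b"!"))
assert edit.render() == b"RAW!"
edit.undo_layer()
assert edit.render() == b"RAW"
assert clip.data == b"raw"  # original untouched
```

Because every change is a layer of metadata rather than an overwrite, any enhancement can be regressed simply by removing its layer.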

We can envision an industry interface layer, likely in the form of a series of standardized application programming interfaces (APIs)[2]. Workflows would consist of processes exchanging asset data and associated metadata through this interface layer with other processes. This model would support a marketplace-style environment where providers could compete to offer the best components and/or services as modules that plug into a specific workflow/pipeline. Content creators can select, mix and match any of these services to design a workflow or swap them out without having to redesign their pipeline from scratch.
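The marketplace model above can be sketched as a registry where services sign up for a named process and workflows call the process, not the vendor. This is a toy illustration under assumed names (`InterfaceLayer`, `AssetPackage`, the two vendor functions are hypothetical); a real interface layer would be a set of standardized network APIs, not an in-process dictionary.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AssetPackage:
    """An asset plus the minimum metadata every service must understand."""
    media: bytes
    metadata: dict

class InterfaceLayer:
    """Stand-in for the industry API layer: services register by the
    process they implement, and workflows invoke processes, not vendors."""
    def __init__(self):
        self._services: dict[str, Callable[[AssetPackage], AssetPackage]] = {}

    def register(self, process: str, service) -> None:
        # Swapping vendors is just re-registering the same process name.
        self._services[process] = service

    def run(self, process: str, package: AssetPackage) -> AssetPackage:
        return self._services[process](package)

def vendor_a_grade(pkg: AssetPackage) -> AssetPackage:
    return AssetPackage(pkg.media, {**pkg.metadata, "graded_by": "vendor_a"})

def vendor_b_grade(pkg: AssetPackage) -> AssetPackage:
    return AssetPackage(pkg.media, {**pkg.metadata, "graded_by": "vendor_b"})

layer = InterfaceLayer()
layer.register("color_grade", vendor_a_grade)
out = layer.run("color_grade", AssetPackage(b"frame", {"shot": "010"}))
assert out.metadata["graded_by"] == "vendor_a"

# Swap the vendor without redesigning the pipeline:
layer.register("color_grade", vendor_b_grade)
out = layer.run("color_grade", AssetPackage(b"frame", {"shot": "010"}))
assert out.metadata["graded_by"] == "vendor_b"
```

The key property is that neither vendor function knows about the other, and the workflow never changes when one replaces the other.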

Figure 2: Standardized building blocks contain agreed common data and metadata for key processes in production. The API interface layer abstracts that information such that any application, tool, portal or service can interface through the API layer with any other tool without needing to know about it in advance.


Such a system would drastically shorten the “pipeline development” phase of a production, allowing rapid prototyping facilitated by intrinsic interoperability between the specific technologies with the environment. This would also empower the production team to remain agile with its technology choices. The production could change out any part of the workflow, from camera manufacturer to editing system, without an interruption to the production.

Each major process used in production – from on-set and dailies through to dubbing and mastering – uses distinct file types and metadata as inputs to the creative processes. The resulting work may be altered assets (e.g., a composited image) or altered metadata (e.g., an editorial decision list or EDL). For the interface layer to work most effectively, we will need to describe both the standardized asset files and a minimum amount of metadata. Similarly, each production subprocess produces certain standardized outputs (e.g., data, metadata), which are held in common data storage. Each content creator may have their own additional sets of metadata that they will want to track on a per-project basis, and the system will need to accommodate such ancillary datasets.

Over the years, the list of standardized metadata and data for the building blocks may grow, but for now, there are enough similarities in production processes that an initial set of data and an extensible metadata schema can be developed across studios and productions to start to bring some order to the chaos.

EXTENDED INSIGHTS: Exclusively Online

Current MovieLabs Projects Delivering on Software-Defined Workflows

MovieLabs has a number of ongoing programs to deliver on these principles of interoperable media workflows

See the Projects in Action

EXAMPLES

A new company is formed that creates a new niche production tool to track color management information from on-set. Instead of creating its own API, the company plugs into the existing interface layer, which now quickly and easily allows any content creator to integrate the tool and immediately begin ingesting information.

A VFX vendor delivers element packages back to the studio for archive. The interface layer would understand the various elements (3D models, textures, etc.) and extract the metadata from each asset to present to the studio’s databases for further data processing. Because this metadata extraction has been standardized, the studio does not need to create its own normalization of the data, as the VFX building blocks already contain typical data fields for each asset.
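The standardized extraction described above might look like the sketch below. The element types, field lists and `extract_metadata` function are illustrative assumptions; a real building block would define these fields in a shared schema rather than a Python dictionary.

```python
# Hypothetical standardized field sets per VFX element type.
STANDARD_FIELDS = {
    "3d_model": ["name", "polygon_count", "units"],
    "texture":  ["name", "resolution", "color_space"],
}

def extract_metadata(elements: list) -> list:
    """Pull only the standardized fields from each delivered element, so the
    studio ingests uniform records without per-vendor normalization."""
    records = []
    for element in elements:
        fields = STANDARD_FIELDS[element["type"]]
        records.append({f: element[f] for f in fields})
    return records

delivery = [
    {"type": "3d_model", "name": "spaceship", "polygon_count": 120000,
     "units": "cm", "vendor_internal_id": "xyz"},
    {"type": "texture", "name": "hull_diffuse", "resolution": "4096x4096",
     "color_space": "ACEScg", "vendor_internal_id": "abc"},
]
records = extract_metadata(delivery)
assert records[0] == {"name": "spaceship", "polygon_count": 120000, "units": "cm"}
assert "vendor_internal_id" not in records[1]  # vendor-private fields dropped
```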

We cannot hope to predict every nuanced data field that may be required for future productions, but by defining an extensible schema, we may accommodate a new camera technology in the capture building block and allow it to interface with legacy software applications with no prior knowledge of the new technologies or file types.
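One way an extensible schema can protect legacy applications is to carry unknown fields alongside known ones rather than rejecting them. The sketch below assumes a hypothetical field split (`normalize_capture`, the "lightfield" field and camera name are invented for illustration): legacy tools read only the fields they know, while new-technology fields travel with the asset untouched.

```python
# Fields a legacy capture schema already understands (hypothetical).
KNOWN_CAPTURE_FIELDS = {"camera", "resolution", "frame_rate"}

def normalize_capture(metadata: dict) -> tuple:
    """Split capture metadata into fields legacy tools understand and
    extension fields from new technology, carried along untouched."""
    known = {k: v for k, v in metadata.items() if k in KNOWN_CAPTURE_FIELDS}
    extensions = {k: v for k, v in metadata.items() if k not in KNOWN_CAPTURE_FIELDS}
    return known, extensions

# A hypothetical new camera emits a field the legacy schema has never seen:
meta = {"camera": "NewCam X1", "resolution": "8K", "frame_rate": 120,
        "lightfield_depth_layers": 16}
known, ext = normalize_capture(meta)
assert known == {"camera": "NewCam X1", "resolution": "8K", "frame_rate": 120}
assert ext == {"lightfield_depth_layers": 16}  # preserved, not rejected
```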


IMPLICATIONS

The standardized building blocks coupled with an industry-wide interface layer enable the best of both worlds: economies of scale from consistency, and freedom and creativity for each production. By creating an industry standard interface layer, we can ensure that any number of web applications, creative tools, bots and other software that understand the underlying files, their structure and associated metadata can be developed.

By creating standards for consistent media and metadata nomenclature, hierarchy and storage interfaces, we can define what is the same for every production (scripts, production notes, video files, audio files etc.) and where they can be found in the cloud.

By 2030, we may have an entirely object- and metadata-based storage system, which means filenames become irrelevant. In the interim, however, an early step to building-block workflows would be to at least normalize the naming systems with an open data model so that productions could be consistent in how they describe foundational pieces, such as a scene or take, and how they name a 3D asset’s mesh versus its other component pieces.
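An interim naming normalization could be as simple as a shared build/parse pair over a structured pattern. The convention below (`project_s###_t##_component`) is a hypothetical example of an open data model, not a proposed standard; the value is that every tool on a production builds and parses names the same way.

```python
import re

# Hypothetical open naming convention: project_s<scene>_t<take>_<component>.
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_s(?P<scene>\d{3})_t(?P<take>\d{2})_(?P<component>[a-z]+)$"
)

def build_name(project: str, scene: int, take: int, component: str) -> str:
    """Emit a name every tool on the production can parse the same way."""
    return f"{project}_s{scene:03d}_t{take:02d}_{component}"

def parse_name(name: str) -> dict:
    """Recover the structured fields, rejecting non-conforming names."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"non-conforming asset name: {name}")
    return match.groupdict()

name = build_name("odyssey", 12, 3, "mesh")
assert name == "odyssey_s012_t03_mesh"
assert parse_name(name)["scene"] == "012"
assert parse_name(name)["component"] == "mesh"
```

A 3D asset's mesh versus its other component pieces would then differ only in the `component` field, rather than in ad hoc per-facility suffixes.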

We can envision physical studio facilities adapted to support these building blocks. They would have the ability to configure equipment and software to perform certain functions in a room one day, and the next day, rapidly modify the configuration using different building blocks to perform a different function.

Rather than predict where artificial intelligence will have the biggest impact in the production and distribution of media, we are focused on creating an environment that enables those AI tools to work most effectively. By providing the industry with structured datasets from these building blocks, which can be provisioned for read-only access, AI and ML bots crawling for data and looking for optimizations to workflows and processes will be able to much more easily resolve what data they are scouring. These automations can enable considerable efficiencies throughout production by reducing mundane and repetitive tasks. However, they need to work from a structured dataset.

EXTENDED INSIGHTS: Exclusively Online

The MovieLabs Ontology for Media Creation (OMC)

Since the publication of the 2030 Vision, MovieLabs has been actively building the OMC, the extensible metadata schema mentioned in this document.

Explore the OMC in detail: from the current version to its features and scope

PRINCIPLE 10: WORKFLOWS ARE DESIGNED AROUND REAL-TIME ITERATION AND FEEDBACK

OVERVIEW

Currently, many creative processes happen without real-time feedback. For example, VFX renders can take 24 hours or more to create final composited frames, which makes iteration slow. However, in late 2018, the industry saw dramatic improvements in the quality of video game engines with the addition of hardware-accelerated GPU-based ray-tracing on affordable workstation and cloud graphics cards. In the future, a new suite of filmmaking tools will evolve from today’s game-creation engines – so as not to confuse these tools with game creation processes, we refer to them as real-time engines (RTEs). These new tools, plus the new cloud foundation principles, will dramatically change the creation and economics of filmmaking, potentially upending the sequence of creation, and enable new workflows at preproduction, production, postproduction and potentially even delivery of filmed media in the future.

Today’s (2019) game engines are increasingly used to previsualize sequences of films (usually the complex action-packed scenes). In some cases, entire movies have been previsualized on a game engine system. Game engine renderers have also been used to create final cut renders on some movies. Beyond these examples, there are many more opportunities for these real-time iterative workflows in the future.

EXAMPLES

Preproduction

By 2030, traditional camera-based productions will look increasingly like the workflow for the animated features of today, that is, a world without unneeded setups or unproductive production days. With animation today, the processes are iterative. The movie is first storyboarded out scene by scene. Then it is performed with “scratch audio,” and finally with increasingly advanced animatics. At all times, the director can arrange and rearrange the scenes, timing, characters and dialogue. With each subsequent revision, the movie is increasingly locked; it is then performed with final voice talent, animated with full fidelity and rendered with realistic lighting. We foresee the early production steps for live-action movies using a similar approach and RTEs. The show, potentially with interim actors, can be designed, animated, correctly lit and edited and have accurate camera angles set in the pre-photography stage and potentially before it is greenlit. The quality of the title can iterate in this pre-photo stage and assist a production in making complex decisions and spotting potential issues before they reach a critical state and move to principal photography.

Principal Photography

We do not just foresee RTE tools being used in pre-photography; we can also expect to see those tools being used during production as content is being shot with traditional cameras. By using RTE tools combined with XR technologies, directors will be able to see photorealistic versions of digital characters or objects interacting on-set with physical actors and objects. By looking through the camera lens or via head-mounted displays (HMDs), the cast and crew will no longer see green screens or stand-in representations of digital characters. Two physical actors could act opposite each other despite being thousands of miles apart. A director and cinematographer would be able to make lighting and camera decisions with absolute confidence, knowing how the final output of the scene will look with both digital and physical elements blended seamlessly together.


Postproduction

The process of postproduction and VFX in 2030 could be shorter than it is today because the real-time engine will take away the pain of waiting for lengthy and expensive offline render farms to finish before artists can see with confidence the results of their work. The production could also deliver final plates to the VFX vendors with digital objects or characters already composited in, mitigating much of the work required in current postproduction processes. For some smaller productions, these RTE-rendered scenes may be fine to use for final rendered pixels. These time and cost savings could be used to reallocate budget elsewhere, perhaps forward to pre-photography, or to allow more time for iteration and improve the final product.

Distribution

We can also envision scenarios in which the rendering step occurs on the consumer’s device or, just before that, at the edge of the internet during distribution (using perhaps a cluster of GPUs in local neighborhoods with burstable capacity to handle shared graphics compute in a contended way, in much the same way that bandwidth is contended among neighbors in a local zone now). This enables a world of dynamic media that adapts to the consumer’s unique playback environment. If consumers are engaging in more immersive entertainment experiences (like video games), then it is possible that the finished form of the media is not a single piece of narrative video, but an entire CGI environment that contains the storylines, objects and characters the director wants to use. That experience could change and react to the audience and their viewing device (for example, to match the native display resolution, color gamut, and frame rate) and not be fixed every time it is consumed – much like a video game today.
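Adapting the render to each playback environment could start from the device's reported capabilities, as sketched below. The capability fields and `choose_render_settings` function are hypothetical; a real system would negotiate far richer parameters (HDR metadata, viewing distance, interactivity).

```python
def choose_render_settings(device: dict) -> dict:
    """Pick render parameters from the playback device's reported
    capabilities, so the same CGI package adapts per viewer."""
    return {
        # Cap output at the hypothetical service maximums (2160p / 60 fps).
        "resolution": min(device["max_resolution"], 2160),
        "frame_rate": min(device["max_frame_rate"], 60),
        "color_gamut": "rec2020" if device.get("hdr") else "rec709",
    }

phone = {"max_resolution": 1080, "max_frame_rate": 120, "hdr": False}
tv = {"max_resolution": 2160, "max_frame_rate": 60, "hdr": True}
assert choose_render_settings(phone) == {"resolution": 1080, "frame_rate": 60,
                                         "color_gamut": "rec709"}
assert choose_render_settings(tv)["color_gamut"] == "rec2020"
```

The same asset package thus renders differently per consumer, much as a video game already does.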

Because all the required assets will already be securely in the cloud (Principles 1 & 7), linked to each other in asset packages (Principle 8) and designed with a new policy approach to publishing (Principle 3), we can then confidently unlock the power of edge compute to deliver these new experiences.


EXTENDED INSIGHTS: Exclusively Online

See Principle 10 in Action in the MovieLabs Showcase

Mathematic Accelerates Productions, Reduces Costs and Goes Green with Hammerspace


IMPLICATIONS

These examples illustrate how the adage “we can fix it in post” may change to “we can fix it before we shoot.” We can also expect changes in the structure and scheduling of major productions, with perhaps less time being devoted to postproduction; instead, those people, budgets and time will be shifted forward to much more robust and fully formed visualizations in pre-photography.

However, the range of additional media that could be captured may mean postproduction evolves into new processes, such as selecting final camera angles and performances from a range of viewing angles that were captured on set.

A new open-standard real-time engine package would need to be developed that could deliver the range of digital assets to the consumer for rendering in real time as they watch the experience.

The impact of the RTE will be considerable and broad, but we can see the need for standardization of some core components of the rendering, translation and packaging of digital assets, and that will require broad industry collaboration across tools, vendors, GPU providers and creatives.

[2] Such industry-wide agreed and open APIs can be developed in a secure manner, as has been done by the banking industry with the global BIAN network. See https://bian.org/