The initial shots of The Frost capture a strange and unsettling atmosphere. The presence of vast icy mountains, a makeshift military-style camp, a group of people gathered around a fire, and barking dogs creates a mix of familiarity and unease. Something feels off.
In one scene, a person asks for the tail, followed by a close-up shot of a man near the fire chewing on a pink piece of jerky. The way he moves his lips is disturbingly unnatural, almost as if he’s gnawing on his own frozen tongue.
Welcome to the eerie realm of AI-driven filmmaking. Stephen Parker of Waymark, a video production company based in Detroit, explains that the filmmakers embraced the peculiarities of the AI model DALL-E instead of fighting to force its output toward precise realism. The resulting 12-minute film, The Frost, stands out as an impressive and bizarre example of this emerging genre. You can watch the exclusive reveal of the film below, provided by MIT Technology Review.
To create The Frost, Waymark used a script written by Josh Rubin, an executive producer at the company who also directed the film. They fed the script into DALL-E 2, an image-generation model developed by OpenAI. After some experimentation to achieve the desired style, the filmmakers relied on DALL-E 2 to generate every single shot. To bring these still images to life, they used an AI tool called D-ID, which adds movement to static pictures, enabling blinking eyes and moving lips.
Rubin explains that they built a world based on the images produced by DALL-E. Although it resulted in a peculiar aesthetic, they embraced it wholeheartedly, and it became the defining look of the film.
Souki Mehdaoui, an independent filmmaker and co-founder of Bell & Whistle, comments on the consistency of the film's style, calling it the first generative AI film she has seen with such a coherent look. The combination of generating still images and then animating them lends The Frost a captivating collage-like atmosphere.
The Frost joins a series of short films released in recent months that utilize various generative AI tools. Due to limitations in generative video models, these films range in style and technique. Some feature storyboard-like sequences of still images, as seen in The Frost, while others incorporate a montage of short video clips.
In New York, Runway, a company specializing in AI video production tools, organized an AI film festival in February and March. Notable films include the otherworldly PLSTC by Laen Sanches, which showcases a mesmerizing sequence of peculiar sea creatures wrapped in plastic, generated by the Midjourney image-making model. Another film, Given Again by Jake Oleson, utilizes NeRF (neural radiance fields) technology to transform 2D photos into 3D virtual objects, creating a dreamlike atmosphere. Expanded Childhood by Sam Lawton offers a surreal and nostalgic experience by extending Lawton’s old family photos using DALL-E 2, allowing him to play with the blurred memories of those images.
Lawton captures his father’s reaction to the extended images in the film, as his father remarks, “Something’s wrong. I don’t know what that is. Do I just not remember it?”
Quick and affordable
Artists are often at the forefront of exploring new technologies, but the advertising industry is currently shaping the immediate future of generative video. Waymark, a company that creates video production tools for businesses seeking quick and affordable commercial-making solutions, produced The Frost as an experiment to integrate generative AI into its products. Waymark is among several startups, including Softcube and Vedia AI, offering customized video advertisements with just a few clicks.
Waymark’s current technology, launched earlier this year, combines various AI techniques such as large language models, image recognition, and speech synthesis to generate video ads on the spot. Additionally, Waymark leveraged its extensive dataset of non-AI-generated commercials created for previous clients. CEO Alex Persky-Stern explains that they selected the best videos and trained the AI on what constitutes a good commercial.
Using Waymark’s tool, available through a tiered subscription service starting at $25 per month, users only need to provide a business name and location. The tool scrapes text and images from the business’s websites and social media accounts, then utilizes that data to generate a commercial. OpenAI’s GPT-3 writes a script, which is read aloud by a synthesized voiceover, accompanied by selected images showcasing the business. A polished one-minute commercial can be generated in seconds. Users have the option to edit the result, adjusting the script, modifying images, selecting a different voice, and more. Waymark reports that over 100,000 people have utilized its tool thus far.
However, Waymark acknowledges the challenge posed when a business lacks a website or images to utilize. Stephen Parker mentions that professionals such as accountants or therapists might have no assets at all. As a solution, Waymark’s next plan involves using generative AI to create images and videos for businesses that either lack them or choose not to use their existing assets. Parker states that this was the driving force behind creating The Frost—to establish a world and a specific atmosphere.
While The Frost exudes a distinct atmosphere, it also carries a rough and imperfect quality. Josh Rubin acknowledges that generative AI is not yet a perfect medium, mentioning struggles with certain aspects, such as capturing emotional responses in facial expressions using DALL-E. However, there were also moments of delight and wonder when the AI produced results that amazed the filmmakers.
The hit-and-miss nature of the process will undoubtedly improve as the technology advances. DALL-E 2, the AI model employed by Waymark for The Frost, was released just a year ago, and video generation tools capable of producing short clips have only been around for a few months.
One of the most groundbreaking aspects of this technology is the ability to generate new shots on demand. Josh Rubin shares his experience of editing the film, needing specific shots like a close-up of a boot on a mountainside. With DALL-E, he could summon the desired shot almost instantly, which he describes as a mind-blowing and eye-opening experience for a filmmaker.
Chris Boyle, co-founder of Private Island, a London-based startup specializing in short-form video production, also sees transformative potential in image-making models. The company has produced commercials for global brands like Bud Light, Nike, Uber, and Terry's Chocolate, as well as in-game videos for popular titles like Call of Duty. While Private Island has been using AI tools in post-production for a few years, its reliance on them grew significantly during the pandemic, when shooting restrictions were in place.
Private Island adopted various technologies to streamline post-production and visual effects, including generating 3D scenes from 2D images using NeRFs and employing machine learning to extract motion-capture data from existing footage instead of starting from scratch.
Generative AI represents the new frontier for Private Island. Recently, the company posted a satirical beer commercial on Instagram, created using Runway's video-making model Gen-2 and Stability AI's image-making model Stable Diffusion. The video, titled "Synthetic Summer," gradually gained viral attention. It depicts a typical backyard party scene with carefree individuals enjoying their drinks in the sunshine. However, their faces have empty holes instead of mouths, the beer cans sink into their heads when they drink, and the backyard is engulfed in flames, creating a horrifying effect.
Boyle explains that the company likes to use the medium itself to tell the story, and "Synthetic Summer" exemplifies this approach: the unsettling, uncanny quality of the medium visualizes some of our fears about AI.
Leveraging its advantages
Does this mark the beginning of a new era in filmmaking? The current tools available have their limitations. However, films like The Frost and “Synthetic Summer” effectively capitalize on the strengths of the technology behind them. The eerie atmosphere of The Frost aligns well with the capabilities of DALL-E 2, while the fast-paced cuts in “Synthetic Summer” accommodate the short video segments generated by tools like Gen-2. This adaptability opens up possibilities for the use of generative video in music videos and commercials. Nonetheless, apart from experimental artists and a few brands, the adoption of this technology remains limited.
The ever-evolving nature of the field poses challenges for potential clients. The rapid pace of technological advancements can deter companies from investing resources in projects. Moreover, legal concerns regarding copyrighted images used in training datasets, such as in Stable Diffusion, add another layer of caution for many businesses.
With the future of generative video uncertain, Mehdaoui emphasizes the need for more thoughtful consideration rather than assumptions. Nevertheless, filmmakers continue to explore the potential of these tools. Inspired by Jake Oleson's work, Mehdaoui is employing generative AI tools to create a short documentary aimed at destigmatizing opioid use disorder. Waymark plans to develop a sequel to The Frost, while keeping an open mind about which new technologies to use. Private Island is also experimenting, combining ChatGPT-generated scripts with Stable Diffusion-produced images and working on a hybrid film involving live-action performers wearing costumes designed by Stable Diffusion.
Boyle expresses excitement about the emerging aesthetics driven by generative AI, contrasting it with the prevalent digital culture dominated by emojis and glitch effects. The potential of generative AI is seen as a fragmented reflection of ourselves, opening up new and thrilling avenues for artistic expression.