New balls please: Adobe Firefly now adds Style Presets

Adobe adds increasingly powerful video capabilities, Topaz, Moonvalley, and more to Firefly AI

Ahead of Adobe’s announcement today of a whole range of powerful Firefly AI capabilities, we talked to the company’s VP of Product Management, GenAI (Firefly), Zeke Koch, about what’s coming down the AI pipe.

The dizzying speed of AI development shows no sign of letting up; in fact, it’s probably accelerating, if the latest features added to Adobe’s Firefly Video Model are any guide.

Today, Adobe is announcing a raft of new features for the Firefly web app that make the model more capable than ever. And in between getting a demo of their capabilities, we got to ask Koch some questions about what the company is up to with the technology and where it’s going.

Expanded models

First off, there are new options for using an expanding number of partner models within the Firefly app.

Recently, Runway’s Gen-4 Video and Google Veo3 (with Audio) were added to Firefly Boards, and Veo3 with Audio to Generate Video. Adobe has now said there are more partner models coming soon to the Firefly app. Topaz’s Image and Video Upscalers and Moonvalley’s Marey will be launching soon in Firefly Boards, while Luma AI’s Ray 2 and Pika 2.2, which are already available in Boards, will soon be added to Generate Video.

Adobe Firefly Partner Models

“Moonvalley is quite interesting because it's a generative video model where they have a legal right to train on all the assets that they train on, similar to Firefly’s models,” says Koch. “They've licensed a lot of cinematic content, so all the videos that come out from it have a really nice kind of cinematic feel to them. Topaz Labs is really interesting too. We've been talking to them for a long time because they make what we think is the best upsampling, taking a lower resolution image and making it higher resolution.”

Apart from the desire to fold best-in-class tools into its own platform, what’s interesting is the amount of care Adobe has taken to signal when Firefly is using a third-party model, especially one where generation is not necessarily being done on pre-cleared content. Have there ever been any qualms internally about providing access to models such as Veo via Adobe platforms?

“We spent a lot of time internally talking about whether or not we thought this was the right thing for our customers and Adobe and the ecosystem in general,” says Koch. “When we first launched Firefly, we only supported our models, and we had a very strict stance on that. The primary reason why we changed is because all of our customers asked us to.”

Koch says that the main use for third-party models is ideation. Customers have been very quick to differentiate between client-facing work, which they want pre-cleared and approved with no potential legal nasties lurking in the future, and a more free-wheeling approach to internal ideation.

“Because that phase is so clearly different from professional work, they were like, ‘Hey, these models are wild, but we're just using them for ideation. Can you allow us to do that using your tools? Because we love using your tools’.”

Improved video controls

Whether for ideation or delivering finished work, there are plenty of new tools for better video control.

Composition Reference for Video: Upload a reference video, describe what you want to see, and Firefly will generate a new video that transfers the original composition to your generation. This is going to be useful for maintaining a consistent look from scene to scene. 

Adobe Firefly Style Presets

Style Presets: Apply a distinct visual style to your video with a single click, choosing from presets such as claymation, anime, line art, or 2D to instantly set the tone.

Keyframe Cropping: Upload your first and last frames, select how your image will be cropped, describe your scene, and Firefly will generate a video that fits your format.

Sound and avatars

Generate Sound Effects

Elsewhere, the new Generate Sound Effects feature, currently in beta, makes it easy to create custom sounds, like a lion’s roar or ambient nature sounds. A simple text prompt generates the sound effect you need, or you can even use your own voice to guide the timing and intensity of the sound.

“Firefly listens to the energy and rhythm of your voice to place sound effects precisely where they belong — matching the action in your video with cinematic timing,” says Adobe.

A demo of this is quite compelling, producing a convincing zip-opening noise and even transposing the words ‘clip clop’ into realistic footsteps over a variety of surfaces.
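Adobe hasn’t said how this works under the hood, but the voice-guided timing idea can be roughly approximated with open-source tools. Here’s a minimal Python sketch, assuming the librosa and soundfile libraries, that detects energy bursts in a recorded voice guide and drops a sound effect at each one; everything here is our illustration, not Adobe’s implementation:

```python
# Illustrative only: not Adobe's implementation, just a rough approximation
# of voice-guided sound effect timing using open-source tools.
import numpy as np
import librosa
import soundfile as sf

def place_sfx_by_voice(voice_path: str, sfx_path: str, out_path: str) -> None:
    """Place a sound effect wherever the voice guide has an energy burst."""
    voice, sr = librosa.load(voice_path, sr=None, mono=True)
    sfx, _ = librosa.load(sfx_path, sr=sr, mono=True)

    # Detect onsets ("clip", "clop", ...) in the voice track, in seconds.
    onset_times = librosa.onset.onset_detect(y=voice, sr=sr, units="time")

    mix = np.zeros(len(voice) + len(sfx), dtype=np.float32)
    for t in onset_times:
        start = int(t * sr)
        # Scale each effect by the local loudness of the voice, so a louder
        # vocalisation gives a louder effect (a crude stand-in for the
        # "intensity" matching Adobe describes). The 4.0 factor is arbitrary.
        window = voice[start:start + len(sfx)]
        gain = min(float(np.sqrt(np.mean(window ** 2))) * 4.0, 1.0) if len(window) else 1.0
        mix[start:start + len(sfx)] += sfx * gain

    sf.write(out_path, np.clip(mix[:len(voice)], -1.0, 1.0), sr)
```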

There’s also a new text-to-avatar feature that will turn scripts into avatar-led videos in just a few clicks. We suspect you’re going to see these popping up in a lot of places fairly quickly.

Prompt attention

One final thing that caught our attention was the new Enhance Prompt feature. This takes your original prompt and adds language to finesse it and make it more exact, expanding a simple sentence into a whole paragraph. You can then tweak that text to detail precisely what you want, removing and editing elements as you go.

It’s interesting in that it uses AI to improve written language before feeding that output into another AI to generate the image.
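As a rough sketch of that two-stage pattern, here’s a minimal Python outline. The `expand_prompt` and `generate_image` callables are placeholders for any text model and image model, not Adobe’s actual API; the key design point is that the expanded prompt is surfaced for editing before generation rather than rewritten invisibly:

```python
# A sketch of the two-stage "enhance then generate" pattern described above.
# The callables are placeholders, not Adobe's API.
from typing import Callable

def enhance_then_generate(
    user_prompt: str,
    expand_prompt: Callable[[str], str],      # text model: sentence -> detailed paragraph
    generate_image: Callable[[str], bytes],   # image model: prompt -> image bytes
    review: Callable[[str], str] = lambda p: p,  # hook where the user edits the text
) -> bytes:
    detailed = expand_prompt(
        "Rewrite this image prompt with explicit subject, style, lighting "
        f"and composition details, as a single paragraph: {user_prompt}"
    )
    # Unlike models that rewrite prompts invisibly, the expanded text is
    # shown to the user, who can trim or edit it before anything is generated.
    final_prompt = review(detailed)
    return generate_image(final_prompt)
```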

“Most all the models we think are doing something like this behind the scenes,” says Koch. “When we first designed our model, the stuff that was coming back was looking kind of generic. We saw that the best prompters were creating amazing stuff, and the average prompter was creating something that was kind of ‘stocky’. Rather than doing it invisibly, where you don't have any control over it, in that same theme of giving creatives control, we basically show it to you. The idea is that it’s both teaching you how to do it better and also allows you to get rid of certain features that you might not like.”

Which makes us think that there’s possibly a stage in there that could be eliminated. Are we perhaps nearing the end of this initial prompting era when it comes to working with generative AI?

“It's a great question, and we will definitely be talking more about that next, or maybe the time after,” says Koch, teasing a look at the future Firefly roadmap. “I think that humans largely communicate using a mixture of words and gestures, and so I think we're never getting rid of prompting. But I do think we're getting to the end of the era where you are solely prompting. The future is definitely multimodal.”
