N.B. This is NOT a sponsored article. We've published it because it contains genuinely useful information about the increasingly important field of AI in post production.
At Digital Anarchy, we dove into the world of AI head first. We’ve been neck deep in it for the last year, so I’m pretty familiar with the state of the industry right now.
AI is definitely changing how editors get transcripts and search video for content. Transcriptive, a product we’ve developed, demonstrates that pretty clearly with text. Searching via object recognition is something that is also already happening. But what about actual video editing?
One of the problems AI has is finishing. Going the last 10%, if you will. For example, speech-to-text engines, at best, have an accuracy rate of about 95% or so. This is about on par with the average human transcriptionist. For general purpose recordings, human transcriptionists should be worried.
But for video editing, there are some differences, and that’s good news. First, and most importantly, errors tend to be cumulative. If a computer is going to edit a video, at the very least it needs to do the transcription and it needs to recognise the imagery. (We’ll ignore other considerations like style, emotion and story for the moment.) Speech recognition is at best 95% accurate, and object recognition is worse. With more layers of AI, those errors will usually multiply (in some cases there might be an improvement, though). While it’s possible automation will be able to produce a decent rough cut, these errors make it difficult to see automation taking over most of the types of videos that pro editors are typically employed for.
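To put rough numbers on how layered errors compound, here is a minimal sketch. It assumes errors at each stage are independent, which is a simplification (it cannot show the cases where one layer improves another), and it assumes a 90% object recognition rate purely as an illustrative placeholder, since the only figure given above is the 95% for speech.

```python
from math import prod


def pipeline_accuracy(stage_accuracies):
    """Chance that every stage gets it right, assuming independent errors."""
    return prod(stage_accuracies)


speech_to_text = 0.95      # best-case figure quoted in the article
object_recognition = 0.90  # hypothetical placeholder; the article only says "worse"

combined = pipeline_accuracy([speech_to_text, object_recognition])
print(f"Combined accuracy: {combined:.1%}")  # prints "Combined accuracy: 85.5%"
```

Under those assumptions, two layers that each look respectable on their own already drop the combined result to roughly 85%, and every additional layer multiplies it down further.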
Secondly, if the videos are being done for humans, frequently the humans don’t know what they want. Or at least, they’re not going to be able to communicate it in such a way that a computer will understand and be able to make changes. If you’ve used Alexa on an Echo, you can see how well AI understands humans. Lots of situations, especially literal ones (“Find me the best restaurant”), work fine. Lots of other situations, not so much.
Many times, as an editor, the direction you get from clients is subtle or you have to read between the lines and figure out what they want. It’s going to be difficult to get AIs to grasp the way humans usually describe what they want, figure out what they actually want and make those changes.
Thirdly, you get into the whole issue of emotion and storytelling, which I don’t think AI will do well anytime soon. The Economist recently had an amusing article where it let an AI write the article. The result is here. Very good at mimicking the style of The Economist, but when it comes to putting together a coherent narrative… ouch.
Human transcriptionists have cause for concern, but only up to a point
It’s not all good news
There are already phone apps that do basic automatic editing. These are more for consumers who want something quick and dirty. For most of the type of stuff professional editors get paid for, it’s unlikely, from what I’ve seen of the apps, that they will replace humans any time soon. I can see how the tech could be used to create rough cuts and the like, though.
Also, for some types of videos, wedding or music videos perhaps, you can make a pretty solid case that AI will be able to put something together soon that looks reasonably professional.
You need training material for neural networks to learn how to edit videos. Thanks to YouTube, Vimeo and the like, there is an abundance of training material. Do a search for ‘wedding video’ on YouTube. You get 52,000,000 results. 2.3 million people get married in the US every year. Most of the videos from those weddings are online. I don’t think finding a few hundred thousand of those that were done by a professional will be difficult. It’s probably trivial, actually.
Same with music videos. There IS enough training material for the AIs to learn how to do generic editing for many types of videos.
For people who want to pay $49.95 to get their wedding video edited, that option will be there, probably within a couple of years. Have your guests shoot the video, upload it and you’re off and running. You’ll get what you pay for, but for some people, it’ll be acceptable. Remember, AI is very good at mimicking, so the end result will be a very cookie-cutter wedding video. However, since many wedding videos are pretty cookie-cutter anyway… at the low end of the market, an AI-edited video may be all ‘Bridezilla on a Budget’ needs. And besides, who watches these things anyway?
Let the AI do the grunt work, not the editing
The losers in the short term may be the assistant editors. Many of the tasks AI is good for (transcribing, searching for footage, etc.) are now typically given to assistants. However, it may simply change the types of tasks given to assistant editors. There’s a LOT of metadata that needs to be entered and wrangled.
While AI is already showing up in many aspects of video production, it feels like having it actually do the editing is quite a way off. I can see creating AI tools that help with editing: rough cut creation, recommending colour corrections or B-roll selection, suggesting changes to timing, etc. But there will still need to be a person doing the edit. It’s still an art form, and the editor plays such a crucial role in the actual storytelling of a film. It’ll be a long time before any AI or machine learning engine can replace that. And that’s a good thing.