Yes, up to 64GB of memory is impressive, but the real step change that the latest Mac Pro ushered in was the sheer number of Floating Point Operations Per Second it could crank through. Guest Author Tommy Byrd on the importance of the FLOP.
By Tommy Byrd
When the new Mac Pro was released, it ushered in a whole new era for the desktop computer. Fresh design, unique engineering, and a very simple look. It also came with some pretty impressive specs, including one that typically only gets mentioned when shopping for a new supercomputer: 7 Teraflops. In January, nVidia also impressed us when they announced their new mobile processor, the Tegra X1, bringing a teraflop to mobile devices. But what does that number mean, and why should our readers pay attention to it?
What is a FLOP anyway?
FLOPS stands for Floating Point Operations Per Second and is a term used to measure a computer’s ability to do math; specifically, its ability to do complicated math using numbers with lots of decimal places. That’s what the “Point” in Floating Point refers to: the decimal point. Why is this important? The real world doesn’t work in simple integers (numbers without decimal places). Scientific calculations – especially ones that try to map out what’s going on in the real world – require a computer that can work with numbers that have many decimal places. The more of these calculations a computer can do each second, the finer-grained, and often the more accurate, the results it can produce in a given amount of time. The most common example people talk about here is Pi, since it's one of the few non-integer numbers most people know.
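To make the integer versus floating-point distinction concrete, here is a minimal Python sketch. Python is simply our choice for illustration, and the variable names are made up for this example. It also shows that a floating-point number stores a finite number of digits, so results are very close approximations and not exact values.

```python
import math

# Integer math: exact, no decimal places involved.
whole = 7 // 2            # floor division keeps only the whole part

# Floating-point math: the result keeps its fractional part.
fraction = 7 / 2

# Pi as a standard 64-bit float: only about 15 to 17 significant digits fit.
pi_approx = math.pi

# Floats are approximations, so tiny rounding errors can appear.
tenths = 0.1 + 0.2
is_exact = (tenths == 0.3)              # False with IEEE 754 doubles
is_close = math.isclose(tenths, 0.3)    # True: equal within a small tolerance

print(whole, fraction, pi_approx, tenths, is_exact, is_close)
```

Each float addition or division above is a single floating-point operation; a teraflop is a trillion of those every second.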
General-purpose CPUs (like a Core i7 or Xeon) are not known for their great FLOPS performance. This is by design, since the main CPU in a computer has a lot of very simple tasks to do. Things like reading data from a hard drive and loading it into memory involve very little math in comparison to the tasks handled by GPUs made by nVidia and AMD. In a video game, if you move a character forward on the screen, the GPU does a bunch of quick math to decide how to display that new position on your screen.
This sounds simple, but remember that all this has to happen in realtime at smooth frame rates (30FPS+). In modern computers, even low-end displays have full HD resolution, which means 1920×1080 pixels on the screen at all times. This means the GPU is performing these complex math operations for 2 million+ pixels 30+ times per second. That’s over 62 million pixels being calculated every single second. Sometimes those calculations are very simple (the user interface as an example), but the more depth there is and the more realistic the resulting image, the more floating point math there is to calculate.
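The arithmetic behind that figure can be checked in a few lines. This is only a sketch of the pixel count, using the full HD resolution and 30FPS frame rate quoted above; it says nothing about how many floating-point operations each pixel actually needs.

```python
# Full HD frame size and the minimum smooth frame rate quoted above.
WIDTH, HEIGHT = 1920, 1080
FPS = 30

pixels_per_frame = WIDTH * HEIGHT            # 2,073,600 pixels
pixels_per_second = pixels_per_frame * FPS   # 62,208,000 pixels

print(f"{pixels_per_frame:,} pixels per frame")
print(f"{pixels_per_second:,} pixels per second at {FPS} FPS")
```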
How FLOPS relate to video production
Historically, you didn't need a dedicated GPU just to play back video. As long as you made sure your disks could feed the data fast enough, the video would display on the screen; there’s no real complex math going on there. As we started getting higher definition video, our disks couldn’t keep up with the amount of data we were asking of them, and keeping all that uncompressed video on our hard drives was simply out of the question.
The answer was compression, which works great to keep the quality of video high while keeping data rates and file sizes low, but now your computer has to perform some floating-point operations on that compressed data to display it properly. The CPU can mostly keep up with this, but remember that your CPU still has to do all kinds of other things to keep the computer running, and since it’s not great at FLOPS, anything else it has to do will make video playback suffer.
This is where hardware-accelerated video playback comes in. Offloading video playback to the GPU means your CPU is freed up to do the more menial tasks while the GPU can do what it does best – do some math to figure out how to display pixels on the screen. Actually, just decompressing video and playing it back is a pretty easy task for the GPU. Even a five-year-old nVidia GTX470 has the power to play back multiple compressed HD streams in realtime, and it’s only capable of 1 teraflop.
Modern video software is doing a lot more than just playing back video, though, especially when it comes to certain effects in programs like Premiere and Resolve that are listed as hardware accelerated. Color correction, grading, transitions and 3D camera motion are all effects that send complex sets of instructions to the GPU to transform the pixels of the original image. Some of these calculations are more complex than others, and most projects will include several layers of effects, creating more and more complex sets of instructions that need to be calculated in realtime. These calculations aren’t anywhere near as complex as those in your favourite 3D rendering software or the newest computer games, but when you start talking about editing 6K Dragon footage in realtime, it can start to tax even the most powerful GPUs.
The more teraflops a computer has, then, the more you can play with different stacked effects, masks, and simultaneous streams without fear of slowing things down. This means doing live colour grading on-set and less frustration in post, which can lead to a better product. If you've ever been up against a deadline, you know that sometimes you have to decide to deliver a less polished product just to make sure you can render the final file fast enough. The more powerful the GPU, the less often you have to make those kinds of decisions.
Enter the Mac Pro supercomputer
Being capable of 7+ Teraflops definitely makes the Mac Pro able to play back smooth 4K video with complex colour correction and grading without breaking a sweat. The fact that Apple has put that much power into something that looks like a bathroom trash can is nothing short of amazing. At the very least, it now allows us to focus on more important numbers that relate to how a workstation will be able to handle realtime video processing. Performance measured in FLOPS is no longer only important to big room-sized supercomputers, mainly because now we can buy a GPU off the shelf that has more floating point performance than existed in the world when HD video was first introduced.
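For a rough sense of what 7 teraflops buys you at 4K, the sketch below divides that peak figure by the number of pixels drawn each second. The assumptions are ours, not Apple's: 4K is taken as UHD 3840×2160, playback runs at 30FPS, and the GPU is treated as if it sustained its advertised peak, which real workloads do not. The function name `flop_budget_per_pixel` is hypothetical, made up for this example.

```python
def flop_budget_per_pixel(peak_flops, width, height, fps):
    """Theoretical peak floating-point operations available per pixel per frame."""
    return peak_flops / (width * height * fps)

# Assumptions for illustration only: UHD 3840x2160 at 30 FPS, and a GPU
# that sustains its full advertised peak (an upper bound, not a measurement).
MAC_PRO_PEAK = 7e12   # 7 teraflops, the figure quoted in the article
budget = flop_budget_per_pixel(MAC_PRO_PEAK, 3840, 2160, 30)

print(f"About {budget:,.0f} operations per pixel per frame")
```

Under those assumptions the ceiling works out to a little over 28,000 operations per pixel per frame, which gives some intuition for why several stacked effects can still run in realtime.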