Rakesh Malik delves into the history of virtual machines and how new solutions, such as Nvidia GRID 2.0, could impact workflows and post houses.
Nvidia recently announced GRID 2.0, a new take on an idea that's been pretty well established in supercomputing for decades. To understand what GRID 2.0 does and what it offers, we first need to understand what virtualization is.
Java made the idea of the virtual machine famous. The Java compiler compiled Java code into bytecodes, a form of machine language made for the Java virtual machine, or JVM. The JVM would load the bytecodes and compile them into machine language specific to the computer it was running on. Being a software implementation, it had a lot of overhead, so for many years Java performance wasn't in the ballpark of native software performance. Although that was enough for web and database applications that didn't require heavy-duty computing, Java couldn't take advantage of hardware like GPUs. Later, Microsoft's .NET platform expanded on this idea by compiling the intermediate language down to machine code. The platform also enabled developers to implement optimized libraries compiled directly to machine code, using a trusted code model to help prevent malicious code.
Nearly a decade ago, Intel implemented virtualization in hardware, taking a page out of the book of other high-end server platforms like HP's PA-RISC and IBM's POWER. Those platforms added buffering, switching and other services to the processor, essentially providing a virtual copy of the processor itself to an operating system. The virtualization transparently time-sliced entire operating systems, making it possible for an ordinary personal computer to run multiple operating systems at the same time. Intel demonstrated this technology by setting up a system running a Windows instance and a Linux instance at the same time. The Linux user was able to reboot the Linux instance without affecting the Windows user.
It's important to understand the difference between virtualization and emulation. Emulation enables software written for one computing platform to run on another, for example running Windows x86 applications on a PowerPC-based Mac. Virtualization doesn't do this. It allows multiple complete operating system instances to run on the same computer, but they still have to be compiled for the platform, as is the case with OS X and Windows 7/8/10 today. Virtualization allows instances of both operating systems to run concurrently, each being allocated computing resources as needed.
Back in the day when IBM was a powerhouse in supercomputing, it developed a technology called Scalable Power. This was a virtual machine technology, but applied at a larger scale. The hardware side consisted of a set of rack-mount Power workstations, with a proprietary switched interconnect organized in a hierarchy. A node could send a message to another node on the same rack with one network hop, to other nodes in its group with two network hops and to nodes in another group with three network hops.
This enabled the computing platform to scale to pretty much as many processors as one was willing to pay for and, since there was a maximum of three hops to send messages between nodes, it was possible to send data from one node to another quickly and with a fixed and known maximum wait time for delivering messages from one process to another.
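To make the hop hierarchy concrete, here is a minimal Python sketch of the rule described above. The `Node` type and its `group`, `rack` and `slot` fields are hypothetical names chosen for illustration; they are not taken from IBM's software.

```python
from typing import NamedTuple


class Node(NamedTuple):
    group: int  # hypothetical index of the group the rack belongs to
    rack: int   # hypothetical index of the rack within its group
    slot: int   # hypothetical position of the node within its rack


def hops(a: Node, b: Node) -> int:
    """Network hops between two nodes under the three-level hierarchy."""
    if a == b:
        return 0  # same node, nothing to send over the interconnect
    if a.group == b.group and a.rack == b.rack:
        return 1  # same rack
    if a.group == b.group:
        return 2  # same group, different rack
    return 3      # different group


if __name__ == "__main__":
    source = Node(group=0, rack=0, slot=0)
    for target in (Node(0, 0, 5), Node(0, 3, 1), Node(2, 1, 4)):
        print(target, hops(source, target))
```

However many nodes are added, the result never exceeds three, which is what gives the interconnect its fixed, known upper bound on delivery time.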
When deploying an application, a developer could select a set of nodes via an admin console. Pick a set of 32 nodes, deploy your application and it would run as if it were on a 32-processor computing cluster. The SP2 cluster could be hosting several such applications, each one running as if it had an entire cluster to itself.
Shared Desktop Infrastructure
Nvidia's new twist on these two concepts is to implement them at the GPU level. Nvidia has added hardware support for the same sort of virtual machine that Intel demonstrated at the Intel Developer Forum (IDF), in addition to partitioning similar to what IBM offered with Scalable Power.
The hardware side of GRID 2.0 consists of boards that sport two or four Nvidia GPUs that IT staff can deploy in a datacenter. The Nvidia GPUs have hardware virtualization support, so each GPU can be partitioned into several 'sub-GPUs' for applications to work with. How much of the GPU an application uses is configurable. For standard users, such as software developers whose tasks involve writing, compiling and debugging applications, the administrator can allocate a small slice of the GPU, since such applications don't generally use the GPU heavily. At the same time, users who are doing more GPU-intensive tasks like CAD can get larger slices of the GPU. It's also possible to allocate an entire GPU to a particularly compute- or GPU-heavy task, offering a wide range of flexibility.
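The partitioning idea can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the profile names, the slice sizes and the first-fit placement rule are hypothetical and do not reflect Nvidia's actual profiles, scheduler or management API, and a real product may impose constraints this sketch ignores.

```python
from fractions import Fraction

# Hypothetical slice sizes, expressed as fractions of one physical GPU.
PROFILES = {
    "developer": Fraction(1, 8),  # light GPU use: editing, compiling, debugging
    "cad": Fraction(1, 2),        # heavier interactive graphics work
    "compute": Fraction(1, 1),    # an entire GPU for one task
}


def place_users(users, gpu_count):
    """First-fit placement of (name, profile) pairs onto gpu_count GPUs.

    Returns a dict mapping each user name to a GPU index and raises
    ValueError if some user cannot be placed on any GPU.
    """
    free = [Fraction(1)] * gpu_count  # remaining capacity per GPU
    placement = {}
    for name, profile in users:
        need = PROFILES[profile]
        for gpu, remaining in enumerate(free):
            if remaining >= need:
                free[gpu] -= need
                placement[name] = gpu
                break
        else:
            raise ValueError(f"no GPU has room for {name} ({profile})")
    return placement


if __name__ == "__main__":
    users = [("ann", "developer"), ("bob", "cad"),
             ("cy", "compute"), ("di", "cad")]
    print(place_users(users, gpu_count=4))  # e.g. one four-GPU board
```

In this toy model the developer and the first CAD user share one GPU, the compute task gets a GPU to itself, and the second CAD user lands on a third GPU because the first no longer has half its capacity free.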
Since GPUs have a large number of processors that are geared toward computing tasks, Nvidia's GRID 2.0 enables an extremely high compute density. Where IBM's SP2 required several racks to support a thousand or more processors, 4-GPU GRID boards now make that possible with just a few PCI Express cards. On top of the higher compute density, interconnects are far faster now, so the processors can send each other messages and move data to and from main memory more quickly as well.
As mobile computing grows, the 'Bring Your Own Device' situation is becoming more and more common. One benefit of a technology like GRID 2.0 is that it gives users access to their applications via their personal devices, while keeping the data and applications on the corporate intranet. This makes securing corporate data easier for IT staff, but it has another major benefit: remote users can work with large data sets without spending time transferring vast amounts of data across the network.
With GRID 2.0, it's possible to run a GPU-compute application on a very high-end, shared GPU while interacting with it using pretty much any network connected device. The only data transferred over the intervening network is the rendered pixel data, or in other words, the user interface.