Optimizing GPU virtualization with address mapping and delayed submission

Document Type

Conference Proceeding

Publication Date



© 2014 IEEE. The state-of-the-art GPU virtualization framework, gVirtuS, relies on an API remoting mechanism to set up a communication channel between a virtual machine and the host, so that a CUDA application in a virtual machine can be executed 'remotely' on the host. We observe that this API remoting mechanism often involves large-volume and frequent data transmissions between the host OS and the guest OS, which lead to significant performance degradation. We present an address mapping scheme that lets the host directly access the machine memory space of the guest and thus avoids data copying between the guest and the host. To reduce the frequency of data transmissions, we introduce a delayed submission scheme. We implement both address mapping and delayed submission in KVM. Our evaluation on a set of CUDA benchmarks shows that address mapping can improve performance over the original gVirtuS by up to 6.5 times. Delayed submission further reduces the virtualization overhead by half in a pathological case.
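
The abstract does not spell out how delayed submission works, so the following is only a toy model of the general idea of reducing transmission frequency, not the paper's implementation. It assumes that forwarded API calls can be queued in the guest and sent together at a synchronization point. All names here (Channel, ImmediateFrontend, DelayedFrontend, run) are hypothetical and are not part of gVirtuS, KVM, or CUDA.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical stand-in for the guest-to-host channel; counts transmissions."""
    transmissions: int = 0
    delivered: list = field(default_factory=list)

    def send(self, calls):
        # One transmission carries one or more forwarded API calls.
        self.transmissions += 1
        self.delivered.extend(calls)


class ImmediateFrontend:
    """Forwards every API call to the host as soon as it is issued."""

    def __init__(self, channel):
        self.channel = channel

    def call(self, name, *args):
        self.channel.send([(name, args)])

    def sync(self):
        self.channel.send([("sync", ())])


class DelayedFrontend:
    """Queues calls in the guest and forwards them together at a sync point.

    Assumption: queued calls need no result before the next sync.
    """

    def __init__(self, channel):
        self.channel = channel
        self.pending = []

    def call(self, name, *args):
        self.pending.append((name, args))

    def sync(self):
        self.pending.append(("sync", ()))
        self.channel.send(self.pending)
        self.pending = []


def run(frontend, launches=100):
    """Issue `launches` calls followed by one sync; return the transmission count."""
    for i in range(launches):
        frontend.call("launch_kernel", i)
    frontend.sync()
    return frontend.channel.transmissions
```

In this model both front ends deliver the same calls in the same order, and only the number of guest-to-host transmissions differs. Calls whose results the application needs immediately could not be deferred this way, which is one reason the benefit depends on the workload.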

Publication Title

Proceedings - 16th IEEE International Conference on High Performance Computing and Communications, HPCC 2014, 11th IEEE International Conference on Embedded Software and Systems, ICESS 2014 and 6th International Symposium on Cyberspace Safety and Security, CSS 2014