TY - GEN
T1 - GPUpIO
T2 - 9th Annual Workshop on General Purpose Processing Using Graphics Processing Unit, GPGPU 2016
AU - Zeno, Lior
AU - Mendelson, Avi
AU - Silberstein, Mark
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/3/12
Y1 - 2016/3/12
N2 - As GPUs become general purpose, they are outgrowing the coprocessor model and require convenient I/O abstractions such as files and network sockets. Recent studies have shown the benefits of native GPU I/O layers, in terms of both programmability and performance. However, due to lack of hardware support, the GPU threads performing I/O calls are forced to busy-wait for the completion of I/O operations, resulting in underutilized hardware, higher power consumption, and reduced system throughput. We argue that I/O-driven preemption improves the performance of existing solutions, despite many challenging system characteristics such as a large kernel state. We analyze the benefits of adding preemption support using a simple system performance model, and, encouraged by the results, explore the design of a software-based preemption mechanism for GPUs. In our prototype, GPUpIO, we implement a source-to-source compiler for state checkpoint and restoration, and a runtime library for scheduling preempted threadblocks, which together enable I/O-driven preemption for GPUs. We evaluate our prototype across a variety of system parameters and workloads to determine when preemption is worthwhile. We show that in some workloads the I/O-driven preemption approach may indeed double the effective system throughput by completely hiding the I/O latency behind computations. However, we also observe that the software-only solution is currently limited, not only due to its overheads, but also because it does not have sufficient control of the hardware scheduler queue and therefore may lead to starvation of I/O kernels. We then discuss a new hardware feature that, if added, may render a general I/O-driven preemption mechanism on GPUs practical.
KW - Accelerators
KW - File systems
KW - GPGPUs
KW - Operating systems design
KW - Source-to-source compilation
UR - http://www.scopus.com/inward/record.url?scp=84966522121&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/2884045.2884053
DO - https://doi.org/10.1145/2884045.2884053
M3 - Conference contribution
T3 - 9th Workshop on General Purpose Processing using GPUs, GPGPU 2016 - Proceedings
SP - 63
EP - 71
BT - 9th Workshop on General Purpose Processing using GPUs, GPGPU 2016 - Proceedings
A2 - Sun, Yifan
Y2 - 12 March 2016
ER -