TY - GEN
T1 - Fault-Tolerant Parallel Integer Multiplication
AU - Nissim, Roy
AU - Schwartz, Oded
AU - Spiizer, Yuval
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/6/17
Y1 - 2024/6/17
N2 - Exascale machines have a small mean time between failures, necessitating fault tolerance. Out-of-the-box fault-tolerant solutions, such as checkpoint-restart and replication, apply to any algorithm but incur significant overhead costs. Long integer multiplication is a fundamental kernel in numerical linear algebra and cryptography. The naïve, schoolbook multiplication algorithm runs in Θ(n²), while Toom-Cook-κ algorithms run in Θ(n^(log_κ(2κ−1))) for 2 ≤ κ. We obtain the first efficient fault-tolerant parallel Toom-Cook algorithm. While asymptotically faster FFT-based algorithms exist, Toom-Cook algorithms are often favored in practice at small scale and on supercomputers. Our algorithm enables fault tolerance with negligible overhead costs. Compared to existing, general-purpose, fault-tolerant solutions, our algorithm reduces the arithmetic and communication (bandwidth) overhead costs by a factor of Θ(P/(2κ−1)) (where P is the number of processors). To this end, we adapt the fault-tolerant BFS-DFS method of Birnbaum et al. (2020) for fast matrix multiplication and combine it with a coding strategy tailored for Toom-Cook. This eliminates the need for recomputations, resulting in a much faster algorithm.
AB - Exascale machines have a small mean time between failures, necessitating fault tolerance. Out-of-the-box fault-tolerant solutions, such as checkpoint-restart and replication, apply to any algorithm but incur significant overhead costs. Long integer multiplication is a fundamental kernel in numerical linear algebra and cryptography. The naïve, schoolbook multiplication algorithm runs in Θ(n²), while Toom-Cook-κ algorithms run in Θ(n^(log_κ(2κ−1))) for 2 ≤ κ. We obtain the first efficient fault-tolerant parallel Toom-Cook algorithm. While asymptotically faster FFT-based algorithms exist, Toom-Cook algorithms are often favored in practice at small scale and on supercomputers. Our algorithm enables fault tolerance with negligible overhead costs. Compared to existing, general-purpose, fault-tolerant solutions, our algorithm reduces the arithmetic and communication (bandwidth) overhead costs by a factor of Θ(P/(2κ−1)) (where P is the number of processors). To this end, we adapt the fault-tolerant BFS-DFS method of Birnbaum et al. (2020) for fast matrix multiplication and combine it with a coding strategy tailored for Toom-Cook. This eliminates the need for recomputations, resulting in a much faster algorithm.
KW - fault tolerance
KW - i/o complexity
KW - long integer multiplication
KW - parallel computing
KW - toom-cook
UR - http://www.scopus.com/inward/record.url?scp=85197451035&partnerID=8YFLogxK
U2 - 10.1145/3626183.3659961
DO - 10.1145/3626183.3659961
M3 - Conference contribution
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 207
EP - 218
BT - SPAA 2024 - Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures
T2 - 36th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2024
Y2 - 17 June 2024 through 21 June 2024
ER -