TY - GEN
T1 - Matrix multiplication I/O-complexity by path routing
AU - Scott, Jacob
AU - Holtz, Olga
AU - Schwartz, Oded
N1 - Publisher Copyright: Copyright © 2015 ACM.
PY - 2015/6/13
Y1 - 2015/6/13
N2 - We apply a novel technique based on path routings to obtain optimal I/O-complexity lower bounds for all Strassen-like fast matrix multiplication algorithms computed in serial or in parallel, assuming no reuse of nontrivial intermediate linear combinations. Given fast memory of size M, we prove an I/O-complexity lower bound of $\Omega((n/\sqrt{M})^{\omega_0} \cdot M)$ for any Strassen-like matrix multiplication algorithm applied to n × n matrices of arithmetic complexity $\Omega(n^{\omega_0})$ with $\omega_0 < 3$ under this assumption. This generalizes an approach by Ballard, Demmel, Holtz, and Schwartz that provides a tight lower bound for Strassen's matrix multiplication algorithm but which does not apply to algorithms with disconnected encoding or decoding components of the underlying computation graph or algorithms with multiply copied values. We overcome these challenges via a new graph-theoretical approach for proving I/O-complexity lower bounds without the use of edge expansions.
KW - Communication-avoiding algorithms
KW - Fast matrix multiplication
KW - I/O-complexity
UR - http://www.scopus.com/inward/record.url?scp=84950303470&partnerID=8YFLogxK
U2 - 10.1145/2755573.2755594
DO - 10.1145/2755573.2755594
M3 - Conference contribution
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 35
EP - 45
BT - SPAA 2015 - Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures
T2 - 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2015
Y2 - 13 June 2015 through 15 June 2015
ER -