TY - GEN
T1 - Applying sorting networks to synthesize optimized sorting libraries
AU - Codish, Michael
AU - Cruz-Filipe, Luís
AU - Nebel, Markus
AU - Schneider-Kamp, Peter
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015/1/1
Y1 - 2015/1/1
N2 - This paper presents an application of the theory of sorting networks to facilitate the synthesis of optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. This enables further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that, if considering the number of comparisons and swaps, the theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. We provide empirical evidence that using code synthesized from efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
AB - This paper presents an application of the theory of sorting networks to facilitate the synthesis of optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. This enables further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that, if considering the number of comparisons and swaps, the theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. We provide empirical evidence that using code synthesized from efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
UR - http://www.scopus.com/inward/record.url?scp=84952801978&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/978-3-319-27436-2_8
DO - https://doi.org/10.1007/978-3-319-27436-2_8
M3 - Conference contribution
SN - 9783319274355
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 127
EP - 142
BT - Logic-Based Program Synthesis and Transformation - 25th International Symposium, LOPSTR 2015, Revised Selected Papers
A2 - Falaschi, Moreno
PB - Springer Verlag
T2 - 25th International Symposium on Logic-Based Program Synthesis and Transformation, LOPSTR 2015
Y2 - 13 July 2015 through 15 July 2015
ER -