Finding rare numerical stability errors in concurrent computations

Hana Chockler, Karine Even, Eran Yahav

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

A numerical algorithm is called stable if an error, in all possible executions of the algorithm, does not exceed a predefined bound. Introduction of concurrency to numerical algorithms results in a significant increase in the number of possible computations of the same result, due to different possible interleavings of concurrent threads. This can lead to instability of previously stable algorithms, since rounding can result in a larger error than expected for some interleavings. Such errors can be very rare, since the particular combination of rounding can occur in only a small fraction of interleavings. In this paper, we apply the cross-entropy method - a generic approach to rare event simulation and combinatorial optimization - to detect rare numerical instability in concurrent programs. The cross-entropy method iteratively samples a small number of executions and adjusts the probability distribution of possible scheduling decisions to increase the probability of encountering an error in a subsequent iteration. We demonstrate the effectiveness of our approach on implementations of several numerical algorithms with concurrency and rounding by truncation of intermediate computations. We describe several abstraction algorithms on top of the implementation of the cross-entropy method and show that with abstraction, our algorithms successfully find rare errors in programs with hundreds of threads. In fact, some of our abstractions lead to a state space whose size does not depend on the number of threads at all. We compare our approach to several existing testing algorithms and argue that its performance is superior to other techniques.
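The sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The two-thread toy program, the constants, and every name in it (`N_STEPS`, `run`, `sample`, `cross_entropy_search`) are assumptions made for illustration. Thread 0 adds 7 to a shared value and thread 1 multiplies it by 0.75 with truncation, so the accumulated rounding error depends on the interleaving. The search keeps one probability per scheduling step, samples a batch of schedules, and shifts the probabilities toward the schedules with the largest error.

```python
import random
from fractions import Fraction

N_STEPS = 6  # steps per thread (hypothetical toy size)


def run(schedule):
    """Replay one interleaving and return |exact - truncated| for the shared value."""
    exact, trunc = Fraction(1), 1
    for thread in schedule:
        if thread == 0:
            # Thread 0: x = x + 7 (no rounding involved).
            exact, trunc = exact + 7, trunc + 7
        else:
            # Thread 1: x = 0.75 * x, truncated to an integer.
            exact, trunc = exact * Fraction(3, 4), (trunc * 3) // 4
    return abs(exact - trunc)


def sample(p, rng):
    """Draw one schedule; p[i] is the probability of running thread 0 at step i."""
    left = [N_STEPS, N_STEPS]
    schedule = []
    for i in range(2 * N_STEPS):
        if left[0] == 0:
            t = 1                      # thread 0 has finished
        elif left[1] == 0:
            t = 0                      # thread 1 has finished
        else:
            t = 0 if rng.random() < p[i] else 1
        left[t] -= 1
        schedule.append(t)
    return schedule


def cross_entropy_search(iterations=20, batch=50, elite_frac=0.2,
                         smooth=0.7, seed=0):
    """Search for a high-error interleaving; return (error, schedule)."""
    rng = random.Random(seed)
    p = [0.5] * (2 * N_STEPS)          # start from a uniform scheduler
    n_elite = max(1, int(batch * elite_frac))
    best_err, best_sched = Fraction(-1), None
    for _ in range(iterations):
        schedules = [sample(p, rng) for _ in range(batch)]
        scored = sorted(((run(s), s) for s in schedules),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_err:
            best_err, best_sched = scored[0]
        elite = [s for _, s in scored[:n_elite]]
        for i in range(len(p)):
            # Fraction of elite schedules that ran thread 0 at step i.
            freq = sum(1 for s in elite if s[i] == 0) / n_elite
            # Smoothed update keeps probabilities away from exactly 0 or 1.
            p[i] = (1 - smooth) * p[i] + smooth * freq
    return best_err, best_sched


if __name__ == "__main__":
    err, sched = cross_entropy_search()
    print(float(err), sched)
```

Exact arithmetic with `Fraction` serves as the reference value here, so the reported error is attributable only to truncation. A realistic setting would replace `run` with an instrumented execution of the program under test and would need a richer scheduling distribution than one probability per step.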

Original language: English
Title of host publication: 2013 International Symposium on Software Testing and Analysis, ISSTA 2013 - Proceedings
Pages: 12-22
Number of pages: 11
State: Published - 2013
Event: 22nd International Symposium on Software Testing and Analysis, ISSTA 2013 - Lugano, Switzerland
Duration: 15 Jul 2013 to 20 Jul 2013

Publication series

Name: 2013 International Symposium on Software Testing and Analysis, ISSTA 2013 - Proceedings

Conference

Conference: 22nd International Symposium on Software Testing and Analysis, ISSTA 2013
Country/Territory: Switzerland
City: Lugano
Period: 15/07/13 to 20/07/13

Keywords

  • cross-entropy
  • numerical stability
  • testing

All Science Journal Classification (ASJC) codes

  • Software
