TY - GEN
T1 - Answering regular path queries on workflow provenance
AU - Huang, Xiaocheng
AU - Bao, Zhuowei
AU - Davidson, Susan B.
AU - Milo, Tova
AU - Yuan, Xiaojie
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/5/26
Y1 - 2015/5/26
N2 - This paper proposes a novel approach for efficiently evaluating regular path queries over provenance graphs of workflows that may include recursion. The approach assumes that an execution g of a workflow G is labeled with query-agnostic reachability labels using an existing technique. At query time, given g, G and a regular path query R, the approach decomposes R into a set of subqueries R1,..., Rk that are safe for G. For each safe subquery Ri, G is rewritten so that, using the reachability labels of nodes in g, whether or not there is a path which matches Ri between two nodes can be decided in constant time. The results of each safe subquery are then composed, possibly with some small unsafe remainder, to produce an answer to R. The approach results in an algorithm that significantly reduces the number of subqueries k over existing techniques by increasing their size and complexity, and that evaluates each subquery in time bounded by its input and output size. Experimental results demonstrate the benefit of this approach.
AB - This paper proposes a novel approach for efficiently evaluating regular path queries over provenance graphs of workflows that may include recursion. The approach assumes that an execution g of a workflow G is labeled with query-agnostic reachability labels using an existing technique. At query time, given g, G and a regular path query R, the approach decomposes R into a set of subqueries R1,..., Rk that are safe for G. For each safe subquery Ri, G is rewritten so that, using the reachability labels of nodes in g, whether or not there is a path which matches Ri between two nodes can be decided in constant time. The results of each safe subquery are then composed, possibly with some small unsafe remainder, to produce an answer to R. The approach results in an algorithm that significantly reduces the number of subqueries k over existing techniques by increasing their size and complexity, and that evaluates each subquery in time bounded by its input and output size. Experimental results demonstrate the benefit of this approach.
UR - http://www.scopus.com/inward/record.url?scp=84940845000&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2015.7113299
DO - 10.1109/ICDE.2015.7113299
M3 - منشور من مؤتمر
T3 - Proceedings - International Conference on Data Engineering
SP - 375
EP - 386
BT - 2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
PB - IEEE Computer Society
T2 - 2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Y2 - 13 April 2015 through 17 April 2015
ER -