The Importance of Parameters in Database Queries

Martin Grohe, Benny Kimelfeld, Peter Lindner, Christoph Standke

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We propose and study a framework for quantifying the importance of the choices of parameter values to the result of a query over a database. These parameters occur as constants in logical queries, such as conjunctive queries. In our framework, the importance of a parameter is its Shap score. This score is a popular instantiation of the game-theoretic Shapley value to measuring the importance of feature values in machine learning models. We make the case for the rationale of using this score by explaining the intuition behind Shap, and by showing that we arrive at this score in two different, apparently opposing, approaches to quantifying the contribution of a parameter. The application of the Shap score requires two components in addition to the query and the database: (a) a probability distribution over the combinations of parameter values, and (b) a utility function that measures the similarity between the result for the original parameters and the result for hypothetical parameters. The main question addressed in the paper is the complexity of calculating the Shap score for different distributions and similarity measures. We first address the case of probabilistically independent parameters. The problem is hard if we consider a fragment of queries that is hard to evaluate (as one would expect), and even for the fragment of acyclic conjunctive queries. In some cases, though, one can efficiently list all relevant parameter combinations, and then the Shap score can be computed in polynomial time under reasonable general conditions. Also tractable is the case of full acyclic conjunctive queries for certain (natural) similarity functions. We extend our results to conjunctive queries with inequalities between variables and parameters. Finally, we discuss a simple approximation technique for the case of correlated parameters.
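To make the ingredients named in the abstract concrete, here is a minimal brute-force sketch for the case of independent parameters. It uses the standard SHAP definition (the value of a parameter subset is the expected similarity when those parameters keep their original values and the rest are resampled); the paper's exact formalization may differ in details. The toy relation `ROWS`, the two-parameter selection query, the choice of Jaccard similarity, and all function names are hypothetical illustrations, not taken from the paper. The enumeration is exponential in the number of parameters, so this is only an illustration, not one of the paper's algorithms.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

# Hypothetical toy relation with tuples (item, category, color).
ROWS = [
    ("a", "shoe", "red"),
    ("b", "shoe", "blue"),
    ("c", "hat", "red"),
    ("d", "hat", "blue"),
]

def query(params):
    """Items whose category and color equal the two query parameters."""
    category, color = params
    return frozenset(r[0] for r in ROWS if r[1] == category and r[2] == color)

def jaccard(x, y):
    """Similarity of two query results; two empty results count as identical."""
    union = x | y
    return Fraction(1) if not union else Fraction(len(x & y), len(union))

def shap_scores(reference, marginals):
    """Exact SHAP score of each parameter, assuming independent parameters.

    reference: tuple of the original parameter values.
    marginals: one dict per parameter, mapping value -> probability.
    """
    n = len(reference)
    target = query(reference)

    def value(fixed):
        # Expected similarity when parameters in `fixed` keep their
        # reference values and the others are drawn from their marginals.
        free = [i for i in range(n) if i not in fixed]
        total = Fraction(0)
        for combo in product(*(marginals[i].items() for i in free)):
            params = list(reference)
            prob = Fraction(1)
            for i, (val, p) in zip(free, combo):
                params[i] = val
                prob *= p
            total += prob * jaccard(target, query(tuple(params)))
        return total

    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        score = Fraction(0)
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude i.
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                s = set(subset)
                score += weight * (value(s | {i}) - value(s))
        scores.append(score)
    return scores

half = Fraction(1, 2)
uniform = [{"shoe": half, "hat": half}, {"red": half, "blue": half}]
print(shap_scores(("shoe", "red"), uniform))  # [Fraction(3, 8), Fraction(3, 8)]
```

In this symmetric example both parameters receive the same score, and the scores sum to 3/4, the gap between the similarity with all parameters fixed (1) and the expected similarity with none fixed (1/4).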

Original language: English
Title of host publication: 27th International Conference on Database Theory, ICDT 2024
Editors: Graham Cormode, Michael Shekelyan
ISBN (Electronic): 9783959773126
DOIs
State: Published - Mar 2024
Event: 27th International Conference on Database Theory, ICDT 2024 - Paestum, Italy
Duration: 25 Mar 2024 → 28 Mar 2024

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 290

Conference

Conference: 27th International Conference on Database Theory, ICDT 2024
Country/Territory: Italy
City: Paestum
Period: 25/03/24 → 28/03/24

Keywords

  • query parameters
  • SHAP score
  • Shapley value

All Science Journal Classification (ASJC) codes

  • Software
