Probabilistic XML: Models and complexity

Benny Kimelfeld, Pierre Senellart

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Uncertainty in data naturally arises in various applications, such as data integration and Web information extraction. Probabilistic XML is one of the concepts that have been proposed to model and manage various kinds of uncertain data. In essence, a probabilistic XML document is a compact representation of a probability distribution over ordinary XML documents. Various models of probabilistic XML provide different languages, with various degrees of expressiveness, for such compact representations. Beyond representation, probabilistic XML systems are expected to support data management in a way that properly reflects the uncertainty. For instance, query evaluation entails probabilistic inference, and update operations need to properly change the entire probability space. Efficiently and effectively accomplishing data-management tasks in that manner is a major technical challenge. This chapter reviews the literature on probabilistic XML. Specifically, this chapter discusses the probabilistic XML models that have been proposed, and the complexity of query evaluation therein. Also discussed are other data-management tasks like updates and compression, as well as systemic and implementation aspects.
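
As an illustration of the idea that a probabilistic XML document compactly encodes a distribution over ordinary documents, the following minimal sketch assumes a deliberately simplified model in which every child of a node is kept independently with a given probability. The tuple encoding, the function names (`worlds`, `contains`, `query_probability`) and the example document `pdoc` are hypothetical and are not taken from the chapter; the sketch evaluates a query by brute-force enumeration of possible worlds, which is not one of the efficient algorithms the chapter surveys.

```python
from itertools import product

# Assumed encoding (hypothetical, for illustration only):
# a node is (label, [(probability, child_node), ...]) and each child
# is kept independently of the others with the stated probability.


def worlds(node):
    """Enumerate (document, probability) pairs for the subtree rooted at node.

    A document is (label, tuple_of_child_documents). The same document can
    appear more than once if different choices produce identical trees.
    """
    label, edges = node
    options = []
    for p, child in edges:
        # Either the child is absent (probability 1 - p) or it is present
        # and takes one of its own possible worlds (probability p * q).
        opts = [(None, 1.0 - p)]
        opts += [(doc, p * q) for doc, q in worlds(child)]
        options.append(opts)

    result = []
    for combo in product(*options):
        prob = 1.0
        kids = []
        for doc, q in combo:
            prob *= q
            if doc is not None:
                kids.append(doc)
        if prob > 0:  # skip impossible worlds
            result.append(((label, tuple(kids)), prob))
    return result


def contains(doc, label):
    """Boolean query: does some node of the document carry this label?"""
    return doc[0] == label or any(contains(kid, label) for kid in doc[1])


def query_probability(pdoc, label):
    """Probability that a random world of pdoc satisfies the query."""
    return sum(p for doc, p in worlds(pdoc) if contains(doc, label))


# Hypothetical example: a person whose phone and email were extracted
# with different confidence levels.
pdoc = ("person", [
    (1.0, ("name", [])),
    (0.7, ("phone", [])),
    (0.4, ("email", [])),
])

if __name__ == "__main__":
    for doc, p in worlds(pdoc):
        print(doc, p)
    print(query_probability(pdoc, "phone"))
```

The number of enumerated worlds can grow exponentially with the number of uncertain children, which is one way to see why the complexity of query evaluation is a central concern for these models.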

Original language: English
Title of host publication: Advances in Probabilistic Databases for Uncertain Information Management
Pages: 39-66
Number of pages: 28
State: Published - 2013
Externally published: Yes

Publication series

Name: Studies in Fuzziness and Soft Computing
Volume: 304

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Computational Mathematics
