TY - GEN
T1 - A Formal Language Perspective on Factorized Representations
AU - Kimelfeld, Benny
AU - Martens, Wim
AU - Niewerth, Matthias
N1 - Publisher Copyright: © Benny Kimelfeld, Wim Martens, and Matthias Niewerth.
PY - 2025/3/21
Y1 - 2025/3/21
AB - Factorized representations (FRs) are a well-known tool for succinctly representing the results of join queries and were originally defined in the named database perspective. We define FRs in the unnamed database perspective and use them to establish several new connections. First, unnamed FRs can be exponentially more succinct than named FRs, but this difference can be alleviated by imposing a disjointness condition on columns. Conversely, named FRs can also be exponentially more succinct than unnamed FRs. Second, unnamed FRs are the same as (i.e., isomorphic to) context-free grammars for languages in which each word has the same length. This tight connection allows us to transfer a wide range of results on context-free grammars to database factorization, a selection of which we present in the paper. Third, when we generalize unnamed FRs to arbitrary sets of tuples, they become a generalization of path multiset representations, a formalism that was recently introduced to succinctly represent sets of paths in the context of graph database query evaluation.
KW - compact representations
KW - Databases
KW - factorized databases
KW - graph databases
KW - regular path queries
KW - relational databases
UR - http://www.scopus.com/inward/record.url?scp=105001565265&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICDT.2025.20
DO - 10.4230/LIPIcs.ICDT.2025.20
M3 - Conference contribution
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 28th International Conference on Database Theory, ICDT 2025
A2 - Roy, Sudeepa
A2 - Kara, Ahmet
T2 - 28th International Conference on Database Theory, ICDT 2025
Y2 - 25 March 2025 through 28 March 2025
ER -