Abstract
We describe how we evolve Omid, a transaction processing system for Apache HBase, to power Apache Phoenix, a cloud-grade real-time SQL analytics engine. Omid was originally designed for data processing pipelines at Yahoo, which are, by and large, throughput-oriented monolithic NoSQL applications. Providing a platform to support converged real-time transaction processing and analytics applications - dubbed translytics - introduces new functional and performance requirements. For example, SQL support is key for developer productivity, multi-tenancy is essential for cloud deployment, and latency is cardinal for just-in-time data ingestion and analytics insights. We discuss our efforts to adapt Omid to these new domains, as part of the process of integrating it into Phoenix as the transaction processing backend. A central piece of our work is latency reduction in Omid's protocol, which also improves scalability. Under light load, the new protocol's latency is 4x to 5x smaller than the legacy Omid's, whereas under increased loads it is an order of magnitude faster. We further describe a fast path protocol for single-key transactions, which enables processing them almost as fast as native HBase operations.
Original language | English |
---|---|
Pages (from-to) | 1795-1808 |
Number of pages | 14 |
Journal | Proceedings of the VLDB Endowment |
Volume | 11 |
Issue number | 12 |
State | Published - Aug 2018 |
Event | 44th International Conference on Very Large Data Bases, VLDB 2018 - Rio de Janeiro, Brazil. Duration: 27 Aug 2018 → 31 Aug 2018 |
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
- General Computer Science