TY - JOUR
T1 - REMI: A framework of reusable elements for mining heterogeneous data with missing information: A Tale of Congestion in Two Smart Cities
T2 - A framework of reusable elements for mining heterogeneous data with missing information: A Tale of Congestion in Two Smart Cities
AU - Gal, Avigdor
AU - Gunopulos, Dimitrios
AU - Panagiotou, Nikolaos
AU - Rivetti, Nicolo
AU - Senderovich, Arik
AU - Zygouras, Nikolas
N1 - Publisher Copyright: © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2018/10/1
Y1 - 2018/10/1
N2 - Applications targeting smart cities tackle common challenges; however, solutions are seldom portable from one city to another due to the heterogeneity of smart city ecosystems. A major obstacle involves the differences in the levels of available information. In this work, we present REMI, which is a mining framework that handles varying degrees of information availability by providing a meta-solution to missing data. The framework core concept is the REMI layered stack architecture, offering two complementary approaches to dealing with missing information, namely data enrichment (DARE) and graceful degradation (GRADE). DARE aims at inference of missing information levels, while GRADE attempts to mine the patterns using only the existing data. We show that REMI provides multiple ways for re-usability, while being fault tolerant and enabling incremental development. One may apply the architecture to different problem instantiations within the same domain, or deploy it across various domains. Furthermore, we introduce the other three components of the REMI framework backing the layered stack. To support decision making in this framework, we show a mapping of REMI into an optimization problem (OTP) that balances the trade-off between three costs: inaccuracies in inference of missing data (DARE), errors when using less information (GRADE), and gathering of additional data. Further, we provide an experimental evaluation of REMI using real-world transportation data coming from two European smart cities, namely Dublin and Warsaw.
AB - Applications targeting smart cities tackle common challenges; however, solutions are seldom portable from one city to another due to the heterogeneity of smart city ecosystems. A major obstacle involves the differences in the levels of available information. In this work, we present REMI, which is a mining framework that handles varying degrees of information availability by providing a meta-solution to missing data. The framework core concept is the REMI layered stack architecture, offering two complementary approaches to dealing with missing information, namely data enrichment (DARE) and graceful degradation (GRADE). DARE aims at inference of missing information levels, while GRADE attempts to mine the patterns using only the existing data. We show that REMI provides multiple ways for re-usability, while being fault tolerant and enabling incremental development. One may apply the architecture to different problem instantiations within the same domain, or deploy it across various domains. Furthermore, we introduce the other three components of the REMI framework backing the layered stack. To support decision making in this framework, we show a mapping of REMI into an optimization problem (OTP) that balances the trade-off between three costs: inaccuracies in inference of missing data (DARE), errors when using less information (GRADE), and gathering of additional data. Further, we provide an experimental evaluation of REMI using real-world transportation data coming from two European smart cities, namely Dublin and Warsaw.
KW - Complex patterns
KW - Enrichment
KW - Graceful degradation
KW - Mining
KW - Missing information
KW - Reusable elements
UR - http://www.scopus.com/inward/record.url?scp=85051707727&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/s10844-018-0524-5
DO - https://doi.org/10.1007/s10844-018-0524-5
M3 - Article
SN - 0925-9902
VL - 51
SP - 367
EP - 388
JO - Journal of Intelligent Information Systems
JF - Journal of Intelligent Information Systems
IS - 2
ER -