Abstract
The rapid growth of organizational data resources makes it increasingly difficult to locate and understand the relevant subsets within large datasets, which can be seen as a severe information-quality issue in today's decision-support environments. This study proposes a quantitative methodology, based on the mutual-information metric, for assessing the relative importance of different data subsets within a large dataset. Such assessments can give end-users faster access to relevant subsets, a better understanding of the dataset's contents, and deeper insights from analyzing it, e.g., when the dataset is used for Business Intelligence (BI) applications. This manuscript provides the background and motivation for integrating the proposed assessments of relative importance. It then defines the calculations behind the mutual-information metric and demonstrates their application using illustrative examples.
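As a rough illustration of the metric named in the abstract, the standard mutual information between two discrete variables can be estimated from co-occurrence counts. The sketch below shows only that textbook definition, I(X;Y) = Σ p(x,y) log2[p(x,y) / (p(x) p(y))]. It is not the paper's procedure for ranking data subsets, and the function name `mutual_information` and the sample records are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Estimate I(X;Y) in bits from a list of (x, y) observations.

    Hypothetical helper: uses plain empirical frequencies, with no
    smoothing or bias correction.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * log2(c * n / (px[x] * py[y]))
    return mi


# Made-up records: (region, sales level). Region fully determines the level.
dependent = [("north", "high"), ("north", "high"),
             ("south", "low"), ("south", "low")]

# Made-up records in which region tells us nothing about the level.
independent = [("north", "high"), ("north", "low"),
               ("south", "high"), ("south", "low")]

print(mutual_information(dependent))
print(mutual_information(independent))
```

In this toy data the first attribute pair carries one full bit of shared information and the second carries none, which is the kind of contrast an importance assessment built on mutual information would rely on.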
Original language | American English |
---|---|
State | Published - 1 Dec 2011 |
Event | 19th European Conference on Information Systems - ICT and Sustainable Service Development, ECIS 2011 - Helsinki, Finland. Duration: 9 Jun 2011 → 11 Jun 2011 |
Conference
Conference | 19th European Conference on Information Systems - ICT and Sustainable Service Development, ECIS 2011 |
---|---|
Country/Territory | Finland |
City | Helsinki |
Period | 9/06/11 → 11/06/11 |
Keywords
- Business Intelligence (BI)
- Data mining
- Data warehouse
- Mutual information
- On-line analytical processing (OLAP)
All Science Journal Classification (ASJC) codes
- Information Systems