TY - GEN
T1 - Pay-as-you-go reconciliation in schema matching networks
AU - Nguyen, Quoc Viet Hung
AU - Nguyen, Thanh Tam
AU - Miklós, Zoltán
AU - Aberer, Karl
AU - Gal, Avigdor
AU - Weidlich, Matthias
PY - 2014
Y1 - 2014
AB - Schema matching is the process of establishing correspondences between the attributes of database schemas for data integration purposes. Although several automatic schema matching tools have been developed, their results are often incomplete or erroneous. To obtain a correct set of correspondences, a human expert is usually required to validate the generated correspondences. We analyze this reconciliation process in a setting where a number of schemas needs to be matched, in the presence of consistency expectations about the network of attribute correspondences. We develop a probabilistic model that helps to identify the most uncertain correspondences, thus allowing us to guide the expert's work and collect his input about the most problematic cases. As the availability of such experts is often limited, we develop techniques that can construct a set of good quality correspondences with a high probability, even if the expert does not validate all the necessary correspondences. We demonstrate the efficiency of our techniques through extensive experimentation using real-world datasets.
UR - http://www.scopus.com/inward/record.url?scp=84901792048&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/ICDE.2014.6816653
DO - https://doi.org/10.1109/ICDE.2014.6816653
M3 - Conference contribution
SN - 9781479925544
T3 - Proceedings - International Conference on Data Engineering
SP - 220
EP - 231
BT - 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014
T2 - 30th IEEE International Conference on Data Engineering, ICDE 2014
Y2 - 31 March 2014 through 4 April 2014
ER -