TY - GEN
T1 - Entity matching in online social networks
AU - Peled, Olga
AU - Fire, Michael
AU - Rokach, Lior
AU - Elovici, Yuval
PY - 2013/12/1
Y1 - 2013/12/1
N2 - In recent years, Online Social Networks (OSNs) have essentially become an integral part of our daily lives. There are hundreds of OSNs, each with its own focus and offers for particular services and functionalities. To take advantage of the full range of services and functionalities that OSNs offer, users often create several accounts on various OSNs using the same or different personal information. Retrieving all available data about an individual from several OSNs and merging it into one profile can be useful for many purposes. In this paper, we present a method for solving the Entity Resolution (ER), problem for matching user profiles across multiple OSNs. Our algorithm is able to match two user profiles from two different OSNs based on machine learning techniques, which uses features extracted from each one of the user profiles. Using supervised learning techniques and extracted features, we constructed different classifiers, which were then trained and used to rank the probability that two user profiles from two different OSNs belong to the same individual. These classifiers utilized 27 features of mainly three types: name based features (i.e., the Soundex value of two names), general user info based features (i.e., the cosine similarity between two user profiles), and social network topological based features (i.e., the number of mutual friends between two users' friends list). This experimental study uses real-life data collected from two popular OSNs, Facebook and Xing. The proposed algorithm was evaluated and its classification performance measured by AUC was 0.982 in identifying user profiles across two OSNs.
AB - In recent years, Online Social Networks (OSNs) have essentially become an integral part of our daily lives. There are hundreds of OSNs, each with its own focus and offers for particular services and functionalities. To take advantage of the full range of services and functionalities that OSNs offer, users often create several accounts on various OSNs using the same or different personal information. Retrieving all available data about an individual from several OSNs and merging it into one profile can be useful for many purposes. In this paper, we present a method for solving the Entity Resolution (ER), problem for matching user profiles across multiple OSNs. Our algorithm is able to match two user profiles from two different OSNs based on machine learning techniques, which uses features extracted from each one of the user profiles. Using supervised learning techniques and extracted features, we constructed different classifiers, which were then trained and used to rank the probability that two user profiles from two different OSNs belong to the same individual. These classifiers utilized 27 features of mainly three types: name based features (i.e., the Soundex value of two names), general user info based features (i.e., the cosine similarity between two user profiles), and social network topological based features (i.e., the number of mutual friends between two users' friends list). This experimental study uses real-life data collected from two popular OSNs, Facebook and Xing. The proposed algorithm was evaluated and its classification performance measured by AUC was 0.982 in identifying user profiles across two OSNs.
KW - Entity resolution
KW - Machine learning
KW - Online Social Networks
UR - http://www.scopus.com/inward/record.url?scp=84893634279&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/SocialCom.2013.53
DO - https://doi.org/10.1109/SocialCom.2013.53
M3 - Conference contribution
SN - 9780769551371
T3 - Proceedings - SocialCom/PASSAT/BigData/EconCom/BioMedCom 2013
SP - 339
EP - 344
BT - Proceedings - SocialCom/PASSAT/BigData/EconCom/BioMedCom 2013
T2 - 2013 ASE/IEEE Int. Conf. on Social Computing, SocialCom 2013, the 2013 ASE/IEEE Int. Conf. on Big Data, BigData 2013, the 2013 Int. Conf. on Economic Computing, EconCom 2013, the 2013 PASSAT 2013, and the 2013 ASE/IEEE Int. Conf. on BioMedCom 2013
Y2 - 8 September 2013 through 14 September 2013
ER -