TY - GEN
T1 - Yahoo! music recommendations
T2 - 5th ACM Conference on Recommender Systems, RecSys 2011
AU - Dror, Gideon
AU - Koenigstein, Noam
AU - Koren, Yehuda
PY - 2011
Y1 - 2011
N2 - In the past decade, large-scale recommendation datasets were published and extensively studied. In this work we describe a detailed analysis of a sparse, large-scale dataset, specifically designed to push the envelope of recommender system models. The Yahoo! Music dataset consists of more than a million users, 600 thousand musical items and more than 250 million ratings, collected over a decade. It is characterized by three unique features. First, rated items are multi-typed, including tracks, albums, artists and genres. Second, items are arranged within a four-level taxonomy, which proves effective in coping with a severe sparsity problem that originates from the unusually large number of items (compared to, e.g., movie ratings datasets). Finally, fine-resolution timestamps associated with the ratings enable a comprehensive temporal and session analysis. We further present a matrix factorization model exploiting the special characteristics of this dataset. In particular, the model incorporates a rich bias model with terms that capture information from the taxonomy of items and different temporal dynamics of music ratings. To gain additional insights into its properties, we organized the KddCup-2011 competition around this dataset. As the competition drew thousands of participants, we expect the dataset to attract considerable research activity in the future.
KW - Yahoo! music
KW - collaborative filtering
KW - matrix factorization
KW - recommender systems
UR - http://www.scopus.com/inward/record.url?scp=82555183093&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/2043932.2043964
DO - https://doi.org/10.1145/2043932.2043964
M3 - Conference contribution
SN - 9781450306836
T3 - RecSys'11 - Proceedings of the 5th ACM Conference on Recommender Systems
SP - 165
EP - 172
BT - RecSys'11 - Proceedings of the 5th ACM Conference on Recommender Systems
Y2 - 23 October 2011 through 27 October 2011
ER -