TY - GEN
T1 - On composition of a federated web search result page
T2 - 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
AU - Kumar Ponnuswami, Ashok
AU - Pattabiraman, Kumaresh
AU - Wu, Qiang
AU - Gilad-Bachrach, Ran
AU - Kanungo, Tapas
PY - 2011
Y1 - 2011
N2 - Modern web search engines are federated: a user query is sent to numerous specialized search engines, called verticals, such as Web (text documents), News, Image, and Video, and the results returned by these engines are then aggregated, composed into a search result page (SERP), and presented to the user. For a specific query, multiple verticals could be relevant, which makes the placement of these vertical results within blocks of textual web results challenging: how do we represent, assess, and compare the relevance of these heterogeneous entities? In this paper we present a machine-learning framework for SERP composition in the presence of multiple relevant verticals. First, instead of using the traditional label generation method of human judgment guidelines and trained judges, we use a randomized online auditioning system that allows us to evaluate triples of the form <query, web block, vertical>. We use a pairwise click preference to evaluate whether the web block or the vertical block had better user engagement. Next, we use a hinged feature vector that contains features from the web block to create a common reference frame and augment it with features representing the specific vertical judged by the user. A gradient boosted decision tree is then learned from the training data. For the final composition of the SERP, we place a vertical result at a slot if its score is higher than a computed threshold. The thresholds are algorithmically determined to guarantee specific coverage for verticals at each slot. We use correlation of clicks as our offline metric and show that the click-preference target has a better correlation than models based on human judgments. Furthermore, in online tests for the News and Image verticals, we show higher user engagement for both head and tail queries.
AB - Modern web search engines are federated: a user query is sent to numerous specialized search engines, called verticals, such as Web (text documents), News, Image, and Video, and the results returned by these engines are then aggregated, composed into a search result page (SERP), and presented to the user. For a specific query, multiple verticals could be relevant, which makes the placement of these vertical results within blocks of textual web results challenging: how do we represent, assess, and compare the relevance of these heterogeneous entities? In this paper we present a machine-learning framework for SERP composition in the presence of multiple relevant verticals. First, instead of using the traditional label generation method of human judgment guidelines and trained judges, we use a randomized online auditioning system that allows us to evaluate triples of the form <query, web block, vertical>. We use a pairwise click preference to evaluate whether the web block or the vertical block had better user engagement. Next, we use a hinged feature vector that contains features from the web block to create a common reference frame and augment it with features representing the specific vertical judged by the user. A gradient boosted decision tree is then learned from the training data. For the final composition of the SERP, we place a vertical result at a slot if its score is higher than a computed threshold. The thresholds are algorithmically determined to guarantee specific coverage for verticals at each slot. We use correlation of clicks as our offline metric and show that the click-preference target has a better correlation than models based on human judgments. Furthermore, in online tests for the News and Image verticals, we show higher user engagement for both head and tail queries.
KW - Federated web search
KW - Heterogeneous verticals
KW - Machine learning
KW - Pairwise preference from clicks
KW - Randomized flights
UR - http://www.scopus.com/inward/record.url?scp=79952424525&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/1935826.1935922
DO - https://doi.org/10.1145/1935826.1935922
M3 - Conference contribution
SN - 9781450304931
T3 - Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
SP - 715
EP - 724
BT - Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
Y2 - 9 February 2011 through 12 February 2011
ER -