TY - GEN
T1 - Identifying web queries with question intent
AU - Tsur, Gilad
AU - Pinter, Yuval
AU - Szpektor, Idan
AU - Carmel, David
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Vertical selection is the task of predicting relevant verticals for a Web query so as to enrich the Web search results with complementary vertical results. We investigate a novel variant of this task, where the goal is to detect queries with a question intent. Specifically, we address queries for which the user would like an answer with a human touch. We call these CQA-intent queries, since answers to them are typically found in community question answering (CQA) sites. A typical approach in vertical selection is using a vertical's specific language model of relevant queries and computing the query-likelihood for each vertical as a selective criterion. This works quite well for many domains like Shopping, Local and Travel. Yet, we claim that queries with CQA intent are harder to distinguish by modeling content alone, since they cover many different topics. We propose to also take the structure of queries into consideration, reasoning that queries with question intent have quite a different structure than other queries. We present a supervised classification scheme, random forest over word-clusters for variable length texts, which can model the query structure. Our experiments show that it substantially improves classification performance in the CQA-intent selection task compared to content-oriented based classification, especially as query length grows.
AB - Vertical selection is the task of predicting relevant verticals for a Web query so as to enrich the Web search results with complementary vertical results. We investigate a novel variant of this task, where the goal is to detect queries with a question intent. Specifically, we address queries for which the user would like an answer with a human touch. We call these CQA-intent queries, since answers to them are typically found in community question answering (CQA) sites. A typical approach in vertical selection is using a vertical's specific language model of relevant queries and computing the query-likelihood for each vertical as a selective criterion. This works quite well for many domains like Shopping, Local and Travel. Yet, we claim that queries with CQA intent are harder to distinguish by modeling content alone, since they cover many different topics. We propose to also take the structure of queries into consideration, reasoning that queries with question intent have quite a different structure than other queries. We present a supervised classification scheme, random forest over word-clusters for variable length texts, which can model the query structure. Our experiments show that it substantially improves classification performance in the CQA-intent selection task compared to content-oriented based classification, especially as query length grows.
KW - Question intent
KW - Vertical selection
UR - http://www.scopus.com/inward/record.url?scp=84980354923&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/2872427.2883058
DO - https://doi.org/10.1145/2872427.2883058
M3 - Conference contribution
T3 - 25th International World Wide Web Conference, WWW 2016
SP - 783
EP - 793
BT - 25th International World Wide Web Conference, WWW 2016
T2 - 25th International World Wide Web Conference, WWW 2016
Y2 - 11 April 2016 through 15 April 2016
ER -