Classifying the user intent of web queries using k‐means clustering

Ashish Kathuria (Department of Electrical Engineering, The Pennsylvania State University, University Park, Pennsylvania, USA)
Bernard J. Jansen (College of Information Sciences and Technology, The Pennsylvania State University, University Park, Pennsylvania, USA)
Carolyn Hafernik (College of Information Sciences and Technology, The Pennsylvania State University, University Park, Pennsylvania, USA)
Amanda Spink (Faculty of Information Technology, Queensland University of Technology, Brisbane, Australia)

Internet Research

ISSN: 1066-2243

Article publication date: 19 October 2010

Abstract

Purpose

Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries.

Design/methodology/approach

For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k‐means clustering approach based on a variety of query traits.
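As a rough illustration of this kind of approach, the sketch below clusters a handful of queries with a plain k‐means (Lloyd's algorithm) loop. It is not the authors' implementation: the three query traits (term count, presence of a URL-like token, presence of a transactional term), the `TRANSACTIONAL_TERMS` list, and the function names are all assumptions made for the example, since the abstract says only that "a variety of query traits" were used.

```python
import random

# Hypothetical term list, for illustration only; not taken from the paper.
TRANSACTIONAL_TERMS = {"buy", "download", "order", "purchase"}


def query_features(query):
    """Map a query string to a small numeric vector of assumed traits."""
    terms = query.lower().split()
    has_url = any("." in t or t.startswith("www") for t in terms)
    has_transactional = any(t in TRANSACTIONAL_TERMS for t in terms)
    return [float(len(terms)), float(has_url), float(has_transactional)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    queries = [
        "how do volcanoes erupt",
        "history of the roman empire",
        "www.example.com",
        "example.org login",
        "buy cheap laptop",
        "download free music player",
    ]
    vectors = [query_features(q) for q in queries]
    centroids, labels = kmeans(vectors, k=3)
    for query, label in zip(queries, labels):
        print(label, query)
```

The cluster numbers this produces carry no meaning on their own. Deciding whether a cluster is primarily informational, navigational, or transactional is a separate interpretive step applied to each cluster's contents.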

Findings

The research findings show that more than 75 percent of web queries are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters: six primarily informational, one primarily transactional, and one primarily navigational.

Research limitations/implications

This study makes an important contribution to the web search literature by providing information about the goals of searchers and a method for automatically classifying the intent of user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs.

Practical implications

The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results by implementing a real-time classification method such as the one presented in this research.
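One way to picture real-time classification is that, once clusters have been learned offline, a new query is simply assigned to the nearest stored centroid. The sketch below shows that idea only. The centroid values, the three-element feature layout (term count, URL-like token flag, transactional term flag), and the names `HYPOTHETICAL_CENTROIDS` and `classify` are invented for illustration and do not come from the paper.

```python
# Made-up centroids, one per intent, in an assumed feature space of
# [term count, has URL-like token, has transactional term].
HYPOTHETICAL_CENTROIDS = {
    "informational": [3.5, 0.0, 0.0],
    "navigational": [1.2, 1.0, 0.0],
    "transactional": [2.8, 0.0, 1.0],
}


def classify(features, centroids=HYPOTHETICAL_CENTROIDS):
    """Return the label of the centroid closest to the feature vector."""
    best_label = None
    best_distance = float("inf")
    for label, centroid in centroids.items():
        # Squared Euclidean distance is enough for ranking centroids.
        distance = sum((a - b) ** 2 for a, b in zip(features, centroid))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


if __name__ == "__main__":
    print(classify([1.0, 1.0, 0.0]))
    print(classify([3.0, 0.0, 1.0]))
    print(classify([4.0, 0.0, 0.0]))
```

Because assignment needs only one distance computation per centroid, a lookup like this is cheap enough to run on each incoming query, provided the feature extraction is equally light.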

Originality/value

This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay‐off for web search engines can be considerable.

Citation

Kathuria, A., Jansen, B.J., Hafernik, C. and Spink, A. (2010), "Classifying the user intent of web queries using k‐means clustering", Internet Research, Vol. 20 No. 5, pp. 563-581. https://doi.org/10.1108/10662241011084112

Publisher

Emerald Group Publishing Limited

Copyright © 2010, Emerald Group Publishing Limited
