Friday, July 23, 2010

[From SIGIR 2010]: Best Paper on Value of Search Trails in Web Logs

We search for information daily; in fact, it would not be wrong to say that search has become an integral part of our life on the Web. However, searching for a particular piece of information involves a complex range of interactions that varies across users and queries. These interactions have recently attracted the attention of researchers at Microsoft working on the Bing search engine, and they were the theme of the best paper at SIGIR 2010, "Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs."

The paper itself is very interesting, and once again it seems that future Information Retrieval research will draw heavily on Human-Computer Interaction, as was also evident from the keynote talk at SIGIR 2010.

What happens when you enter a keyword on Google, Yahoo or Bing? A list of Web pages is returned, ranked by relevance, which for a long time was computed with the renowned PageRank and is now computed with variants of it. What do you do with these results? You follow different links one after another and finally settle on the page that you find most satisfying for your query. The entire set of pages followed is referred to as a search trail by White of Microsoft Research and Huang of the University of Washington, and in this research they studied the value that users derive from this entire activity through a log-based analysis.

The researchers collected logs of URL visits from users who opted to provide this data through a widely distributed browser toolbar; the data was collected over a three-month period from March 2009 to May 2009. Formally, a search trail is defined as a temporally ordered sequence of URLs beginning with a search query and ending with either: (1) another query, (2) a period of inactivity of 30 or more minutes or (3) termination of the browser instance or tab; the figure explains this more clearly:



In the figure, the circle represents a query along with its search engine result page, rectangles represent web pages that the user navigates to from the result page, double vertical lines represent backtracking to an earlier state, and a back arrow shows that the user has requested a page seen earlier in the search trail. The figure shows a typical search trail: query Q1 initiates the trail, and the user navigates from the results page to page P2, then to page P3, and from P3 to page P4. Page P4 does not satisfy the user, so he returns to page P3 (which is why P3 carries the double vertical lines) and finally navigates to page P5. In this context, page P2 is the origin page and P5 is the destination page.
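To make the definition concrete, here is a minimal Python sketch of how a log might be split into search trails using the three termination rules above, and how the origin and destination pages could be read off each trail. The `Event` record, its field names, and both function names are my own assumptions for illustration; they are not taken from the paper or from any real toolbar log format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

INACTIVITY_LIMIT_S = 30 * 60  # rule (2): 30 or more minutes of inactivity


@dataclass
class Event:
    """One hypothetical log record for a single user."""
    timestamp: float          # seconds since some reference point
    url: str
    is_query: bool = False    # True when the URL is a search engine result page
    closes_tab: bool = False  # True when the browser instance/tab ends after this visit


def split_into_trails(events: List[Event]) -> List[List[Event]]:
    """Split a temporally ordered event list into search trails."""
    trails: List[List[Event]] = []
    current: List[Event] = []
    for ev in events:
        if current:
            gap = ev.timestamp - current[-1].timestamp
            # Rule (1): another query; rule (2): inactivity of 30+ minutes.
            if ev.is_query or gap >= INACTIVITY_LIMIT_S:
                trails.append(current)
                current = []
        if ev.is_query:
            current = [ev]          # every trail begins with a query
        elif current:
            current.append(ev)
        # Page visits that do not follow a query are ignored in this sketch.
        if current and ev.closes_tab:
            trails.append(current)  # rule (3): browser instance or tab terminated
            current = []
    if current:
        trails.append(current)
    return trails


def origin_and_destination(trail: List[Event]) -> Tuple[Optional[str], Optional[str]]:
    """Return (origin, destination): first and last page visited after the query."""
    pages = [ev.url for ev in trail[1:]]
    if not pages:
        return None, None
    return pages[0], pages[-1]


# The trail from the figure, followed by a long pause and a second query.
log = [
    Event(0, "Q1", is_query=True),
    Event(10, "P2"), Event(40, "P3"), Event(70, "P4"),
    Event(90, "P3"),                 # user backtracks from P4 to P3
    Event(120, "P5"),
    Event(1920, "P9"),               # 30 minutes after P5: first trail has ended
    Event(2000, "Q2", is_query=True),
    Event(2010, "P6", closes_tab=True),
    Event(2020, "P7"),
]
trails = split_into_trails(log)
first_origin, first_destination = origin_and_destination(trails[0])
```

For the first trail this gives P2 as the origin and P5 as the destination, matching the figure. A real log would of course need per-user and per-tab grouping before this step, which the sketch leaves out.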

Currently, search engines provide only the origin page in their results. This research aims to measure the value derived from following links, so that in the future search engines may offer more refined results: for example, showing full trails directly on the results page, or query-specific and user-specific search results. The findings showed that following search trails provides users with significant additional benefit in terms of coverage, diversity, novelty and utility. There is a lot of value in the trail, and hence we may in future see recommendation pages in Bing that integrate recommendation systems with the search engine.
