Some researchers in the field might be surprised to hear that the predominant industry view is that predicting ratings for items isn't a very useful way to build or evaluate a recommender system. This is partly because, as Palash Nandy from Google put it at the opening panel, users "don't get ratings" and "always rate 1 or 5", and partly because improving the prediction of users' ratings is at best weakly aligned with the business aims of most systems.

In fact, all of the industry people I spoke to were suspicious of offline evaluation metrics for recommendations in general. The YouTube team had even compared standard offline evaluations of two recommendation algorithms with the results of online controlled experiments on their live system, and found no correlation between the two: the offline metrics gave no indication of which algorithm would do better in practice. Finally, there was a strong concern in industry that the design of the Netflix Prize competition had led many researchers to develop methods that could not possibly scale to real-world scenarios; most of the industry people I spoke to described scalability as their number one consideration.
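To make that offline/online comparison concrete, here is a minimal sketch of how one might check whether an offline metric ranks candidate algorithms the same way that live A/B tests do, using Spearman rank correlation. All the algorithm names and numbers below are hypothetical, purely for illustration; nothing here reflects YouTube's actual experiments.

```python
# Sketch: does an offline metric rank algorithms the same way online tests do?
# All scores below are made up for illustration.
from scipy.stats import spearmanr

# Offline scores (e.g. precision@10 on held-out interaction logs).
offline = {"algoA": 0.31, "algoB": 0.29, "algoC": 0.35, "algoD": 0.27, "algoE": 0.33}
# Online lift (e.g. relative CTR gain over a baseline) from A/B tests.
online = {"algoA": 0.005, "algoB": 0.040, "algoC": 0.025, "algoD": 0.018, "algoE": 0.010}

algos = sorted(offline)
rho, p = spearmanr([offline[a] for a in algos], [online[a] for a in algos])
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
# A rho near zero is the situation described above: the offline metric
# tells you essentially nothing about which algorithm wins online.
```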
There were two strong beliefs that I heard stated many times over the last few days: improvements in the overall user experience matter more than any likely enhancement of current algorithms; and recommendations from friends remain far more powerful than those from machines. So the sort of research that will excite industry in the near future is likely to focus on topics such as:
- providing more compelling explanations for machine recommendations
- exploiting social relationships to make better recommendations
- optimising the user's interaction with particular recommenders
One final thing that might interest researchers concerned with any of these issues: pretty much everyone I spoke to said that they were hiring.