- the input data is a large matrix with a row for each user and a column for each item
- entries in the matrix represent ratings or preference scores
- the model represents both users and items as vectors in a common low-dimensional space
- after training, unknown entries in the matrix can be predicted by taking the inner product of the corresponding user and item vectors
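The setup above can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained model: the factor matrices here are random, and the sizes and names are made up.

```python
import numpy as np

n_users, n_items, n_factors = 1000, 5000, 20  # illustrative sizes

# After training, every user and every item is a vector in the same
# low-dimensional latent space. Random stand-ins here.
rng = np.random.default_rng(0)
user_factors = rng.normal(size=(n_users, n_factors))
item_factors = rng.normal(size=(n_items, n_factors))

def predict(u, i):
    # The predicted rating for user u and item i is just the inner
    # product of their latent vectors.
    return user_factors[u] @ item_factors[i]
```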

If you want to make recommendations for a user, the final step lets you predict their rating for every item in the catalogue. You then simply return the few items with the highest predicted ratings.
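The brute-force version of this is a single matrix-vector product followed by a partial sort. A sketch, again with random stand-in factors and illustrative names:

```python
import numpy as np

rng = np.random.default_rng(1)
user_factors = rng.normal(size=(100, 20))   # stand-in trained factors
item_factors = rng.normal(size=(500, 20))

def recommend(u, n=5):
    # Predict this user's rating for every item at once...
    scores = item_factors @ user_factors[u]
    # ...then keep the n best. argpartition finds the top n in O(n_items)
    # without fully sorting all scores.
    top = np.argpartition(scores, -n)[-n:]
    # Order just those n by descending predicted rating.
    return top[np.argsort(scores[top])[::-1]]
```

Note that even with `argpartition`, the matrix-vector product still touches every item, which is exactly the cost the rest of this post is about avoiding.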

Did you notice that one small but significant detail has been "left as an exercise for the reader"? If your catalogue contains any significant number of items, predicting and sorting ratings for all of them will take rather a long time - far too long for most practical applications.

An elegant paper by Parikshit Ram and Alexander Gray presented at KDD this year offers a simple solution to this problem. Maximum Inner-Product Search using Tree Data-Structure shows how to construct and use a ball tree designed specifically to find vectors maximising the inner product with one or more queries. Ram and Gray start from the observation that this problem is similar to, but distinct from, finding nearest neighbours: the inner product is not a genuine distance metric, as it doesn't satisfy the triangle inequality.
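The ingredient that makes a ball tree work for this problem is a cheap upper bound on the inner product between a query and any point inside a ball: by Cauchy-Schwarz, for any point p within radius r of centre c, q·p = q·c + q·(p - c) ≤ q·c + r‖q‖. If that bound is below the best inner product found so far, the whole ball can be pruned. The toy check below verifies the bound on random data; it is my own sketch of the idea, not the paper's implementation, and the function names are mine.

```python
import numpy as np

def mip_upper_bound(q, centre, radius):
    # No point within `radius` of `centre` can have an inner product
    # with q larger than this (Cauchy-Schwarz).
    return q @ centre + radius * np.linalg.norm(q)

rng = np.random.default_rng(2)

# Build a "ball": some points, their centroid, and the enclosing radius.
points = rng.normal(size=(200, 10))
centre = points.mean(axis=0)
radius = np.linalg.norm(points - centre, axis=1).max()

q = rng.normal(size=10)

# The true maximum inner product over the ball never exceeds the bound,
# so a tree search can discard the entire ball in one comparison.
assert (points @ q).max() <= mip_upper_bound(q, centre, radius) + 1e-9
```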

I've written a very simple Python implementation of their proposed MIP Tree, which you can find on GitHub. It's not intended as production code, but it still shows speedups of several orders of magnitude when I've tried it on real latent factor datasets. Feel free to have a play with it to generate some recommendations - quite efficiently!