Item-item collaborative filtering explained

Item-item collaborative filtering, also called item-based or item-to-item collaborative filtering, is a form of collaborative filtering for recommender systems based on the similarity between items, calculated from people's ratings of those items. Item-item collaborative filtering was invented and used by Amazon.com in 1998.[1] [2] It was first published at an academic conference in 2001.[3]

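Earlier collaborative filtering systems based on rating similarity between users (known as user-user collaborative filtering) had several problems: they performed poorly when there were many items but comparatively few ratings, computing similarities between all pairs of users was expensive, and user profiles changed quickly, so the entire system model had to be recomputed.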

Item-item models resolve these problems in systems that have more users than items. Item-item models use rating distributions per item, not per user. With more users than items, each item tends to have more ratings than each user, so an item's average rating usually doesn't change quickly. This leads to more stable rating distributions in the model, so the model doesn't have to be rebuilt as often. When users consume and then rate an item, that item's similar items are picked from the existing system model and added to the user's recommendations.

Method

First, the system executes a model-building stage by finding the similarity between all pairs of items. This similarity function can take many forms, such as correlation between ratings or cosine of those rating vectors. As in user-user systems, similarity functions can use normalized ratings (correcting, for instance, for each user's average rating).
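The model-building stage can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular system: the `ratings` layout (`{user: {item: rating}}`) and the function names `mean_center` and `item_similarities` are assumptions made for this example, cosine is used as the similarity function, and missing ratings are treated as zero.

```python
import math
from itertools import combinations


def mean_center(ratings):
    """Subtract each user's average rating (one possible normalization)."""
    centered = {}
    for user, user_ratings in ratings.items():
        if not user_ratings:
            continue  # a user with no ratings contributes nothing
        mean = sum(user_ratings.values()) / len(user_ratings)
        centered[user] = {item: r - mean for item, r in user_ratings.items()}
    return centered


def item_similarities(ratings):
    """Map every ordered pair of distinct items to the cosine of their rating vectors."""
    # Invert the data so that each item has a vector of ratings keyed by user.
    by_item = {}
    for user, user_ratings in ratings.items():
        for item, value in user_ratings.items():
            by_item.setdefault(item, {})[user] = value

    sims = {}
    for a, b in combinations(sorted(by_item), 2):
        ra, rb = by_item[a], by_item[b]
        # Missing ratings count as 0, so only users who rated both items
        # contribute to the dot product.
        dot = sum(ra[u] * rb[u] for u in ra.keys() & rb.keys())
        norm_a = math.sqrt(sum(v * v for v in ra.values()))
        norm_b = math.sqrt(sum(v * v for v in rb.values()))
        sim = dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
        sims[(a, b)] = sims[(b, a)] = sim  # store both orders for easy lookup
    return sims
```

Calling `item_similarities(mean_center(ratings))` instead of `item_similarities(ratings)` would apply the per-user normalization mentioned above before the similarities are computed.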

Second, the system executes a recommendation stage. It uses the items most similar to those a user has already rated to generate a list of recommendations. Usually this calculation is a weighted sum or linear regression. This form of recommendation is analogous to "people who rate item X highly, like you, also tend to rate item Y highly, and you haven't rated item Y yet, so you should try it".
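The weighted-sum variant of the recommendation stage might look like the sketch below. The function name `recommend`, the `top_n` parameter, and the layout of `sims` (a symmetric dictionary keyed by item pairs) are assumptions for this illustration. Each unrated candidate is scored as the similarity-weighted average of the user's own ratings, and linear regression is not shown.

```python
def recommend(user_ratings, sims, top_n=3):
    """Rank unrated items by a similarity-weighted average of the user's ratings.

    user_ratings: {item: rating} for one user.
    sims: {(item_a, item_b): similarity}, assumed to contain both orders of each pair.
    """
    # Candidates are items similar to something the user rated but not yet rated.
    candidates = {b for (a, b) in sims if a in user_ratings and b not in user_ratings}

    scores = {}
    for cand in candidates:
        num = den = 0.0
        for item, rating in user_ratings.items():
            s = sims.get((cand, item), 0.0)  # unknown pairs count as dissimilar
            num += s * rating
            den += abs(s)
        if den > 0:
            scores[cand] = num / den  # weighted sum, normalized by total similarity

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

A production system would typically restrict the inner loop to the k most similar neighbours of each candidate. That refinement is omitted here to keep the sketch short.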

Results

Item-item collaborative filtering had less error than user-user collaborative filtering. In addition, its less-dynamic model was computed less often and stored in a smaller matrix, so item-item system performance was better than user-user systems.

Example

Consider the following matrix:

User-Article Matrix

| User   | Article 1   | Article 2 | Article 3   |
|--------|-------------|-----------|-------------|
| John   | Bought it   | Bought it | Did not buy |
| Pierre | Bought it   | Bought it | Bought it   |
| Mary   | Did not buy | Bought it | Did not buy |

If a user is interested in Article 1, which other article will a system using Amazon's item-to-item algorithm suggest to them?

The goal is to suggest the article whose purchase vector has the highest cosine similarity to that of Article 1. This is how we do it:

Firstly, we convert the User-Article matrix into a binary one and treat each article's column as a vector.

User-Article Matrix (Binary)

| User   | Article 1 | Article 2 | Article 3 |
|--------|-----------|-----------|-----------|
| John   | 1         | 1         | 0         |
| Pierre | 1         | 1         | 1         |
| Mary   | 0         | 1         | 0         |

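Secondly, we compute the dot product of the Article 1 vector A1 = (1, 1, 0) with each of the other article vectors, A2 = (1, 1, 1) and A3 = (0, 1, 0):

\[ A_1 \cdot A_2 = 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 1 = 2 \]

\[ A_1 \cdot A_3 = 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 0 = 1 \]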

Thirdly, we find the norm of each vector.

\[ \|A_1\| = \sqrt{1^2 + 1^2 + 0^2} = \sqrt{2} \approx 1.4142 \]

\[ \|A_2\| = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3} \approx 1.7321 \]

\[ \|A_3\| = \sqrt{0^2 + 1^2 + 0^2} = \sqrt{1} = 1 \]

Fourthly, we calculate the cosine.

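\[ \cos(A_1, A_2) = \frac{A_1 \cdot A_2}{\|A_1\| \, \|A_2\|} = \frac{2}{\sqrt{2} \cdot \sqrt{3}} = \frac{\sqrt{6}}{3} \approx 0.8165 \]

\[ \cos(A_1, A_3) = \frac{A_1 \cdot A_3}{\|A_1\| \, \|A_3\|} = \frac{1}{\sqrt{2} \cdot 1} = \frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2} \approx 0.7071 \]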

Conclusion: if a user is interested in Article 1, the item-to-item algorithm will suggest Article 2, because its cosine similarity to Article 1 (0.8165) is higher than that of Article 3 (0.7071).
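The arithmetic of this example can be checked with a short Python script. It is only a sketch of the worked example above: the variable and function names (`purchases`, `column`, `cosine`) are chosen for illustration, and the data is the binary User-Article matrix.

```python
import math

# Binary User-Article matrix from the example (1 = bought, 0 = did not buy).
purchases = {
    "John":   {"Article 1": 1, "Article 2": 1, "Article 3": 0},
    "Pierre": {"Article 1": 1, "Article 2": 1, "Article 3": 1},
    "Mary":   {"Article 1": 0, "Article 2": 1, "Article 3": 0},
}


def column(article):
    """Return the article's purchase vector, one entry per user."""
    return [row[article] for row in purchases.values()]


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


a1 = column("Article 1")
scores = {art: cosine(a1, column(art)) for art in ("Article 2", "Article 3")}
best = max(scores, key=scores.get)

print(scores)  # Article 2 ≈ 0.8165, Article 3 ≈ 0.7071
print(best)    # Article 2
```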

Notes and References

  1. Collaborative recommendations using item-to-item similarity mappings.
  2. Linden, G.; Smith, B.; York, J. (22 January 2003). "Amazon.com recommendations: item-to-item collaborative filtering". IEEE Internet Computing. 7 (1): 76–80. doi:10.1109/MIC.2003.1167344. ISSN 1089-7801. S2CID 14604122.
  3. Sarwar, Badrul; Karypis, George; Konstan, Joseph; Riedl, John (2001). "Item-based collaborative filtering recommendation algorithms". Proceedings of the 10th International Conference on World Wide Web. ACM. pp. 285–295. doi:10.1145/371920.372071. ISBN 978-1-58113-348-6. CiteSeerX 10.1.1.167.7612. S2CID 8047550.