Math Behind Content Based Recommendation System.

Math Behind Content Based Recommendation System.

Concept Behind Content based Recommendation system:

Firstly I would like to give the intuition regarding the content based recommendation system, Like how it Works, in real practices and later We’ll jump into the mathematical part behind it!

Assuming, we have User 1, who had saw movie 1(Action) rated it 5/5, movie 2(Romance) rated it 4/5, and movie 3(Action) rated it 5/5 respectively.

Now, If User 2, watches the Movie 6 (Action) rates it 5/5, and Movie 7(Romance) rates it 5/5, So, the Content based recommendation system will most probably recommend the Action Movie 1 or Action Movie 3 for User 2, based on the ratings and the type of movie, with which both the Users are Related.

In-short, these algorithms try to recommend items that are similar to those that a user liked in the past, or is examining in the present.

Well, this is how Content Based Recommendation system works in a Nutshell, but It is also very important to understand the math behind every Algorithm, so lets Dive into the Math behind this algorithm.

Math behind the Algorithm:

So, lets start with a simple example, assuming the following data,

So the question is, How can we Recommend the Unknown rating of the Users!?

Based the above data, we can see that. Movie 1, Movie 2 and Movie 3 tend to be more Action based Movies, while Movie 4 and Movie 5, tend to be Romantic ones, Also we can conclude that, User 1 and User 2 prefer Action movies over Romantic ones!, and Vice-Versa for the User 3, and User 4 respectively.


N(U)=No. of Users=4,

N(M)=No. of Movies=5, and

N(Features)=2 i.e.(Action and Romance)

So let’s Consider, Movie 1, Assuming the X-intercept value as X(0)=1,and Considering Feature Values, We can write, Feature Vector for Movie 1, as Vector of Matrix(3,1) as [1 0.9 0], Similarly we’ll have Feature Vectors for Movie 2,3,4 and 5.

Now, For Each user “j”, learns a parameter Theta(j)==Real Number^(3) i.e.(Feature(2)+1), So by using

(Theta^(j))^(T)*x^(i), we can find the Rating of Movie(i), using The Parameter Vector Theta(i) for each User, where (i), is no. of user.

Let’s understand this using Better presentation of the concepts,

Math Behind Content Based Recommendation System.

So, based on These Assumed values, for parameter vector for user 1, lets predict rating for Movie 3,

Based on the calculation given below!

So, based on the Calculations, we saw, that the predicted rating for Movie 3, by User 1, would be approximately 4.5 stars!

But, here, We “Assumed” the Parameter vector for the User 1, so How can we get the Actual Parameter Vector, not only for One User, but for the every User We have!

Before that, Lets talk about some terms, which we need to understand first:

Math Behind Content Based Recommendation System.

Now, Let’s see, How can we predict the Actual Value of the Parameter Vector of each User(Which we assumed earlier).

So, To learn Parameter Vector( Theta^(j)), this can be considered as a Linear Regression problem, so that the Actual Parameter Vectors and the Predicted Parameter Vector Values are Close as possible, as the Loss Function,

We can formally Represent this as,

Just like Linear Regression Models, We want to choose that specific Parameter Vector, to minimize the Mean Squared Error term, Now, We can also add the Regularization term, to avoid the Over-fitting,

After Minimizing this term, We can get a pretty Good Estimate of a Parameter Vector of User ‘j’s’ Movie Rating.

So, In general, we can represent the equation as,

Math Behind Content Based Recommendation System.

We wrote the Equation, considering a Single User,

Now, Lets Consider the Equation, for the Multiple Users, We can modify our above Equation, just by adding a Summation, Considering “n” number of Users present, which can be represented as,


We can Represent this Whole Equation i.e. Loss Function, as our “J for (Theta(1),Theta(2),….,Theta(n)”, and this, can be Multiplied with the Appropriate Activation function, and Subtracted from the Predicted Parameter Vector, in order to Minimize the difference between the Actual and the Predicted Parameter Vector.

Hence, We’ll Get better predicted parameter vector for each User, which we’ll use to predict the Rating of the Movies respectively, and this is how, Content Based Recommendation System Work!

I hope you have enjoyed reading. Please be kind enough to like it and you can comment on any of your doubts and queries and we will reply to them at the earliest. Do lookout for more learning in our reading list and subscribe to TryCatchBlog website to learn more.


Happy learning!