Build A Book Recommendation System Using Python & Machine Learning

Build a Book Recommender Using the Python Programming Language

In this article, I will show you how to create your own book recommendation system using the Python programming language and machine learning. The purpose of this book recommendation engine, or “book recommender,” is to recommend interesting books that the user may like.

A recommendation system, sometimes called a recommender engine, is one of the most powerful marketing tools in today’s digital world. Big companies from Netflix to Amazon use these recommendation engines to suggest products and services to their customers.

Speaking of recommendation systems, I recommend you read the Harry Potter series by J.K. Rowling if you enjoy fantasy. It is the highest-selling series of novels ever (as of Feb. 10, 2021). I want to emphasize again: these novels are the highest-selling series of novels EVER. The books were so popular that they inspired 8 movies and even the creation of the Wizarding World of Harry Potter sections at Universal Studios theme parks. The novels have sold over 500 million copies! So, if you are interested in fantasy, then I RECOMMEND you check out the Harry Potter series. Okay, now let’s talk more about recommendation engines.

Harry Potter Series

About Recommendation Engines

A recommendation engine, also known as a recommender system, is software that analyzes available data to make suggestions for something that a user might be interested in.

A recommendation engine can recommend products other than books. For example, it can suggest movies, t-shirts, or any other product based on things like what similar customers bought. On Amazon’s e-commerce website, the “Customers who viewed this item also viewed” and “Customers who bought this item also bought” lists are populated by a recommendation engine.

There are basically four types of recommendation engines:

  1. Content based recommendation engine
  2. Collaborative filtering based recommendation engine
  3. Popularity based recommendation engine
  4. Hybrid recommendation engine

Content based recommendation engine: A content based recommendation engine (the type that we will use in this article) takes the content or attributes of a product you like, for example a novel’s genre, author, title, or publisher, and then ranks other products by how similar they are to the liked product. In this case, we rank books by how similar they are to the liked book using something called similarity scores.
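
To make the idea of a similarity score concrete, here is a minimal sketch that uses only the Python standard library. It counts the words in two attribute strings and computes the cosine similarity between the two count vectors. The function name `cosine_similarity_text` and the sample strings are made up for illustration; they are not taken from the data set used later in this article.

```python
import math
from collections import Counter


def cosine_similarity_text(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two strings."""
    counts_a = Counter(a.lower().split())
    counts_b = Counter(b.lower().split())
    # Dot product over the words that appear in both strings
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty string is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical attribute strings (title + author + publisher)
liked = "the dragon tower jane doe acme press"
similar = "the dragon crown jane doe acme press"
unrelated = "cooking basics john roe kitchen books"

print(cosine_similarity_text(liked, similar))    # close to 1: many shared words
print(cosine_similarity_text(liked, unrelated))  # 0.0: no shared words
```

A score near 1 means the two books share most of their attribute words, and a score of 0 means they share none.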

Collaborative filtering based recommendation engine: Collaborative filtering is a family of algorithms that tries to find similar users based on similar preferences, actions, and activities. It then looks at the items one user likes and recommends them to a similar user. Take, for example, user A, who is similar to user B. We know they are similar because they both like the same video games, comic books, etc. If user A bought a book, then user B may also be interested in reading that same book, so this type of recommendation engine would recommend that book to user B.

Collaborative filtering based recommendation engine example
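
The user A and user B example can be sketched in a few lines of plain Python. This is only a toy illustration under simple assumptions: each user is a set of liked items, similarity is measured with Jaccard overlap (one of several possible measures), and all user and item names are invented.

```python
def jaccard(a: set, b: set) -> float:
    """Share of items two users have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_similar_user(target: str, likes: dict) -> list:
    """Recommend items liked by the user most similar to `target`."""
    others = [user for user in likes if user != target]
    most_similar = max(others, key=lambda u: jaccard(likes[target], likes[u]))
    # Suggest only what the target user does not already have
    return sorted(likes[most_similar] - likes[target])


# Hypothetical users and the items they like
likes = {
    "user_a": {"game_1", "comic_1", "book_1"},
    "user_b": {"game_1", "comic_1"},
    "user_c": {"movie_9"},
}

print(recommend_from_similar_user("user_b", likes))  # ['book_1']
```

User B overlaps with user A on the game and the comic, so the book that only user A has is the one that gets recommended.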

Popularity based recommendation engine: A popularity based recommendation engine ranks products or items by how popular they are. For example, it would take the view counts for each book and then list the books from the highest view count to the lowest. The Netflix and YouTube trending lists use this type of recommendation engine, or at least a similar one. It is also considered one of the simplest recommendation engines to implement.

Image of YouTube trends popular recommendation
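
Ranking by view count really is only a sort. A minimal sketch, with invented titles and view counts and a hypothetical helper named `rank_by_popularity`:

```python
def rank_by_popularity(view_counts: dict, top_n=None) -> list:
    """Return titles ordered from the highest view count to the lowest."""
    ranked = sorted(view_counts, key=view_counts.get, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical view counts per book
view_counts = {"Book A": 120, "Book B": 950, "Book C": 430}

print(rank_by_popularity(view_counts))           # ['Book B', 'Book C', 'Book A']
print(rank_by_popularity(view_counts, top_n=1))  # ['Book B']
```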

Hybrid recommendation engine: A hybrid recommendation system combines two or more types of recommendation systems, and research suggests it can be more effective than using the engines separately. It is likely that Google uses this type of recommendation engine to find similar movies.

Hybrid based recommendation engine example
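
One simple way to combine two engines is a weighted blend of their scores. The sketch below assumes both engines produce scores between 0 and 1 for each item. The function name `hybrid_scores`, the weight, and the sample scores are all illustrative, and this is not a description of how any particular company does it.

```python
def hybrid_scores(content: dict, popularity: dict, weight: float = 0.7) -> dict:
    """Blend two score dictionaries; `weight` is the share given to content scores."""
    items = content.keys() | popularity.keys()
    return {
        # An item missing from one engine contributes 0.0 from that engine
        item: weight * content.get(item, 0.0)
        + (1 - weight) * popularity.get(item, 0.0)
        for item in items
    }


# Hypothetical scores from a content based and a popularity based engine
content = {"Book A": 1.0, "Book B": 0.2}
popularity = {"Book A": 0.0, "Book B": 1.0}

blended = hybrid_scores(content, popularity, weight=0.5)
best = max(blended, key=blended.get)
print(best)  # Book B
```

With equal weights, Book B wins because its popularity makes up for its lower content score. Raising `weight` shifts the ranking back toward content similarity.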

In this article we will be creating a content based recommendation engine using Python and machine learning.

Check out the video below. It goes through the program step by step and will help make it easy for you to start programming your own book recommendation system, even if you don’t have Python installed on your computer.

How I Programmed The Book Recommendation System:

The idea of this recommendation system is to get the title of a book that the user likes, and then find similar books in the data set that the user may also like based on some criteria.

The first thing that I needed to do to create a book recommendation system was to gather a lot of data about books. The data set used contains over 10,000 rows of book data.

Showing 4 rows of data from the data set and all of the columns.

Next, I had to decide on what criteria or which columns I thought would be helpful when determining suggested books that the user may like. I decided to use only 3 columns from the data set: title, authors, and publisher.

#Create a list of columns to keep that are important
columns = ['title', 'authors', 'publisher']

Once those important columns were identified, I created a function to concatenate the data from all three columns together and then used that function to store the data (combined features) into a new column.

#Create a new column with the combined features
df['combined_features'] = combine_features(df)
A sample of the new data set with the new column
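
The body of `combine_features` is not shown above, so here is one way it could be written with pandas. Treat it as a sketch: joining the three columns with spaces and replacing missing values with empty strings are my assumptions, and the tiny data frame is a stand-in for the real data set.

```python
import pandas as pd

# The columns chosen earlier in the article
columns = ['title', 'authors', 'publisher']


def combine_features(data: pd.DataFrame) -> pd.Series:
    """Join the chosen columns of every row into a single string."""
    # Replace missing values so they do not end up as 'nan' in the text
    text = data[columns].fillna('').astype(str)
    return text.apply(lambda row: ' '.join(row), axis=1)


# Hypothetical stand-in for the book data set
df = pd.DataFrame({
    'title': ['Book One', 'Book Two'],
    'authors': ['Ann Author', 'Bob Writer'],
    'publisher': ['Acme Press', None],
})

#Create a new column with the combined features
df['combined_features'] = combine_features(df)
print(df['combined_features'][0])  # Book One Ann Author Acme Press
```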

Then I used the cosine similarity function to get the similarity scores of each book. A value of 1 lets me know that the book at that row has a 100% similarity score to the book at that column.

#Get the cosine similarity matrix from the count matrix
#This will give us an n x n matrix of similarity scores for each book (row of data) to every other book in the data set (the columns), including itself.
cs = cosine_similarity(cm)
#Print the similarity score
The matrix of cosine similarity scores
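
The snippet above uses a count matrix `cm` without showing how it was built. A common way to get one, and the one assumed in this sketch, is scikit-learn's `CountVectorizer`, which turns each combined-features string into a row of word counts. The three sample strings are invented.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical combined features (title + authors + publisher) for three books
combined_features = [
    "the dragon tower jane doe acme press",
    "the dragon crown jane doe acme press",
    "cooking basics john roe kitchen books",
]

# Convert the text into a matrix of word counts (one row per book)
cm = CountVectorizer().fit_transform(combined_features)

# Compare every row with every other row
cs = cosine_similarity(cm)

print(cs.shape)  # (3, 3)
print(cs[0])     # scores of the first book against all three books
```

The diagonal of `cs` is 1 because every book is identical to itself. The first two books share most of their words and score high against each other, while the cookbook scores 0 against both.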

I then got the title of the book that the user likes.

#Get the title of the book the reader likes
Title = df['title'][1]
#Show the title
The title of the book that the user likes

Once I had the title of the book that the user liked, I was able to find the row that contained that book’s similarity scores to every other book in the data set. I printed those scores in descending order, from greatest to least, after removing the book’s score to itself from the similarity scores.

#Sort the list of similar books in descending order
sorted_scores = sorted(scores, key=lambda x: x[1], reverse=True)
#Drop the first entry, which is the book's similarity score to itself
sorted_scores = sorted_scores[1:]
#Show the sorted scores
Sample of the similarity scores to the book the user likes
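
The code above starts from a list called `scores` that is not shown being built. One plausible way to build it, assumed here, is to look up the row of the liked book and pair each score with its row number using `enumerate`. The sketch uses plain Python lists and made-up numbers in place of the real data frame and similarity matrix.

```python
# Hypothetical titles and a matching 4 x 4 similarity matrix
titles = ["Book One", "Book Two", "Book Three", "Book Four"]
cs = [
    [1.0, 0.2, 0.9, 0.5],
    [0.2, 1.0, 0.1, 0.4],
    [0.9, 0.1, 1.0, 0.3],
    [0.5, 0.4, 0.3, 1.0],
]

# The book the reader likes, and the row that holds its scores
title = titles[1]
book_id = titles.index(title)

# Pair every score with the row number of the book it belongs to
scores = list(enumerate(cs[book_id]))

# Sort from most to least similar, then drop the book's score to itself
sorted_scores = sorted(scores, key=lambda x: x[1], reverse=True)
sorted_scores = sorted_scores[1:]

print(sorted_scores)  # [(3, 0.4), (0, 0.2), (2, 0.1)]
```

Dropping the first entry relies on the liked book being at the top of the sorted list. If another row also scores exactly 1, filtering on the row number, for example `[s for s in scores if s[0] != book_id]`, is a safer way to exclude it.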

Finally, I took those similarity scores and found the book title associated with them to print the 5 most recommended books in the data set to the user. Remember the book that the user liked was ‘Harry Potter and the Order of the Phoenix (Harry Potter #5)’.
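
Turning the sorted scores back into titles is a lookup by row number. A small sketch of that last step, using a hypothetical helper named `top_titles` and invented titles and scores:

```python
def top_titles(sorted_scores: list, titles: list, n: int = 5) -> list:
    """Return the titles for the first `n` (row number, score) pairs."""
    return [titles[index] for index, _score in sorted_scores[:n]]


# Hypothetical titles and already sorted (row number, score) pairs
titles = ["Book One", "Book Two", "Book Three", "Book Four"]
sorted_scores = [(3, 0.4), (0, 0.2), (2, 0.1)]

print("The most recommended books are:")
for rank, name in enumerate(top_titles(sorted_scores, titles, n=2), start=1):
    print(rank, name)
```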

The recommendation engine seems to be working well. It is recommending other Harry Potter books to the user, but of course this program can be improved. For example, it looks like the system recommended the book ‘Harry Potter and the Chamber of Secrets (Harry Potter #2)’ twice, which tells me that this data set contains multiple rows with the same title. In the future, I can make this program better by eliminating the rows with the same title.
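
Removing rows with the same title could be done with pandas before building the combined features. This is a sketch of that possible improvement, not part of the original program, and the small data frame is a stand-in for the real data set.

```python
import pandas as pd

# Hypothetical data with one repeated title
df = pd.DataFrame({
    'title': ['Book One', 'Book Two', 'Book One'],
    'authors': ['Ann Author', 'Bob Writer', 'Ann Author'],
})

# Keep the first row for each title and renumber the rows from 0
deduped = df.drop_duplicates(subset='title', keep='first').reset_index(drop=True)

print(list(deduped['title']))  # ['Book One', 'Book Two']
```

Resetting the index matters here because the row numbers are later used to look up titles from positions in the similarity matrix.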

If you are interested in reading more about Python and machine learning, and want to get started right away with problems and examples, I recommend you read Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.

It is a great book for helping beginners learn to write machine-learning programs and understand machine-learning concepts.

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

Thanks for reading this article. I hope it’s helpful to you!