NBA Data Analysis Using Python & Machine Learning
Explore NBA Basketball Data Using KMeans Clustering
In this article I will show you how to explore data and use the unsupervised machine learning algorithm KMeans to cluster (group) NBA players. The code explores NBA players from the 2013–2014 basketball season and uses KMeans to group them into clusters that show which players are most similar.
K-Means is one of the most popular “clustering” algorithms. K-means stores ‘k’ centroids that it uses to define clusters. A point is considered to be in a particular cluster if it is closer to that cluster’s centroid than any other centroid. K-Means finds the best centroids by alternating between (1) assigning data points to clusters based on the current centroids (2) choosing centroids (points which are the center of a cluster) based on the current assignment of data points to clusters. — Chris Piech[1]
The KMeans algorithm partitions the items into k groups so that similar items land in the same group. We measure similarity with the Euclidean distance: the smaller the distance between two items, the more similar they are.
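To make the distance measure concrete, here is a minimal sketch of the Euclidean distance and of the rule that a point belongs to the cluster whose centroid is closest. The function names `euclidean_distance` and `assign_to_nearest`, the toy points, and the two centroids are all made up for illustration; they are not real NBA statistics and not part of the code used later in this article.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # distances[i, j] is the distance from point i to centroid j
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Made-up toy points (not real NBA statistics) and two hypothetical centroids
toy_points = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]]
toy_centroids = [[1.0, 1.5], [8.5, 9.0]]

labels = assign_to_nearest(toy_points, toy_centroids)
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
print(labels)                              # [0 0 1 1]
```

The first two toy points sit near the first centroid and the last two near the second, so they receive labels 0 and 1 respectively. With real player data, each point would be a row of numeric statistics instead of a pair of numbers.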
The K-Means algorithm works as follows:
- First we initialize k points, called means…