NBA Data Analysis Using Python & Machine Learning

randerson112358
9 min read · Jun 30, 2019

Explore NBA Basketball Data Using KMeans Clustering

In this article I will show you how to explore data and use the unsupervised machine learning algorithm called KMeans to cluster (group) NBA players. The code will explore the NBA players from the 2013–2014 basketball season and use KMeans to group them into clusters that show which players are most similar.

K-Means is one of the most popular “clustering” algorithms. K-means stores ‘k’ centroids that it uses to define clusters. A point is considered to be in a particular cluster if it is closer to that cluster’s centroid than any other centroid. K-Means finds the best centroids by alternating between (1) assigning data points to clusters based on the current centroids (2) choosing centroids (points which are the center of a cluster) based on the current assignment of data points to clusters. (Chris Piech[1])
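The two alternating steps in the quote above can be sketched in plain Python. This is a minimal illustration, not the code used later in the article: the function names, the toy points, and the choice to start from k randomly sampled data points are all assumptions made for the example.

```python
import math
import random

def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_clusters(points, centroids):
    """Step 1: label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]

def update_centroids(points, labels, centroids):
    """Step 2: move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centroids.append(old)  # empty cluster: leave its centroid in place
            continue
        dims = len(members[0])
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return new_centroids

def kmeans(points, k, max_iters=100, seed=0):
    """Alternate steps 1 and 2 until the assignments stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k data points
    labels = None
    for _ in range(max_iters):
        new_labels = assign_clusters(points, centroids)
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return centroids, labels

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

On this toy data the first three points end up sharing one label and the last three the other. Which group gets label 0 depends on the random starting centroids.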

KMeans Graph where k=3 clusters

The KMeans algorithm will sort the items into k groups by similarity. To measure that similarity, we will use the Euclidean distance: the smaller the distance between two items, the more similar they are.
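To make the distance measure concrete, here is a small sketch that computes the Euclidean distance between made-up stat lines. The three players and their numbers are hypothetical, chosen only to illustrate the formula, and are not taken from the NBA data set.

```python
import math

def euclidean_distance(a, b):
    """Square root of the summed squared differences between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical per-game stat lines: [points, assists, rebounds]
player_a = [25.0, 5.0, 7.0]
player_b = [22.0, 9.0, 7.0]
player_c = [8.0, 2.0, 11.0]

dist_ab = euclidean_distance(player_a, player_b)
dist_ac = euclidean_distance(player_a, player_c)

print(dist_ab)
print(dist_ac)
```

Here player_a is closer to player_b than to player_c, so by this measure the first two players are the more similar pair. In practice, columns on very different scales are often standardized first so that no single statistic dominates the distance.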

The K-Means algorithm works as follows:

  1. First we initialize k points, called means…
