Machine Learning Simplified


Machine learning is a subset of artificial intelligence: the science of getting computers to act without being explicitly programmed. Under the hood, much of it is statistics. Machine learning is used to find patterns in data, which you can then use to make predictions. It can be subdivided into supervised learning, unsupervised learning, or some mixture of both.

A computer program is said to learn from experience ‘E’ with respect to some class of tasks ‘T’ and performance measure ‘P’, if its performance at tasks in ‘T’, as measured by ‘P’, improves with experience ‘E’.
— Tom Mitchell

[Figure: How Machine Learning Works]

What makes machine learning so amazing?

The great thing about machine learning is the “magic” component to it. Machine learning can find patterns and make many successful predictions, such as a person’s movie preferences or the expected price of a house. It can do many useful things, like automatically tagging faces, diagnosing cancer, driving cars, detecting fraud, personalizing marketing, and recognizing voices. There is a great book on machine learning called “Introduction to Machine Learning with Python” if you really want to dig into this subject.

[Figure: Introduction to Machine Learning with Python]

1. Diagnosis With Machine Learning:

[Figure: Lung tissue and cancerous cells]

Diagnostic errors contribute to about ten percent (10%) of patient deaths, according to the Institute of Medicine at the National Academies of Sciences, Engineering, and Medicine. The following are some of the causes of diagnostic errors:

  1. A lack of communication between patients, their families, and clinicians.
  2. Failing to make the best use of collaboration and failing to integrate health information technologies (health IT).
  3. A healthcare work system that, by design, does not adequately support the diagnostic process.

To address these errors, many researchers and companies are using machine learning to improve medical diagnostics. Below are some current applications of artificial intelligence (AI) and machine learning.

  • Oncology: Researchers are using deep learning (part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms) to train algorithms to identify cancerous tissue as well as or better than trained physicians. (Readers with a specific focus on cancer treatments may be interested in reading a full article on deep learning in oncology.)
  • Chatbots: AI chatbots that can recognize speech are used to find patterns in patient symptoms, form a potential diagnosis, and recommend an appropriate course of action or help prevent disease. Chatbots can be used to help mental health patients, for example.
  • Pathology: Pathology is the medical specialty that is concerned with the diagnosis of disease based on the laboratory analysis of bodily fluids such as blood and urine, as well as tissues. Machine vision and other machine learning technologies can enhance the efforts traditionally left only to pathologists with microscopes.
  • Rare Diseases: Facial recognition software is being combined with machine learning to help clinicians diagnose rare diseases. Patient photos are analyzed using facial analysis and deep learning to detect phenotypes that correlate with rare genetic diseases.

The diagnostics market size is projected to reach $76 billion by 2023. National health expenditures are expected to have reached $3.4 trillion in 2016, and the health share of the GDP is projected to reach nearly 20 percent by 2025.

If you would like to read the full article on Diagnosis with Machine Learning, you can find it here.

Eighty-three percent of businesses in North America conduct manual reviews, and on average they manually review twenty-nine percent of orders, according to the Fraud Benchmark Report by CyberSource. Computers are much faster, and often more accurate, than humans at processing extremely large sets of data. They can recognize patterns in a user’s purchasing habits so well that a break in the pattern can be an indication of fraud. By applying machine learning to transaction data, computers can flag likely fraud across a large volume of transactions. This is another reason why we use machine learning algorithms: to help prevent fraud.


Where is machine learning being applied today?


Machine learning is being used everywhere today. Some of the most notable companies using it are Facebook, Google, Amazon, Apple, Netflix, and Fitbit.

Your personal assistant — that’s right, the likes of Siri and Google Now use machine learning, largely to better understand speech patterns. With so many people using Siri, the system is able to seriously advance in how it treats languages, accents, and so on.

Netflix uses machine learning to predict which movie or video you would like to watch so that it can give you better suggestions. It learns from the videos you have watched in the past, or from people with demographics similar to yours, and based on that data predicts which videos you would like next: perhaps something in a similar genre, or something that shares much of its cast with videos you have already watched. Netflix also uses machine learning to choose the image, thumbnail, or artwork displayed with its recommendations. For example, Netflix may personalize the image used to depict the movie Good Will Hunting based on how much a member prefers different genres and themes. Someone who has watched a lot of romantic movies may be interested in Good Will Hunting if the artwork shows Matt Damon and Minnie Driver, whereas someone who has watched a lot of comedies might be drawn to the movie if the artwork shows Robin Williams, a well-known comedian.


Amazon makes good use of machine learning by suggesting products you may like based on your past purchases, which also allows for more targeted marketing. If Amazon sees that you bought Harry Potter and the Sorcerer’s Stone, it may suggest Harry Potter and the Chamber of Secrets, because the data shows that you are likely to buy that second book.

Fitbit trackers are now ubiquitous in the market and capture data from millions of individuals. According to an employee named Raj, Fitbit is leveraging machine learning to provide smart guidance as part of a personalized experience. Fitbit integrated Fitstar data with Fitbit device data, so whether users have a proclivity toward cycling, running, using the elliptical, or hiking, Fitbit automatically tracks those preferences and uses them to generate a custom workout for the user.

Supervised Learning (Labeled Data)

The Right Answer Is Given

[Figure: Supervised learning]

Supervised learning is a type of machine learning where a model (a function) is created from labeled data. There is a training set that contains inputs and the desired outputs. In this type of learning, the correct outcome for each data point is explicitly labeled when training the model, which means that the learning algorithm is given the answer along with the data. Supervised learning has two main tasks: classification and regression.


Classification produces a discrete-valued output: it assigns a label. For example, is this a horse or a tiger? Another, probably more practical, example is predicting whether a tumor is malignant (harmful) or benign (not harmful).

Support Vector Machines (SVM)

A support vector machine (SVM) is a supervised machine learning algorithm that analyzes data for classification and regression analysis. In its basic form, an SVM looks at data and sorts each point into one of two categories. It does this by finding the boundary that separates the two categories with the margin between them as wide as possible. SVMs are used in text categorization, image classification, handwriting recognition, and in the sciences.
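To make the "widest margin" idea a little more concrete, here is a minimal sketch of a linear SVM trained with sub-gradient descent on the hinge loss. It is only an illustration: the toy data, the function names (train_linear_svm, predict_side), and the hyperparameters (lam, lr, epochs) are all assumptions made for this example, and a real project would normally use a library implementation instead.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=500):
    """Fit weights w and bias b of a two-class linear SVM; labels are +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # The point is misclassified or inside the margin, so the
                # hinge loss is active: pull the boundary toward classifying it.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regularizer acts; shrinking w widens the margin.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict_side(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable toy data with two features per point.
points = [(-1.0, -1.0), (-1.5, -1.0), (-1.0, -2.0),
          (1.0, 1.0), (2.0, 1.0), (1.0, 1.5)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict_side(w, b, x) for x in points]
```

The two categories here are encoded as -1 and +1, which is a common convention for SVMs but not something the article itself specifies.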



Regression predicts a continuous numerical value. In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables. Examples include estimating the value of a house based on its size in square feet, or predicting a person’s annual income based on years of higher education. Many regression algorithms are in use, such as linear regression and logistic regression.

Linear Regression


Linear regression fits a linear model between two variables: a dependent variable ‘y’ and an independent variable ‘x’. Does f(x) = mx + b look familiar? That’s not only the slope-intercept equation but also the equation for a linear model, where f(x) is the estimated dependent variable, a.k.a. the prediction for a given value of the independent variable ‘x’. Linear regression is a common type of predictive analysis.

[Figure: A Linear Model]

The overall idea of regression is to examine two things:
(1) Does a set of predictor variables do a good job in predicting an outcome (dependent) variable?

(2) Which variables in particular are significant predictors of the outcome variable, and in what way do they impact it, as indicated by the magnitude and sign of the beta estimates?
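As a small illustration of the linear model f(x) = mx + b, the sketch below fits m and b with ordinary least squares and then computes R squared as one rough answer to question (1). The house sizes and prices are made-up numbers, and the function names (fit_line, r_squared, predict_price) are just names chosen for this example.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def r_squared(xs, ys, m, b):
    """Fraction of the variance in ys explained by the line (1.0 is a perfect fit)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: house size in square feet and price in thousands of dollars.
# These points lie exactly on the line price = 0.14 * size + 60.
sizes = [1000, 1500, 2000, 2500, 3000]
prices = [200, 270, 340, 410, 480]

m, b = fit_line(sizes, prices)
fit_quality = r_squared(sizes, prices, m, b)


def predict_price(size):
    """f(x) = mx + b: the estimated price for a given size."""
    return m * size + b
```

Here the sign and magnitude of m play the role of the beta estimate in question (2): a positive m means price goes up with size, and its value says by how much per square foot.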

Logistic Regression

Logistic regression is a technique used in machine learning and is similar to linear regression in that it is a form of predictive analysis. In statistics, the logistic model is a widely used statistical model that, in its basic form, uses a logistic function (the S-shaped sigmoid curve, not a logarithm) to model a binary dependent variable. Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval, or ratio-level independent variables.

[Figure: Logistic Regression Model]

Logistic regression is emphatically not a classification algorithm on its own. It is only a classification algorithm in combination with a decision rule that makes dichotomous the predicted probabilities of the outcome. Logistic regression is a regression model because it estimates the probability of class membership as a (transformation of a) multi-linear function of the features.




Unlike linear regression, which outputs continuous values, logistic regression passes its output through the sigmoid function to return a probability, which can then be mapped to two or more discrete classes (classification). The sigmoid function maps any real value to a value between 0 and 1 and has a characteristic “S”-shaped curve. The term “sigmoid function” often refers to the special case of the logistic function.

[Figure: Sigmoid function, a special case of the logistic function: s(z) = 1 / (1 + e^(-z))]
  • s(z) = output between 0 and 1 (probability estimate)
  • z = input to the function (your algorithm’s prediction, e.g. mx + b)

One of the nice properties of logistic regression is that the sigmoid function outputs the conditional probability of the prediction, that is, the class probability. How does it work? Start with the odds (sometimes loosely called the “odds ratio”), p / (1 - p): the probability that a positive event occurs divided by the probability that it does not occur, where “positive” refers to the event that we want to predict, i.e., p = p(y=1 | x). Logistic regression models the logarithm of the odds (the logit) as a linear function of the inputs, z = log(p / (1 - p)). Solving for p gives p = 1 / (1 + e^(-z)), which is exactly the sigmoid function.
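The relationship between the odds and the sigmoid function can be checked numerically. The short sketch below, using only Python's math module, computes the odds and the log of the odds for a probability and then shows that the sigmoid function maps the result back to the original probability. The function names and the value p = 0.8 are picked purely for illustration.

```python
import math


def odds(p):
    """Probability the event happens divided by the probability it does not."""
    return p / (1 - p)


def logit(p):
    """Natural log of the odds; this is the quantity modeled as mx + b."""
    return math.log(odds(p))


def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    return 1 / (1 + math.exp(-z))


p = 0.8                  # hypothetical probability of the positive class
z = logit(p)             # log of odds 0.8 / 0.2
recovered = sigmoid(z)   # applying the sigmoid undoes the logit
```

In other words, the sigmoid is the inverse of the logit, which is why a linear prediction z can be turned into a probability.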

Our current prediction function (Sigmoid) returns a probability score between 0 and 1. In order to map this to a discrete class (true/false, cat/dog), we select a threshold value or tipping point above which we will classify values into class 1 and below which we classify values into class 0.



For example, if our threshold was .5 and our prediction function returned .7, we would classify this observation as positive. If our prediction was .2 we would classify the observation as negative. For logistic regression with multiple classes we could select the class with the highest predicted probability.
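The decision rule described above can be sketched in a few lines. This is only an illustration: the weights m and b are hypothetical, the function names are made up for this example, and treating a probability exactly equal to the threshold as class 1 is an assumption, since the text does not say what happens at the tipping point itself.

```python
import math


def sigmoid(z):
    """Map a raw prediction z to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


def classify(probability, threshold=0.5):
    """Return class 1 at or above the threshold and class 0 below it."""
    return 1 if probability >= threshold else 0


def predict_class(x, m, b, threshold=0.5):
    """Linear prediction mx + b, squashed by the sigmoid, then thresholded."""
    return classify(sigmoid(m * x + b), threshold)


def pick_class(class_probabilities):
    """For multiple classes, choose the label with the highest probability."""
    return max(class_probabilities, key=class_probabilities.get)


# The two cases from the text: 0.7 is classified positive, 0.2 negative.
positive = classify(0.7)
negative = classify(0.2)

# Hypothetical multi-class probabilities.
best = pick_class({"cat": 0.2, "dog": 0.7, "sheep": 0.1})
```

Moving the threshold up or down trades false positives against false negatives, which is why it is kept as a parameter here rather than hard-coded.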

Types of logistic regression

  • Binary (Pass/Fail)
  • Multi (Cats, Dogs, Sheep)
  • Ordinal (Low, Medium, High)

Decision Tree

A decision tree is a decision support tool that uses a tree-like model of decisions. A decision tree can be used to visually and explicitly represent decisions and decision making. Think of it in programming terms as a bunch of if and else statements. However the tree is created from the data and decision boundaries on that data as opposed to being explicitly programmed.
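To show what "created from the data" can mean, here is a minimal sketch of the smallest possible decision tree, a single split often called a decision stump. It learns one threshold from labeled examples and then predicts with a plain if/else. The tumor sizes and labels are invented for illustration, the function names are hypothetical, and the sketch assumes that larger values point to the positive class.

```python
def learn_stump(values, labels):
    """Find the threshold on one numeric feature that misclassifies the fewest points."""
    distinct = sorted(set(values))
    # Candidate thresholds sit halfway between neighboring values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    best_threshold = None
    best_errors = len(values) + 1
    for threshold in candidates:
        errors = sum((v > threshold) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict_malignant(size, threshold):
    """The learned tree, written as the if/else it boils down to."""
    if size > threshold:
        return True
    else:
        return False


# Hypothetical training data: tumor size in cm and whether it was malignant.
sizes = [1.0, 1.5, 2.0, 3.5, 4.0, 5.0]
malignant = [False, False, False, True, True, True]

threshold = learn_stump(sizes, malignant)
```

A full decision tree repeats this kind of search recursively on each side of the split, and over several features, but the if/else structure stays the same.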


Unsupervised Learning (Unlabeled data)

Here is the data set. Can you find some structure?

[Figure: Unsupervised learning]


Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.

k-means clustering

K-Means is a very well known clustering algorithm. It’s taught in a lot of introductory data science and machine learning classes. It’s easy to understand and implement in code!

  1. To begin, we first select a number of classes/groups to use and randomly initialize their respective center points. To figure out the number of classes to use, it’s good to take a quick look at the data and try to identify any distinct groupings. The center points are vectors of the same length as each data point vector and are the “X’s” in the graphic above.
  2. Each data point is classified by computing the distance between that point and each group center, and then classifying the point to be in the group whose center is closest to it.
  3. Based on these classified points, we recompute the group center by taking the mean of all the vectors in the group.
  4. Repeat these steps for a set number of iterations or until the group centers don’t change much between iterations. You can also opt to randomly initialize the group centers a few times, and then select the run that looks like it provided the best results.

K-Means has the advantage that it’s pretty fast, since all we’re really doing is computing the distances between points and group centers. Each iteration takes time linear in the number of data points n, so for a fixed number of groups and iterations the overall complexity is O(n).
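The four steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the toy points are made up, the function names are hypothetical, and picking the initial centers from among the data points is just one common way to do the random initialization.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial group centers.
    centers = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Step 2: assign each point to the group whose center is closest.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            groups[nearest].append(p)
        # Step 3: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [mean_point(g) if g else c for g, c in zip(groups, centers)]
        # Step 4: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Hypothetical toy data: two well-separated blobs of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]

centers, groups = k_means(points, k=2)
```

Step 2 is the part that dominates the running time: every point is compared against every center, which is where the linear dependence on the number of points comes from.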


Artificial neural networks are modeled loosely on the human brain and nervous system. An artificial neural network is an interconnected group of nodes, akin to the vast network of neurons in a brain. In the usual diagram, each circular node represents an artificial neuron and each arrow represents a connection from the output of one artificial neuron to the input of another. Neural networks can be used for classification, regression, and clustering.

[Figure: An artificial neural network. Image from Wikipedia]


A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers.
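To picture how values flow along those connections, here is a minimal sketch of a forward pass through a small network with two layers between the input and the output. The weights are hand-picked, hypothetical numbers rather than trained values, the sigmoid activation is an assumption made for this example, and the function names are illustrative only.

```python
import math


def sigmoid(z):
    """Squash a neuron's weighted input into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute one layer; each row of weights holds one neuron's incoming connections."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the inputs through every layer in order and return the final outputs."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 neurons -> 2 neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.6, 0.9], [-0.5, 0.2, 0.7]], [0.05, -0.05]),
    ([[1.2, -0.7]], [0.0]),
]

output = network_forward([1.0, 2.0], layers)
```

Training a network means adjusting these weights and biases from data, which this sketch deliberately leaves out.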


Where Can I Learn More?

Here are some of the best free introductory courses on the “interwebs”.

Free online YouTube Videos for CS231n Winter 2016.

Machine Learning Crash Course By Google

Andrew Ng Machine Learning Course on Coursera

Foundations of Data Science

Prediction: Machine Learning

Thanks for reading this article; I hope it’s helpful to you all! If you enjoyed this article and found it helpful, please leave some claps to show your appreciation. Keep up the learning, and if you would like more mathematics, computer science, programming, and algorithm analysis videos, please visit and subscribe to my YouTube channels (randerson112358 & compsci112358).

