Build A Bitcoin Price Prediction Program Using Machine Learning And Python
Disclaimer: The material in this article is purely educational and should not be taken as professional investment advice. Invest at your own discretion.
It is extremely hard to try and predict the direction of the stock market and stock price, but in this article I will give it a try. Even people with a good understanding of statistics and probabilities have a hard time doing this. So, please keep this in mind while reading through this article.
In this article I will show you how to build your own Python program to predict the price of Bitcoin (BTC) using a machine learning technique called Support Vector Machine, so you can start trading and making money! Actually, this program is really simple and I doubt any major profit will be made from it, but it may be slightly better than guessing!
In the program we will use Support Vector Regression (SVR), which is a type of Support Vector Machine (SVM). SVR is a supervised learning algorithm used for regression analysis. This version of SVM for regression was proposed in 1996 by Christopher J. C. Burges, Vladimir N. Vapnik, Harris Drucker, Alexander J. Smola and Linda Kaufman. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that lies close to the model prediction (within a threshold called epsilon).
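To make the "subset of the training data" point concrete, here is a minimal sketch on made-up data (the noisy straight line, the parameter values and all variable names are assumptions chosen for illustration, not part of this article's Bitcoin program). It fits an SVR and counts how many training points end up as support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic data (an assumption for illustration): a noisy straight line
rng = np.random.default_rng(0)
X_demo = np.linspace(0, 10, 200).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel() + rng.normal(scale=0.5, size=200)

# epsilon is the half-width of the tube around the prediction;
# training points inside the tube add nothing to the cost function
model = SVR(kernel='linear', C=1.0, epsilon=1.0)
model.fit(X_demo, y_demo)

# support_ holds the indices of the training points the model depends on
n_support = len(model.support_)
print("training samples:", len(X_demo))
print("support vectors: ", n_support)
```

Only the points on or outside the epsilon tube become support vectors, so the count printed on the second line should be well below 200.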
Support Vector Machine Pros:
- It is effective in high dimensional spaces.
- It works well with a clear margin of separation.
- It is effective in cases where the number of dimensions is greater than the number of samples.
Support Vector Machine Regression Cons:
- It does not perform well when we have a large data set.
- It performs poorly if the data set is noisy (contains a large amount of additional meaningless information).
Types Of Kernel:
- radial basis function (rbf)
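As a rough illustration of what the rbf kernel computes, the sketch below evaluates exp(-gamma * ||x - z||^2) by hand and compares it with scikit-learn's `rbf_kernel` helper. The two prices are made-up values; gamma is the value used for the model later in this article.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(x, z, gamma):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

# Two made-up one-feature samples (prices), for illustration only
a = np.array([[100.0]])
b = np.array([[400.0]])
gamma = 0.00001

manual = rbf(a[0], b[0], gamma)
library = rbf_kernel(a, b, gamma=gamma)[0, 0]

# Both values should agree: exp(-0.00001 * 300**2) = exp(-0.9)
print(manual, library)
```

A small gamma makes the kernel value fall off slowly with distance, so prices that are hundreds of dollars apart are still treated as fairly similar.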
What Is Bitcoin And Who Created It?
Bitcoin is a digital currency and payment system invented by an unknown person or group using the name Satoshi Nakamoto, who published the invention in 2008 and released it as open-source software in 2009. It is the first decentralized digital currency, meaning the system works without a single administrator or central bank. You can use it in every country, your account cannot be frozen, and there are no prerequisites or limits.
Bitcoins are transferred directly from person to person, also known as peer to peer. Transactions are verified by network nodes through cryptography and recorded in a public distributed ledger called the blockchain. Note that once a payment is made it can’t be reversed, and if you lose your wallet, you lose your bitcoins.
How Much Is One Bitcoin Worth?
Bitcoin is very volatile: the price of one bitcoin is liable to change rapidly and unpredictably. In January 2017, one bitcoin was equivalent to $985 USD. If you had invested $100 USD in 2010, your investment would have been worth about $72 million USD as of 12/21/2017.
If you would rather watch than read, you can check out the YouTube video below. It goes through everything in this article in a little more detail and will help you start programming your own machine learning model, even if you don’t have Python installed on your computer. You can also use the video and the article together as supplementary materials for learning about machine learning!
If you are also interested in reading more on machine learning to immediately get started with problems and examples then I strongly recommend you check out Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. It is a great book for helping beginners learn how to write machine learning programs, and understanding machine learning concepts.
I will start by stating what I want this program to do. I want this program to predict the price of Bitcoin 30 days into the future based on the current price.
#Description: This program predicts the price of Bitcoin for the next 30 days
Import the libraries.
import numpy as np
import pandas as pd
#Load the data
from google.colab import files # Use to load data on Google Colab
uploaded = files.upload() # Use to load data on Google Colab
Store the data set into a variable.
#Store the data into the variable df
df = pd.read_csv('BitcoinPrice.csv')
Remove the date column from the dataframe.
#Remove the Date column
df.drop(columns=['Date'], inplace=True)
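If you are not running on Google Colab, the loading and column-dropping steps can be tried on a small in-memory CSV. This is only a sketch: the three rows below are made-up values, and the column names Date and Price are assumed to match the layout of BitcoinPrice.csv.

```python
import io
import pandas as pd

# Made-up rows standing in for BitcoinPrice.csv (an assumption for illustration)
csv_text = """Date,Price
2019-01-01,3800.0
2019-01-02,3900.0
2019-01-03,3850.0
"""

# read_csv accepts a file path or any file-like object
demo_df = pd.read_csv(io.StringIO(csv_text))

# Drop the Date column by name so that only the Price feature remains
demo_df = demo_df.drop(columns=['Date'])
print(demo_df)
```

With a real file on disk, replacing `io.StringIO(csv_text)` with the file path gives the same result.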
Show the first 7 rows of the new data set.
#Show the first 7 rows of the new data set
print(df.head(7))
Create a variable for predicting ‘n’ days out into the future (‘n’ is an arbitrary integer; here it is 30), and a column called ‘Prediction’ that holds, for each row, the price of Bitcoin 30 days after that row’s price in the Price column.
#A variable for predicting 'n' days out into the future
prediction_days = 30 #n = 30 days
#Create another column (the target or dependent variable) shifted 'n' units up
df['Prediction'] = df[['Price']].shift(-prediction_days)
Show the first 7 rows of the new data set.
#Show the first 7 rows of the new data set
print(df.head(7))
Because the values were shifted up by 30 rows, the last 30 rows of the Prediction column don’t have any values (they are NaN). We can see this by printing the last 7 rows of the new data set.
#Show the last 7 rows of the new data set
print(df.tail(7))
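The effect of the shift is easier to see on a tiny table. The sketch below uses five made-up prices and n = 2 instead of 30; the names `toy` and `n` are only for illustration.

```python
import pandas as pd

# Five made-up prices (an assumption for illustration)
toy = pd.DataFrame({'Price': [10, 11, 12, 13, 14]})
n = 2

# shift(-n) moves every value up by n rows, so each row now holds
# the price n rows later; the last n rows have nothing to pull from
toy['Prediction'] = toy['Price'].shift(-n)
print(toy)
```

The first three rows of Prediction are 12, 13 and 14, and the last two rows are NaN, which is the same pattern the last 30 rows of the real data set show.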
Create the independent data set. This is the set that contains the features used to make the future predictions. First we will drop the prediction column and convert the dataframe to a numpy array, then we will remove the last ‘n’ rows from the data set. In this article that means we will remove the last 30 rows, since ‘n’ = prediction_days, which equals 30.
#CREATE THE INDEPENDENT DATA SET (X)
# Convert the dataframe to a numpy array and drop the prediction column
X = np.array(df.drop(columns=['Prediction']))
#Remove the last 'n' rows where 'n' is the prediction_days
X = X[:-prediction_days]
Create the dependent data set. This is the data set that contains our target, the data that we are trying to predict. We will accomplish this by getting all of the values from the prediction column of the dataframe and converting them to a numpy array.
Then we will keep all of the values from the created data set except for the last ‘n’ rows, which are NaN.
#CREATE THE DEPENDENT DATA SET (y)
# Convert the dataframe to a numpy array (All of the values including the NaN's)
y = np.array(df['Prediction'])
# Get all of the y values except the last 'n' rows
y = y[:-prediction_days]
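To check that X and y line up as intended, the two steps can be wrapped in a small helper and run on toy data. This is a sketch under assumptions: `make_xy` is a hypothetical helper name, and the five prices are made up.

```python
import numpy as np
import pandas as pd

def make_xy(frame, n):
    """Return features X (price today) and targets y (price n rows later).

    The last n rows are dropped from both arrays because their
    targets would be NaN after the shift.
    """
    X = frame[['Price']].to_numpy()[:-n]
    y = frame['Price'].shift(-n).to_numpy()[:-n]
    return X, y

# Five made-up prices (an assumption for illustration)
toy = pd.DataFrame({'Price': [10.0, 11.0, 12.0, 13.0, 14.0]})
X_toy, y_toy = make_xy(toy, n=2)

# Row i of X_toy is paired with the price two rows later
print(X_toy.ravel())  # 10, 11, 12
print(y_toy)          # 12, 13, 14
```

Each feature row is matched with the price ‘n’ rows ahead, and no NaN values remain in the target.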
Split the data into 80% training and 20% testing data sets.
# Split the data into 80% training and 20% testing
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
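The sketch below shows what the split produces on made-up arrays of 100 rows. It also shows `shuffle=False`, an option of `train_test_split` that keeps the rows in their original order. That chronological variant is not what this article uses; it is included only as an alternative some readers prefer for time-ordered data. All variable names here are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 rows, one feature (an assumption for illustration)
X_demo = np.arange(100, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() + 30

# Random 80/20 split, as in the article; random_state makes it repeatable
xa_train, xa_test, ya_train, ya_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42)

# Alternative: no shuffling, so the test set is the final 20% of the rows
xb_train, xb_test, yb_train, yb_test = train_test_split(
    X_demo, y_demo, test_size=0.2, shuffle=False)

print(len(xa_train), len(xa_test))  # 80 training rows, 20 testing rows
print(xb_test.ravel()[:3])          # first test rows of the unshuffled split
```

With the default shuffled split, test rows are scattered throughout the price history, so the model is tested on days that sit between days it was trained on.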
Create a variable called prediction_days_array and set it equal to the last 30 rows of the original data set from the price column.
# Set prediction_days_array equal to the last 30 rows of the original data set from the price column
prediction_days_array = np.array(df.drop(columns=['Prediction']))[-prediction_days:]
print(prediction_days_array)
Create the Support Vector Regression model using the radial basis function (rbf), and train the model.
from sklearn.svm import SVR
# Create and train the Support Vector Machine
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.00001) #Create the model
svr_rbf.fit(x_train, y_train) #Train the model
Test the model’s accuracy on the testing data set.
# Testing Model: score returns the coefficient of determination (R^2) of the prediction.
# The best possible score is 1.0
svr_rbf_confidence = svr_rbf.score(x_test, y_test)
print("svr_rbf R^2 score: ", svr_rbf_confidence)
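For a regressor such as SVR, scikit-learn's `score` method returns the coefficient of determination, R^2 = 1 - SS_res / SS_tot, not a classification-style accuracy. The sketch below computes it by hand on made-up data and checks it against `score`. The noisy sine wave, the SVR parameters and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import r2_score

# Made-up data: a noisy sine wave (an assumption for illustration)
rng = np.random.default_rng(1)
X_demo = rng.uniform(0, 10, size=(120, 1))
y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.1, size=120)

# Train on the first 90 rows, evaluate on the remaining 30
model = SVR(kernel='rbf', C=10.0, gamma=0.5).fit(X_demo[:90], y_demo[:90])
X_eval, y_eval = X_demo[90:], y_demo[90:]

# R^2 by hand: 1 - (residual sum of squares / total sum of squares)
pred = model.predict(X_eval)
ss_res = np.sum((y_eval - pred) ** 2)
ss_tot = np.sum((y_eval - y_eval.mean()) ** 2)
manual_r2 = 1.0 - ss_res / ss_tot

score = model.score(X_eval, y_eval)
print(manual_r2, score)
```

A score of 1.0 means the predictions match the actual values exactly, 0.0 is what a model that always predicts the mean would get, and the score can be negative for a model that does worse than that.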
Print the model’s predicted values for the test data set and the actual values of the price of Bitcoin.
# Print the predicted values
svm_prediction = svr_rbf.predict(x_test)
print(svm_prediction)
#Print the actual values
print(y_test)
Now, print the model’s Bitcoin price predictions for the next 30 days.
# Print the model predictions for the next 'n=30' days
svm_prediction = svr_rbf.predict(prediction_days_array)
print(svm_prediction)
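Print the actual price of Bitcoin for the last 30 days in the data set, which are the inputs the predictions above were made from.
#Print the actual price for the last 'n' rows, n=prediction_days=30
print(df.tail(prediction_days))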
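For readers who want to run all of the steps in one go without the CSV file, here is a consolidated sketch. It swaps the real data for a made-up random-walk price series, so the printed numbers say nothing about Bitcoin. The series, the seeds and the names ending in `_demo` are assumptions for illustration; the model settings are the ones used in this article.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Made-up random-walk prices standing in for BitcoinPrice.csv
rng = np.random.default_rng(7)
prices = 5000 + np.cumsum(rng.normal(scale=50, size=400))
df_demo = pd.DataFrame({'Price': prices})

# Target column: the price 'prediction_days' rows later
prediction_days = 30
df_demo['Prediction'] = df_demo['Price'].shift(-prediction_days)

# Features and targets without the last 'n' rows (their targets are NaN)
X_all = df_demo[['Price']].to_numpy()
X_demo = X_all[:-prediction_days]
y_demo = df_demo['Prediction'].to_numpy()[:-prediction_days]

# The last 'n' prices are the inputs for the future predictions
last_rows = X_all[-prediction_days:]

# 80/20 split, then train the rbf model with the article's settings
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0)
model = SVR(kernel='rbf', C=1e3, gamma=0.00001).fit(x_tr, y_tr)

print("R^2 on the test split:", model.score(x_te, y_te))

# One predicted price for each of the last 'n' rows
forecast = model.predict(last_rows)
print(forecast[:5])
```

Replacing the random-walk series with the Price column loaded from the CSV file should reproduce the article's flow.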
Conclusion and Resources
That is it, you are done creating your program to predict the price of Bitcoin! Thanks for reading this article; I hope it’s helpful to you all! If you enjoyed this article and found it helpful, please leave some claps to show your appreciation. Keep up the learning, and if you like machine learning, mathematics, computer science, programming or algorithm analysis, please visit and subscribe to my YouTube channels (randerson112358 & compsci112358).
Python is considered an easy high level language to learn. Considering it’s one of the fastest growing programming languages and a programming language many companies and computer science departments use, it is definitely a language you want to get familiar with. Get a comprehensive, in-depth introduction to the core Python language with this hands-on book. Based on author Mark Lutz’s popular training course, this updated fifth edition will help you quickly write efficient, high-quality code with Python. It’s an ideal way to begin, whether you’re new to programming or a professional developer versed in other languages.