Can You Predict The Rise And Fall Of Stocks Using Machine Learning & Python?
Can you predict if a stock price will increase or decrease using Python and machine learning?
First, let me say that it is extremely hard to predict the stock market. Even people with a good understanding of statistics and probability have a hard time doing this. Wikipedia defines stock market prediction as "the act of trying to determine the future value of a company stock or other financial instrument traded on an exchange."
However, with all of that being said, if you were able to reliably predict the price of a stock, you could make a great deal of profit. I also want to make it very clear that I am not a financial advisor, so take this article with a grain of salt.
In this article, I will attempt to predict whether tomorrow’s closing price for a stock will be higher or lower than today’s closing price, using a machine learning algorithm called a Decision Tree. In other words, the model labels each trading day 1 if the next day’s close is higher and 0 otherwise.
Let’s take a look at some pros and cons of using a decision tree classifier.
Decision Tree Pros:
- Simple to understand and to interpret
- Requires little data preparation
Decision Tree Cons:
- Prone to over-fitting
- Decision trees can be unstable (a small variation in the data may result in a completely different tree being generated)
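The over-fitting point can be illustrated with a small sketch. It uses synthetic data whose labels are pure noise (an assumption made only for this illustration, not the stock data from this article), so there is nothing real to learn. An unrestricted tree still memorizes the training rows, while a tree with a capped `max_depth` cannot.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: random features with random labels (nothing to learn)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = rng.integers(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# An unrestricted tree keeps splitting until it memorizes the training rows
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Capping the depth limits how much noise the tree can memorize
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)

print("deep    train/test:", deep_tree.score(X_tr, y_tr), deep_tree.score(X_te, y_te))
print("shallow train/test:", shallow_tree.score(X_tr, y_tr), shallow_tree.score(X_te, y_te))
```

A large gap between the training score and the test score is the usual sign of over-fitting.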
If you are interested in reading more on machine learning and algorithmic trading then you might want to read Hands-On Machine Learning for Algorithmic Trading: Design and implement investment strategies based on smart algorithms that learn from data using Python. The book will show you how to implement machine learning algorithms to build, train, and validate algorithmic models. It will also show you how to create your own algorithmic design process to apply probabilistic machine learning approaches to trading decisions, and the book will show you how to develop neural networks for algorithmic trading to perform time series forecasting and smart analytics.
If you prefer not to read this article and would like a video version of it, you can check out the YouTube video below. It goes through everything in this article in a little more detail, and will help make it easy for you to start programming your own machine learning model even if you don’t have Python installed on your computer. Or you can use both as supplementary materials for learning about machine learning!
I will first start the program with a description about the program in comments.
# This program uses a machine learning algorithm called a Decision Tree to try to predict if the price of a stock will increase or decrease
Next, I will import the libraries that I plan on using throughout the program.
#Import the libraries
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import numpy as np
import pandas as pd
Now let’s load the data. Since I am using the website https://colab.research.google.com/, I need to use that site’s library to upload the data.
The data contains stock data for Amazon from 3/2/2020 to 5/1/2020.
#Load the data
from google.colab import files # Use to load data on Google Colab
uploaded = files.upload() # Use to load data on Google Colab
Store the data into a variable, set the date as the index, give the index a name, and show the data frame.
#Store the data into the df variable
df = pd.read_csv('AMZN.csv')
#Set the date as the index for the data
df = df.set_index(pd.DatetimeIndex(df['Date'].values))
#Give the index a name
df.index.name = 'Date'
#Show the dataframe
df
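Outside of Google Colab, the upload step is not needed, and the date handling can be done while reading the file. The sketch below is one possible way to do that. The CSV text is a made-up stand-in for AMZN.csv (the column layout and values are assumptions for illustration, not real prices), and `df_demo` is a hypothetical name.

```python
import io
import pandas as pd

# Placeholder CSV text standing in for AMZN.csv; the columns and values
# are made up for illustration and are not real prices
csv_text = """Date,Open,Close,Volume
2020-03-02,100.0,101.0,1000
2020-03-03,101.0,99.5,1200
"""

# parse_dates converts the Date strings to timestamps, and index_col
# makes that column the index in the same step
df_demo = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")
print(df_demo)
```

With a local file, the first argument would simply be the file path. Because Date becomes the index here, it no longer appears as a regular column.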
Time to manipulate the data set by creating the target column (Price_Up) and removing the date column.
#Manipulate the data set
#Create the target column
df['Price_Up'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0) # 1 if tomorrow's close is higher than today's close, else 0
#Remove the date column
remove_list = ['Date']
df = df.drop(columns=remove_list)
#Show the data
df
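To see what the target column looks like, here is a small sketch on five made-up closing prices (the `toy` frame and the `Next_Close` helper column are hypothetical names used only for this illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for five trading days
toy = pd.DataFrame({"Close": [10.0, 11.0, 10.5, 10.5, 12.0]})

# shift(-1) moves each next-day close up onto the current row
toy["Next_Close"] = toy["Close"].shift(-1)

# 1 if the next close is higher than the current close, else 0
toy["Price_Up"] = np.where(toy["Next_Close"] > toy["Close"], 1, 0)
print(toy)

# The last row has no next day (Next_Close is NaN), so its label of 0 is a
# placeholder rather than a real observation; one option is to drop it
toy = toy.iloc[:-1]
```

Note that a day where the price stays flat is also labeled 0, and that the final row always gets a 0 because there is no following day to compare against.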
Next, split the data set into a feature or independent data set (X) and a target or dependent data set (Y).
#Split the data set into a feature or independent data set (X) and a target or dependent data set (Y)
X = df.iloc[:, 0:df.shape[1]-1].values #Get all the rows and columns except for the target column
Y = df.iloc[:, df.shape[1]-1].values #Get all the rows from the target column
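As a sketch of the same split, the example below uses a tiny made-up frame (`demo` is a hypothetical name, and its values are placeholders) and compares selecting by position with selecting by column name. Both give the same arrays, but the name-based version does not depend on the target being the last column.

```python
import pandas as pd

# Hypothetical frame: feature columns first, target column last
demo = pd.DataFrame({
    "Open": [1.0, 2.0, 3.0],
    "Close": [1.5, 2.5, 2.0],
    "Price_Up": [1, 0, 0],
})

# Position-based selection: every column but the last, then the last column
X_pos = demo.iloc[:, 0:demo.shape[1] - 1].values
Y_pos = demo.iloc[:, demo.shape[1] - 1].values

# Name-based selection: drop the target to get features, pick it to get labels
X_named = demo.drop(columns=["Price_Up"]).values
Y_named = demo["Price_Up"].values

print(X_pos.shape, Y_pos.shape)
```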
Split the data again but this time into 80% training and 20% testing data sets.
#Split the data again but this time into 80% training and 20% testing data sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2)
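By default, train_test_split shuffles the rows at random, so the scores can change from run to run. The sketch below, on made-up arrays, shows two options that may be worth considering: passing `random_state` to make the shuffle repeatable, and passing `shuffle=False` to keep the rows in date order so the model is tested only on the most recent days. Neither is what this article does; they are shown as alternatives.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 rows in date order (values are placeholders)
X_demo = np.arange(10).reshape(-1, 1)
Y_demo = np.array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1])

# Shuffled split; random_state makes the same split come out on every run
X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.2, random_state=42
)

# Chronological split: first 80% of rows for training, last 20% for testing
X_tr_c, X_te_c, Y_tr_c, Y_te_c = train_test_split(
    X_demo, Y_demo, test_size=0.2, shuffle=False
)
print(X_te_c.ravel())  # the last two rows
```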
Create and train the model.
#Create and train the decision tree Classifier model
tree = DecisionTreeClassifier().fit(X_train, Y_train)
Check how well the model did on the training data set.
#Check how well the model did on the training data set
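print(tree.score(X_train, Y_train))
The model got a score of 100% on the training data set. A perfect training score is common for a decision tree with no depth limit, and it can be a sign of over-fitting rather than of a good model.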
Now, let’s check how well the model did on the test data set.
#Check how well the model did on the test data set
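print(tree.score(X_test, Y_test))
The model got a score of about 66.66% on the test data set. That is slightly better than guessing, but the test set is very small, and because the split is random the score can change each time the program runs.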
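One way to check a claim like "better than guessing" is to compare against a baseline that always predicts the most common label. The sketch below uses scikit-learn's DummyClassifier on made-up labels (all arrays here are hypothetical placeholders, not the article's data).

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical labels: 1 = price rose the next day, 0 = it did not
Y_train_demo = np.array([1, 1, 0, 1, 0, 1, 1, 0])
Y_test_demo = np.array([1, 0, 1, 1])

# DummyClassifier ignores the features, so placeholder zeros are enough here
X_train_demo = np.zeros((len(Y_train_demo), 1))
X_test_demo = np.zeros((len(Y_test_demo), 1))

# Always predict the most common label seen in training
baseline = DummyClassifier(strategy="most_frequent").fit(X_train_demo, Y_train_demo)
baseline_score = baseline.score(X_test_demo, Y_test_demo)
print(baseline_score)
```

If the decision tree's test score is not clearly above this baseline score, the tree has not learned much beyond the class balance.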
Show the model predictions.
#Show the model tree predictions
tree_prediction = tree.predict(X_test)
print(tree_prediction)
Show the actual values from the test data set.
#Show the actual values from the test data set
print(Y_test)
Comparing the two outputs, we can see that the model got about 3 of the 9 test values wrong. A score of 66.66% isn’t great, but again it’s better than guessing. This model could definitely use some improvement, and keep in mind that it was tested on a small data set. More data, more testing, and parameter tuning are needed.
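As a rough sketch of what that extra testing and parameter tuning could look like, the example below runs a small grid search with time-ordered cross-validation. The data is synthetic, and the parameter values in `param_grid` are arbitrary starting points chosen for illustration, not recommendations.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the feature matrix and labels (not real stock data)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(120, 5))
Y_demo = (X_demo[:, 0] + 0.5 * rng.normal(size=120) > 0).astype(int)

# Candidate settings to try; these values are arbitrary starting points
param_grid = {"max_depth": [2, 3, 5, None], "min_samples_leaf": [1, 5, 10]}

# TimeSeriesSplit always trains on earlier rows and validates on later ones
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=4),
    scoring="accuracy",
)
search.fit(X_demo, Y_demo)
print(search.best_params_, round(search.best_score_, 3))
```

Averaging the score over several folds gives a steadier estimate than a single split of 9 test rows.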
That’s it, we are done creating this program!
If you want more practice with machine learning, then I strongly recommend you check out Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. It is a great book for helping beginners learn how to write machine learning programs and understand machine learning concepts.
Thanks for reading this article. I hope it’s helpful to you all! If you enjoyed this article and found it helpful, please leave some claps to show your appreciation. Keep up the learning, and if you like machine learning, mathematics, computer science, programming or algorithm analysis, please visit and subscribe to my YouTube channels (randerson112358 & compsci112358).