Stock market prediction is a popular topic. Stock prices are speculative and react strongly to news such as Federal Reserve policy changes and interest rate moves. This article shows how to predict US stock prices in Python with a deep neural network. We will write a Python program that uses historical data to predict the movement of a US stock.
Pre-Requisites:
Numpy:
NumPy, short for Numerical Python, is a Python library for scientific computing and data processing. It provides arrays and matrices that support fast element-wise operations, which plain Python lists do not, along with many functions that speed up numerical work.
You can install NumPy using the following command in your conda terminal.
pip install numpy
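As a small illustration of what element-wise array operations look like, here is a minimal sketch that computes day-to-day changes for a handful of made-up closing prices. The variable names and values are only for illustration.

```python
import numpy as np

# made-up closing prices, one value per trading day
closing_prices = np.array([100.0, 102.0, 101.0, 105.0])

# subtract each day's price from the next day's price, with no explicit loop
daily_change = closing_prices[1:] - closing_prices[:-1]

# relative change with respect to the previous day's price
daily_return = daily_change / closing_prices[:-1]

print(daily_change)
print(daily_return)
```

The same computation with plain lists would need a loop or a list comprehension.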
Matplotlib:
Matplotlib is an extensive plotting library for Python. Its pyplot interface was modeled on the plotting commands of MATLAB, a program widely used by engineers and data scientists. Since we’re going to create charts and graphs, we need to install Matplotlib.
You can install Matplotlib using the following command in your conda terminal.
pip install matplotlib
Tensorflow:
TensorFlow is an open-source software library for numerical computation using data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
You can install TensorFlow using the following command in your conda terminal.
pip install --upgrade tensorflow
Sklearn:
Scikit-learn (sklearn) is a Python machine learning library. In this tutorial we use its MinMaxScaler to scale the prices.
You can install scikit-learn using the following command in your conda terminal.
pip install -U scikit-learn
Pandas_datareader:
Pandas DataReader is a Python package that builds a pandas DataFrame from various data sources on the internet. It is popularly used to download stock price datasets.
You can install pandas-datareader using the following command in your conda terminal.
pip install pandas-datareader
Step -1: Import dependencies
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import pandas_datareader as web
import datetime as dt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM
Step -2: Load the data
First we specify the date range of the data we want to train on and the ticker symbol of the company. You can look up the ticker symbol of any company on Google.
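# define the ticker symbol (Meta Platforms traded as 'FB' until 2022, when it changed to 'META')
company = 'FB'
# training period; it ends where the test period in Step 6 begins so the two do not overlap
start = dt.datetime(2014,1,1)
end = dt.datetime(2020,1,1)
# download the daily prices from Yahoo Finance
data = web.DataReader(company, 'yahoo', start, end)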
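The rest of the tutorial only needs a DataFrame with a date index and a Close column, so if the Yahoo download is unavailable the same structure can be loaded from a file. Below is a minimal sketch, assuming a hypothetical CSV export with Date and Close columns. The CSV text and its values are made up, and `sample_data` is an illustrative name.

```python
import io
import pandas as pd

# made-up CSV content; in practice this would be a file on disk
csv_text = """Date,Close
2014-01-02,10.0
2014-01-03,10.5
2014-01-06,10.2
2014-01-07,10.8
"""

# parse the Date column and use it as the index, like the downloaded data
sample_data = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# the closing prices as a column vector, the shape the scaler expects
closing = sample_data["Close"].values.reshape(-1, 1)
print(sample_data.index.min(), closing.shape)
```

With a real file, `pd.read_csv` would take the file path instead of the `io.StringIO` object.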
Step -3: Preparing the data
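To prepare the data we do not use the whole data frame, only the closing price. We scale the prices to the range 0 to 1 and then build the training samples: each sample is the 60 previous closing prices, and its label is the closing price of the following day.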
scaler = MinMaxScaler(feature_range=(0,1))
scaled_data = scaler.fit_transform(data['Close'].values.reshape(-1, 1))
# how many days we want to look at the past to predict
prediction_days = 60
# defining two empty lists for preparing the training data
x_train = []
y_train = []
# for each day from index 60 onward, take the 60 preceding prices as input and that day's price as label
for x in range(prediction_days, len(scaled_data)):
    x_train.append(scaled_data[x-prediction_days:x, 0])
    y_train.append(scaled_data[x, 0])
x_train, y_train = np.array(x_train), np.array(y_train)
# LSTM layers expect input of shape (samples, timesteps, features)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
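To see what the loop above produces, here is a minimal sketch of the same sliding-window idea on a toy series with a window of 3 instead of 60. The helper `make_windows` and the toy values are hypothetical and not part of the tutorial code.

```python
import numpy as np

def make_windows(series, window):
    """Return (inputs, targets): each input holds `window` consecutive
    values and its target is the value that follows them."""
    inputs, targets = [], []
    for i in range(window, len(series)):
        inputs.append(series[i - window:i])
        targets.append(series[i])
    # add a trailing feature axis: (samples, timesteps, features)
    inputs = np.array(inputs).reshape(-1, window, 1)
    return inputs, np.array(targets)

toy_series = np.arange(10, dtype=float)  # 0.0, 1.0, ..., 9.0
toy_x, toy_y = make_windows(toy_series, window=3)

print(toy_x.shape, toy_y.shape)
print(toy_x[0].ravel(), toy_y[0])
```

A series of length n yields n minus window samples, because the first window values have no complete history before them.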
Step -4: Build the model and specify the layers
The model stacks three LSTM layers, each followed by a dropout layer that sets 20% of its inputs to zero during training to reduce overfitting. The first two LSTM layers set return_sequences=True so that they pass the full sequence on to the next LSTM layer. A final dense layer with a single unit outputs the predicted closing price. You can change the number of units in the LSTM layers, but more units mean a longer training time since more computation is required per layer.
model = Sequential()
# specify the layer
model.add(LSTM(units=50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
# this is going to be a prediction of the next closing value
model.add(Dense(units=1))
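To get a feel for how the number of units affects model size, the trainable parameters of the layers above can be counted by hand. This is a rough sketch, assuming the standard LSTM formulation with four gates and a bias per gate; the helper names are made up for illustration, and `model.summary()` can be used to check the numbers against the real model.

```python
def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # one weight per input per unit, plus one bias per unit
    return input_dim * units + units

# the three LSTM layers and the dense layer from the model above
# (dropout layers have no trainable parameters)
layer_params = [
    lstm_params(input_dim=1, units=50),
    lstm_params(input_dim=50, units=50),
    lstm_params(input_dim=50, units=50),
    dense_params(input_dim=50, units=1),
]
total_params = sum(layer_params)

print(layer_params, total_params)
```

Because the recurrent weights grow with the square of the number of units, doubling the units roughly quadruples the parameters of the LSTM layers.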
Step -5: Compiling the Model
model.compile(optimizer='adam', loss='mean_squared_error')
# fit the model on the training data: 25 passes over the data, 32 samples per weight update
model.fit(x_train, y_train, epochs=25, batch_size=32)
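The two training settings can be illustrated without Keras. The sketch below computes the mean squared error by hand and the number of weight updates in one epoch for a given batch size. The function names, the sample count of 1000, and the input values are assumptions chosen for illustration.

```python
import math
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def updates_per_epoch(n_samples, batch_size):
    """Number of batches needed to go through the data once."""
    return math.ceil(n_samples / batch_size)

# made-up scaled prices and predictions
loss = mean_squared_error([0.2, 0.5, 0.9], [0.1, 0.5, 1.0])

# hypothetical training set of 1000 samples
steps = updates_per_epoch(n_samples=1000, batch_size=32)

print(loss, steps)
```

Note that the loss is measured on the scaled prices, so its value is not in dollars.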
Step -6: Testing the model
# Load Test Data
test_start = dt.datetime(2020,1,1)
test_end = dt.datetime.now()
test_data = web.DataReader(company, 'yahoo', test_start, test_end)
Now we check how well the model predicts prices. We keep the actual closing prices of the test period for comparison. To predict the first test day the model needs the 60 closing prices before it, so we concatenate the closing prices of the training and test data into one data set and take its last len(test_data) + prediction_days values as the model input. These values are then reshaped into a single column and scaled with the scaler that was fitted on the training data.
actual_prices = test_data['Close'].values
total_dataset = pd.concat((data['Close'],test_data['Close']), axis=0)
model_input = total_dataset[len(total_dataset)- len(test_data) - prediction_days:].values
# reshape the input into a single column
model_input = model_input.reshape(-1, 1)
# scale the input with the scaler fitted on the training data
model_input = scaler.transform(model_input)
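Calling transform (rather than fit_transform) reuses the minimum and maximum learned from the training prices. The sketch below reimplements min-max scaling by hand to show the effect; `SimpleMinMaxScaler` is a toy stand-in written for this example, not scikit-learn's implementation, and the prices are made up.

```python
import numpy as np

class SimpleMinMaxScaler:
    """Toy min-max scaler to the range [0, 1]."""

    def fit(self, values):
        # remember the range of the data seen during fitting
        self.low = float(np.min(values))
        self.high = float(np.max(values))
        return self

    def transform(self, values):
        values = np.asarray(values, dtype=float)
        return (values - self.low) / (self.high - self.low)

    def inverse_transform(self, scaled):
        scaled = np.asarray(scaled, dtype=float)
        return scaled * (self.high - self.low) + self.low

train_prices = np.array([100.0, 150.0, 200.0])
test_prices = np.array([180.0, 250.0])

toy_scaler = SimpleMinMaxScaler().fit(train_prices)
scaled_test = toy_scaler.transform(test_prices)
restored = toy_scaler.inverse_transform(scaled_test)

print(scaled_test)
print(restored)
```

A test price above the training maximum is scaled to a value above 1, which is expected when prices rise beyond the range seen during training.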
Step -7: Predict the next day’s data
x_test = []
# build one 60-day input window for every day in the test period
for x in range(prediction_days, len(model_input)):
    x_test.append(model_input[x-prediction_days:x, 0])
x_test = np.array(x_test)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
predicted_price = model.predict(x_test)
predicted_price = scaler.inverse_transform(predicted_price)
# plot the test Predictions
plt.plot(actual_prices, color="black", label=f"Actual {company} price")
plt.plot(predicted_price, color='green', label=f"Predicted {company} price")
plt.title(f"{company} Share price")
plt.xlabel('Time')
plt.ylabel(f'{company} share price')
plt.legend()
plt.show()
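A plot can look convincing even when the predictions simply trail the actual prices, so it may help to put a number on the error. Here is a minimal sketch that computes the root mean squared error and compares it with a naive baseline that predicts each day's price as the previous day's price. The helper names and toy values are made up for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error; accepts 1-D arrays or column vectors."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def naive_baseline(actual):
    """Predict each day's price as the previous day's actual price."""
    actual = np.asarray(actual, dtype=float).ravel()
    return actual[:-1]

# made-up actual prices and model predictions (one column, like model.predict)
toy_actual = np.array([10.0, 11.0, 12.0, 11.0, 13.0])
toy_model = np.array([[10.5], [10.5], [12.5], [11.5], [12.0]])

# the baseline has no prediction for the first day, so compare from day two on
model_error = rmse(toy_actual[1:], toy_model[1:])
baseline_error = rmse(toy_actual[1:], naive_baseline(toy_actual))

print(model_error, baseline_error)
```

With the tutorial's variables the calls would be `rmse(actual_prices[1:], predicted_price[1:])` and `rmse(actual_prices[1:], naive_baseline(actual_prices))`. A model that does not beat the baseline has learned little beyond repeating the last price.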
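Here we use the most recent data as input to predict the price for the next day.
# assumption: use the last 60 scaled closing prices as the input window
real_data = np.array([model_input[-prediction_days:, 0]])
real_data = np.reshape(real_data, (real_data.shape[0], real_data.shape[1], 1))
prediction = model.predict(real_data)
prediction = scaler.inverse_transform(prediction)
print(f"Prediction: {prediction}")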
Final Words
In this blog, we learned how to predict stock prices with Python. We took a 60-day time series of closing prices and predicted the next day’s price. The process is a little complicated, but it’s not that hard either. That said, I do not recommend using this for trading; I consider it a learning experience more than anything else. I hope you liked the tutorial. If you have any questions, feel free to leave them below and I’ll do my best to answer them!