Unleashing the Power of OpenAI and Python for Better Stock Trading
As we delve into the realm of trading, one fundamental truth remains clear — experience is the ultimate teacher. This principle, vital in human learning, is equally significant for machines. Picture an algorithm that gains knowledge from experience, continually adapting and evolving through each encounter; this concept is known as reinforcement learning.
The idea of developing a trading bot has intrigued researchers and enthusiasts for a long time. There are numerous methods to train models for making effective decisions in the dynamic environment of real-time market conditions. Entrusting an algorithm to make trading decisions is a significant step that requires extensive research and ample data. The EODHD API offers a comprehensive and real-time dataset, forming the foundation for applications in reinforcement learning.
In this article, we will set out on a journey to create a trading bot powered by machine learning. We will investigate how the principles of reinforcement learning can be applied to develop an intelligent trading assistant.
Importing the Necessary Libraries
To kick off our exciting journey, we need to gather and introduce the essential tools required for the tasks ahead. These diverse packages each serve a specific purpose:
eodhd:
- The official Python library from EODHD that provides a seamless, programmatic way to access EODHD’s APIs.
TensorFlow:
- A robust machine learning library used for building and training neural networks. It is perfect for tasks such as deep learning, natural language processing, and computer vision.
Stable Baselines3:
- A PyTorch-based library offering reliable implementations of reinforcement learning algorithms in Python. This includes A2C, a popular algorithm for solving sequential decision-making problems.
Gym and Gymnasium:
- OpenAI Gym is a toolkit designed for developing and comparing reinforcement learning algorithms, providing a variety of environments for testing. Gymnasium is its actively maintained successor, and it is the package imported in the code below.
Gym Anytrading:
- An extension of OpenAI Gym, specifically tailored for reinforcement learning in financial trading. It supports the development and testing of algorithms in the realm of algorithmic trading.
Processing Libraries (NumPy, Pandas, Matplotlib)
NumPy:
- Facilitates efficient numerical operations and supports multi-dimensional arrays, making complex calculations straightforward.
Pandas:
- Simplifies data manipulation with its powerful DataFrame structures, allowing for easy data analysis and handling.
Matplotlib:
- A versatile plotting library used to create a wide range of visualizations, essential for data presentation and analysis.
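Before running the code below, these packages can be installed with pip. A minimal sketch, using the names as published on PyPI (version pinning is left to the reader):
pip install eodhd tensorflow stable-baselines3 gymnasium gym-anytrading numpy pandas matplotlib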
import gymnasium as gym
import gym_anytrading
# Stable Baselines3 - reinforcement learning components
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3 import A2C
# Processing libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from eodhd import APIClient
API Key Activation
To utilize the functions of the EODHD API, you must first register and activate your API key. Follow these steps:
1. Register on EODHD:
- Visit the EODHD website and complete the registration process to create an account.
- Navigate to the Dashboard where you will find your secret EODHD API key. Ensure this key remains confidential.
2. Activate the API Key:
- Use the following code to activate your API key:
# API KEY ACTIVATION
api_key = '<YOUR API KEY>'
client = APIClient(api_key)
- In the code above, the first line stores your EODHD API key in the `api_key` variable. The second line utilizes the `APIClient` class from the `eodhd` package to activate the API key and store the response in the `client` variable.
Remember to replace `<YOUR API KEY>` with your actual EODHD API key. For added security, consider storing the API key in environment variables instead of directly in the code.
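For illustration, here is a minimal sketch of the environment-variable approach; the variable name `EODHD_API_KEY` is an assumption, not an EODHD convention:
import os
from eodhd import APIClient
# Read the key from the environment rather than hard-coding it
# (the variable name EODHD_API_KEY is illustrative)
api_key = os.environ.get('EODHD_API_KEY')
client = APIClient(api_key)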
Extracting Historical Data
Before diving into data extraction, it’s essential to understand historical or end-of-day (EOD) data. Historical data, gathered over time, helps identify patterns and trends, providing insights into market behavior. You can extract Tesla’s historical data using the `eodhd` package with the following code:
# EXTRACTING HISTORICAL DATA
df = client.get_historical_data('TSLA', 'd', '2018-11-26', '2023-11-24')
df = df[['open','high','low','close','volume']]
df.columns = ['Open','High','Low','Close','Volume']
df.tail()
In this straightforward code:
1. The `get_historical_data` function from the `eodhd` package retrieves the historical data. It requires the stock symbol (`TSLA`), the interval (`d` for daily data), and the start and end dates.
2. The resulting dataframe is then manipulated to keep only the relevant columns: `open`, `high`, `low`, `close`, and `volume`.
3. Finally, the columns are renamed to `Open`, `High`, `Low`, `Close`, and `Volume` for better readability.
This process results in a clean and formatted dataframe containing Tesla's historical stock data.
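Before moving on, a quick sanity check of the extracted data can save debugging time later. A minimal sketch, assuming the returned dataframe is date-indexed:
# Verify the shape and completeness of the downloaded data
print(df.shape)                         # (rows, 5): one row per trading day
print(df.isna().sum())                  # missing values per column
print(df.index.min(), df.index.max())   # date range actually returned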
Using the Trading Gym
This step involves creating and exploring a trading environment using gym-anytrading, the OpenAI Gym/Gymnasium extension built for trading.
env = gym.make('stocks-v0', df=df, frame_bound=(5, len(df)-1), window_size=5)
This command generates the default trading environment. You can modify parameters like the dataset or `frame_bound` to customize the environment to suit your needs.
state = env.reset()
env.render()
In this snippet:
1. `gym.make('stocks-v0', df=df, frame_bound=(5, len(df)-1), window_size=5)` creates the trading environment from the dataframe `df`. The `frame_bound` parameter defines the starting and ending points of the data used, and `window_size` sets the number of previous data points that form the current state.
2. `state = env.reset()` initializes the environment; under Gymnasium, `reset()` returns a tuple containing the initial observation and an info dictionary.
3. `env.render()` visualizes the environment, providing a graphical representation of the current state of the trading simulation.
The output from the above commands provides a visual and numerical representation of how the environment processes and perceives the dataset.
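It can also help to inspect what the environment exposes before training. A short sketch using the spaces and the processed-data attributes that gym-anytrading provides:
# Inspect the action and observation spaces
print(env.action_space)        # Discrete(2): 0 = Sell, 1 = Buy
print(env.observation_space)   # Box of shape (window_size, n_features)
# gym-anytrading stores the processed price and feature arrays on the env
print(env.unwrapped.prices.shape)
print(env.unwrapped.signal_features.shape)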
Training and Predicting the Model
In this step, we first explore the environment created above with random actions, and then use it to train our model and generate predictions.
state = env.reset()
while True:
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    # env.render()
    if done:
        print("info:", info)
        break
plt.cla()
env.unwrapped.render_all()
plt.show()
In this loop, we step through the environment with randomly sampled actions, unpacking the observation, reward, termination status, truncation flag, and additional information returned at each step.
During each iteration, a random trading action is drawn from the action space and executed in the environment, which responds with feedback through observations, rewards, and its status flags. This loop continues until the trading episode either terminates or is truncated.
Optionally, a visual representation of the trading scenario can be rendered, providing a helpful display. After each episode, the relevant information is printed for evaluation. The final lines clear the plot, render the complete trading environment, and show the updated visualization. This method offers valuable insights into the model’s learning progress and performance within the simulated trading environment.
Short and long positions are visually indicated with red and green colors. Note that the starting position of the environment is always set to 1.
env_maker = lambda: gym.make('stocks-v0', df=df, frame_bound=(5,100), window_size=5)
env = DummyVecEnv([env_maker])
This snippet sets up a vectorized reinforcement learning environment for stock trading. The lambda function `env_maker` generates an instance of the 'stocks-v0' environment, configured with the market-data DataFrame `df` and a narrower frame range. The `DummyVecEnv` class from Stable Baselines3 then wraps it into a vectorized environment assigned to the variable `env`, the input format that Stable Baselines3 models expect for training.
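A vectorized environment batches everything along a leading environment dimension, which can be confirmed directly; a quick sketch:
# VecEnv.reset() returns a batched observation: (n_envs, window_size, n_features)
obs = env.reset()
print(obs.shape)  # e.g. (1, 5, 2) with one env and the default two signal features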
model = A2C('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=1000000)
An A2C (Advantage Actor-Critic) reinforcement learning model is instantiated with the 'MlpPolicy' on the previously configured stock trading environment, and trained for a total of one million time steps using the `learn` method.
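Training for a million steps takes a while, so it is worth persisting the result. Stable Baselines3 provides `save` and `load` for this; a sketch, with an illustrative file name:
# Persist the trained policy so it can be reloaded without retraining
model.save('a2c_trading_tsla')  # file name is illustrative
# ...and later restore it against the same (or a compatible) environment
model = A2C.load('a2c_trading_tsla', env=env)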
Following the training phase, a new stock trading environment is established with a modified time frame. The trained model interacts with this environment in a loop, making predictions and executing actions. The loop continues until the trading episode concludes, and the episode details are printed.
Lastly, a plot depicting the entire trading environment is generated and presented for visual assessment. This script serves as a foundational framework for training and evaluating a reinforcement learning model tailored for stock trading.
env = gym.make('stocks-v0', df=df, frame_bound=(10, 1100), window_size=5)
obs, info = env.reset()
while True:
    obs = obs[np.newaxis, ...]
    action, _states = model.predict(obs)
    obs, reward, terminated, truncated, info = env.step(int(action[0]))
    done = terminated or truncated
    if done:
        print("info:", info)
        break
plt.figure(figsize=(15, 6))
plt.cla()
env.unwrapped.render_all()
plt.show()
The graph displays the predicted actions over the specified time frame, offering a comparison with the initial random-action graph obtained earlier in this process. It provides valuable insight into the model's decision-making and shows how it navigates the market over that period based on what it has learned.
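The `info` dictionary printed at the end of the episode is also useful numerically: gym-anytrading reports cumulative episode metrics in it, which can be pulled out for a quick evaluation (a sketch):
# gym-anytrading's info dict includes cumulative episode metrics
print('Total reward:', info.get('total_reward'))
print('Total profit:', info.get('total_profit'))  # multiplicative; 1.0 means break-even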
Conclusion
The idea of entrusting decisions to algorithms based on their experience and accumulated knowledge is both fascinating and promising. Exploring AI, particularly reinforcement learning in finance, holds immense potential and remains far from fully explored. This field offers extensive room to evolve into something significantly impactful and transformative in financial trading and beyond.
The original article was published in the EODHD Academy by Nikhil Adithyan.
Please note that this article is for informational purposes only and should not be taken as financial advice. We do not bear responsibility for any trading decisions made based on the content of this article. Readers are advised to conduct their own research or consult with a qualified financial professional before making any investment decisions.
For those eager to delve deeper into such insightful articles and broaden their understanding of different strategies in financial markets, we invite you to follow our account and subscribe for email notifications.
Stay tuned for more valuable articles that aim to enhance your data science skills and market analysis capabilities.