Simplifying Trading Decisions with Live Data

How to enhance stock market decision-making using ML models like XGBoost and Random Forest in Python.

EODHD APIs
6 min read · Sep 17, 2024

Have you ever wondered how to make smarter decisions when investing in the stock market? I embarked on that journey and stumbled upon a powerful tool called the Live (Delayed) Stock Prices API, which offers near real-time stock data with just a 15-minute delay. This revelation inspired me to create my own stock market assistant. With access to live stock information, you can make well-informed decisions without needing to rely on expensive platforms. Plus, you can develop an investment strategy based on historical data to ensure you’re making the best possible choices.

In my search for a quick and reliable way to access up-to-date stock prices and trends, most solutions fell short — except for this API, which stood out by offering real-time stock data.

Here’s a glimpse of what this API provides when you request data for Tesla (TSLA):

{'code': 'TSLA.US',
'timestamp': 1693941000,
'gmtoffset': 0,
'open': 245,
'high': 258,
'low': 244.86,
'close': 257.44,
'volume': 112942754,
'previousClose': 245.01,
'change': 12.43,
'change_p': 5.0733}
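
As a quick sanity check on these fields, change is simply close minus previousClose, and change_p is that difference expressed as a percentage of previousClose:

previous_close = 245.01
close = 257.44

change = close - previous_close            # ≈ 12.43
change_p = change / previous_close * 100   # ≈ 5.0733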

This output provides key information necessary to build a successful investment strategy. Now, let’s explore how to create a model that can help simplify your financial decision-making.

Python Implementation

1. Importing the Necessary Packages

To begin, import the essential Python libraries that will support your stock market assistant project. These packages enable data manipulation, model training, and integration with the stock price API.

import pandas as pd
from eodhd import APIClient
import numpy as np
from xgboost import XGBClassifier
from sklearn.ensemble import IsolationForest
from sklearn.ensemble import RandomForestClassifier

Explanation of the Packages:

  • Pandas: A versatile library for handling data operations, such as reading and manipulating tabular data.
  • NumPy: A library for performing mathematical and logical operations on large, multi-dimensional arrays and matrices.
  • XGBoost and Random Forest: Popular classification models used to learn trends and patterns in stock data; Isolation Forest is a related ensemble method geared toward anomaly detection.
  • eodhd: The official Python library for accessing financial data APIs provided by EODHD.

2. API Key Activation

In order to use the EODHD API and access live stock data, you’ll need to activate your API key. If you don’t have an API key yet, you can register on the EODHD APIs website and retrieve your secret key from the Settings page.

Here’s the code to activate your API key:

api_key = '<YOUR API KEY>'
client = APIClient(api_key)

Explanation:

  • api_key: Replace '<YOUR API KEY>' with your personal API key from EODHD APIs.
  • client: This is the instance of the APIClient class from the eodhd package, which you will use to access API functionalities.

Important: Be careful with your API key. It’s sensitive information, so make sure it’s not exposed in your code or shared publicly. For added security, you can use environmental variables to store the key instead of hardcoding it in your script.
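
For example, one way to keep the key out of your source code is to read it from an environment variable. Here is a minimal sketch; the variable name EODHD_API_KEY is just an illustrative choice:

import os

from eodhd import APIClient

# Read the key from an environment variable instead of hardcoding it
api_key = os.environ.get('EODHD_API_KEY')
if api_key is None:
    raise RuntimeError('Set the EODHD_API_KEY environment variable first.')

client = APIClient(api_key)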

3. Loading Historical Data

To begin the process, you’ll need to load historical stock data for the relevant period. This historical data will serve as the foundation for training your machine learning model. Using EODHD’s Intraday API, you can easily extract split-adjusted intraday data for any stock.

def get_historical_data(ticker, start_date, end_date):
    json_resp = client.get_historical_data(symbol = ticker, period = '5m', from_date = start_date, to_date = end_date, order = 'a')
    df = pd.DataFrame(json_resp)
    df = df.set_index('date')
    df.index = pd.to_datetime(df.index)
    return df

TSLA = get_historical_data('TSLA', '2021-08-02', '2021-09-02')

In this code, the get_historical_data function retrieves the historical stock data for Tesla over a specific date range. The parameters used include:

  • ticker: The stock symbol (e.g., ‘TSLA’ for Tesla).
  • period: The time interval between data points (e.g., ‘5m’ for five-minute bars).
  • from_date and to_date: The date range for the data, in “YYYY-MM-DD” format.
  • order: The order of the data, ascending (‘a’) or descending (‘d’).
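
A quick way to confirm what was returned is to inspect the resulting DataFrame before moving on; the exact columns depend on the endpoint and interval you request:

# Inspect what came back before moving on
print(TSLA.shape)
print(TSLA.columns.tolist())
print(TSLA.head())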

4. Obtaining Live Data for Prediction

For real-time decision-making, you can access live stock prices using the Live Stock Prices API. This helps you predict the stock’s current behavior based on live market data.

def extract_intraday(symbol):
    raw_df = client.get_live_stock_prices(ticker = symbol)
    df = pd.DataFrame([raw_df])
    return df

tsla_intraday = extract_intraday('TSLA')

The extract_intraday function retrieves live stock data, converts it into a Pandas DataFrame, and returns it. This data is essential for making real-time predictions on stock trends.

5. Preprocessing the Data

Before building and training a predictive model, you need to preprocess the data to ensure it is ready for analysis. Here’s what to consider:

  • Class Imbalance: If there is a disproportionate number of data points for certain outcomes (e.g., more “buy” signals than “sell” ones), it can bias your model. You can correct this in two ways (see the sketch after the preprocessing code below):
      • Oversampling: Increase the number of data points in the underrepresented class.
      • Undersampling: Reduce the number of data points in the overrepresented class.
  • Normalization: It’s important to ensure that all features are on the same scale. Some models perform better with normalized input features. Common scaling techniques include:
      • Standard Scaler: Scales the data to have a mean of 0 and a standard deviation of 1.
      • MinMax Scaler: Scales the data to a range, typically between 0 and 1.
  • Dropping Irrelevant Columns: You may need to remove columns that are not useful for predictions to simplify your model.

Here’s how to implement the column-dropping step (dataF refers to the historical DataFrame loaded in step 3, so we point it at TSLA explicitly):

dataF = TSLA  # the historical DataFrame from step 3
dataF = dataF.drop(['timestamp', 'gmtoffset', 'datetime'], axis=1)
tsla_intraday = tsla_intraday.drop(['code', 'timestamp', 'gmtoffset', 'previousClose', 'change', 'change_p'], axis=1)

In this snippet, we remove unnecessary columns like ‘timestamp’ and ‘gmtoffset’ from the datasets, leaving only the relevant information to improve model performance.
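
Neither rebalancing nor scaling is wired into the rest of this walkthrough, but if you want to experiment with them, here is a minimal sketch. MinMaxScaler comes from scikit-learn, while RandomOverSampler comes from the separate imbalanced-learn package; both are illustrative choices, not part of the original code:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Optional: scale the OHLCV features to the 0-1 range. Tree-based models such as
# XGBoost and Random Forest are largely insensitive to feature scale, so this is
# shown only to illustrate the MinMax Scaler mentioned above.
feature_cols = ['open', 'high', 'low', 'close', 'volume']
scaler = MinMaxScaler()
dataF_scaled = pd.DataFrame(scaler.fit_transform(dataF[feature_cols]),
                            columns=feature_cols, index=dataF.index)

# Optional: rebalance the classes once a label column exists (e.g. the "signal"
# column built in the next section). Requires the imbalanced-learn package.
# from imblearn.over_sampling import RandomOverSampler
# X_res, y_res = RandomOverSampler(random_state=42).fit_resample(X, y)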

6. Forming a Strategy

To enable your model to classify data and make predictions, you need to define a strategy. In my case, I developed a basic strategy with three categories: “waiting” (0), “buying” (1), and “selling” (2). This helps the model understand the decision-making process for each scenario.

Here’s how I implemented the strategy:

def signal_generator(df):
    # Column names in the EODHD data are lowercase ('open', 'close', ...)
    open_price = df['open'].iloc[-1]
    close_price = df['close'].iloc[-1]
    previous_open = df['open'].iloc[-2]
    previous_close = df['close'].iloc[-2]

    if (open_price > close_price and previous_open < previous_close and close_price < previous_open and open_price >= previous_close):
        return 1  # Buying
    elif (open_price < close_price and previous_open > previous_close and close_price > previous_open and open_price <= previous_close):
        return 2  # Selling
    else:
        return 0  # Waiting

signal = [0]  # Initialize the first row with "waiting"
for i in range(1, len(dataF)):
    df = dataF[i - 1:i + 1]  # current candle plus the one before it
    signal.append(signal_generator(df))

dataF["signal"] = signal

The function checks current and previous stock prices to generate a signal: buying, selling, or waiting. It is then applied across the dataset to classify the data for each period.
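
Before training, it’s worth checking how the generated signals are distributed; the “waiting” class will usually dominate, which ties back to the class-imbalance point from step 5:

# Check how the generated signals are spread across the three classes
print(dataF["signal"].value_counts())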

7. Loading and Training the Models

Next, you’ll select and train your machine learning models. I used three different models: XGBoost, Isolation Forest, and Random Forest, each chosen for their unique capabilities in handling stock data.
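
The snippets below assume that training and test sets (X_train, y_train, X_test, y_test) already exist. That split isn’t shown here, so the following is one minimal way to build it, assuming the OHLCV columns as features and the signal column from step 6 as the target:

from sklearn.model_selection import train_test_split

# Features: OHLCV columns; target: the signal generated in step 6
X = dataF[['open', 'high', 'low', 'close', 'volume']]
y = dataF['signal']

# shuffle=False keeps chronological order so the model isn't trained on future bars
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)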

  • XGBoost: Known for its performance on structured data like stock prices, it’s a powerful boosting algorithm.

from xgboost import XGBClassifier

model1 = XGBClassifier()
model1.fit(X_train, y_train)

  • Isolation Forest: This model is particularly good at identifying anomalies using binary trees.

from sklearn.ensemble import IsolationForest

random_state = np.random.RandomState(42)
model = IsolationForest(n_estimators=100, max_samples='auto', contamination=0.2, random_state=random_state)

# Isolation Forest is unsupervised, so it is fitted on the features alone
model.fit(X_train)

  • Random Forest: A robust machine-learning algorithm that uses multiple decision trees for accurate classifications.

from sklearn.ensemble import RandomForestClassifier

# Note: this reuses the name model1 and overwrites the XGBoost model above
model1 = RandomForestClassifier()
model1.fit(X_train, y_train)

8. Prediction on Live Data for Suggestions

Once your models are trained, you can make predictions on both the test data and live data. To evaluate the model’s performance, use metrics such as precision, recall, accuracy, and F1 score.

Here’s how to make predictions and evaluate them:

# Make predictions for test data
y_pred1 = model1.predict(X_test)
predictions1 = [round(value) for value in y_pred1]

# Evaluate the predictions
from sklearn.metrics import confusion_matrix, recall_score, precision_score, f1_score, accuracy_score

cm = confusion_matrix(y_test, predictions1)

rf_Recall = recall_score(y_test, predictions1, average='macro')
rf_Precision = precision_score(y_test, predictions1, average='macro')
rf_f1 = f1_score(y_test, predictions1, average='macro')
rf_accuracy = accuracy_score(y_test, predictions1)

The F1 score is particularly useful when working with imbalanced datasets, as it balances precision and recall to provide a more comprehensive view of your model’s performance.
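
To turn this into an actual suggestion, you can also run a trained classifier on the live row prepared in step 5. Here is a minimal sketch, assuming the Random Forest stored in model1 above and the class labels defined in step 6:

# Predict on the single live row prepared in step 5 and map the class
# back to the strategy labels defined in step 6
labels = {0: 'waiting', 1: 'buying', 2: 'selling'}
live_pred = int(model1.predict(tsla_intraday)[0])
print(f"Suggested action for TSLA right now: {labels[live_pred]}")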

Where to Go from Here

Now that you’ve set up your strategy, trained your models, and evaluated your predictions, you have the tools to enhance your stock trading decisions. From here, you can experiment with tuning the models, trying other machine learning algorithms, and testing the approach on various stocks. With live stock data from the Live (Delayed) Stock Prices API and the steps outlined here, you’re well on your way to making smarter, data-driven trading decisions.

Stay tuned for more valuable articles that aim to enhance your data science skills and market analysis capabilities.

The original article was published in the EODHD Academy by Nikhil Adithyan.

Please note that this article is for informational purposes only and should not be taken as financial advice. We do not bear responsibility for any trading decisions made based on the content of this article. Readers are advised to conduct their research or consult with a qualified financial professional before making any investment decisions.

For those eager to delve deeper into such insightful articles and broaden their understanding of different strategies in financial markets, we invite you to follow our account and subscribe for email notifications.
