How to Create a Stock Market Heatmap in Python

Explore the relationship between stock tickers in a bullish or bearish phase and their returns.

EODHD APIs
6 min read · Apr 12, 2024

This piece will guide you through crafting a stock market heatmap using Python. We aim to explore the relationship between stock tickers in either a Bullish or Bearish phase and their respective Returns. To gauge the market’s mood, we’ll use the SMA50/SMA200 indicator, that is, the 50-day and 200-day simple moving averages of the closing price. A Bullish state is indicated when SMA50 is above SMA200, while the opposite signals a Bearish state. Return is the percentage change in the closing price: the difference between the current and preceding closing prices, divided by the preceding closing price, and multiplied by 100.
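To make these two definitions concrete, here is a minimal sketch on a tiny made-up price series. The windows are shortened to 2 and 3 periods (instead of 50 and 200) purely so the example fits on screen; the prices and variable names are illustrative assumptions, not data from the API.

```python
import pandas as pd

# Hypothetical closing prices, invented for illustration only.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0], name="close")

# Short and long simple moving averages (stand-ins for SMA50 and SMA200).
sma_short = close.rolling(2, min_periods=1).mean()
sma_long = close.rolling(3, min_periods=1).mean()

# Bullish when the short average is above the long average, Bearish otherwise.
state = (sma_short > sma_long).map({True: "Bullish", False: "Bearish"})

# Return: (current close - previous close) / previous close * 100.
# The first row has no previous close, so its return is set to 0.
ret = ((close - close.shift(1)) / close.shift(1) * 100).fillna(0)

demo = pd.DataFrame({"close": close, "state": state, "return": ret})
print(demo)
```

The first two rows come out Bearish because the two averages are still equal there; the state only flips once the short average pulls ahead of the long one.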

Gathering and Preprocessing our data

The EODHD API will supply the necessary historical data for our project. Now, which stock market tickers should we select? The S&P 500 index gives us our answer: it lists slightly more than 500 tickers, because some companies have more than one share class. There is a list of these companies on Wikipedia. Using Python, we’ve written a script to scrape this page, extract the ticker symbols, and compile them into a list.

import requests
import pandas as pd
from io import StringIO

def get_sp500_tickers():
    url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
    html = requests.get(url).text
    data = StringIO(html)
    df = pd.read_html(data, header=0)[0]
    tickers = df["Symbol"].tolist()
    return tickers

sp500_tickers = get_sp500_tickers()
print(sp500_tickers)

Armed with our list of company codes from the S&P 500, we next aim to cross-reference it with the list of US markets available on EODHD API to ensure coverage of all necessary markets. Our objective is to produce the output in the form of a Python list. In the forthcoming code snippet, we’re importing our API_KEY from config.py.

import config as cfg
from eodhd import APIClient

api = APIClient(cfg.API_KEY)

df_symbols = api.get_exchange_symbols("US")
eodhd_tickers = df_symbols["Code"].tolist()
print(eodhd_tickers)

Now that we have the two lists, we want to do a quick check to make sure everything looks fine before proceeding.

for sp500_ticker in sp500_tickers:
    if sp500_ticker not in eodhd_tickers:
        print("Missing: ", sp500_ticker)
    else:
        print("Found: ", sp500_ticker)

You might observe that a few tickers are missing; however, this is inconsequential as they primarily pertain to the “B” shares of companies listed more than once in the S&P 500 index. These can be disregarded. The crucial result is securing our list of 500 companies, which we have successfully achieved.
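With two lists of this size, the same check can also be expressed with a set, which avoids scanning the full EODHD list once per S&P 500 ticker. A minimal sketch, using short made-up lists in place of the real `sp500_tickers` and `eodhd_tickers`:

```python
# Stand-ins for the real lists; the values are invented for illustration.
sp500_sample = ["AAPL", "MSFT", "BRK.B", "MMM"]
eodhd_sample = ["AAPL", "MSFT", "MMM", "TSLA"]

# Set membership tests are constant time on average.
eodhd_set = set(eodhd_sample)

# Preserve the original S&P 500 order while splitting into two groups.
found = [t for t in sp500_sample if t in eodhd_set]
missing = [t for t in sp500_sample if t not in eodhd_set]

print("Found:", found)
print("Missing:", missing)
```

Keeping `found` as a list means it can be looped over directly in the next step, without repeating the membership test.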

The next step involves expanding the code to include the “Found” companies, fetching their historical data limited to the latest 200 daily candles. We’ll apply SMA50/SMA200 to this data to ascertain the Bullish or Bearish state and calculate the Return, as initially described. For each company processed, we’ll append the data to a file named “sp500.csv”. Begin by creating this file empty and manually adding the heading: “Stock Symbol,Date,Closing Price,Market State,Return”.
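Rather than creating the file by hand, the header row can also be written from Python before the loop runs. The sketch below uses only the standard library; `init_csv` is a hypothetical helper name, and it overwrites any existing file at the given path, so it is meant to be called once at the start. To avoid touching a real file, the example writes to a temporary directory.

```python
import csv
import os
import tempfile

HEADER = ["Stock Symbol", "Date", "Closing Price", "Market State", "Return"]

def init_csv(path):
    """Create (or truncate) the CSV at `path` and write only the header row."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(HEADER)

# Demonstrated on a temporary path; in the article's workflow the path would be "sp500.csv".
demo_path = os.path.join(tempfile.mkdtemp(), "sp500.csv")
init_csv(demo_path)

with open(demo_path) as f:
    print(f.read().strip())
```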

def calculate_technical_indicators(df):
    df_ta = df.copy()
    df_ta["sma50"] = df_ta["close"].rolling(50, min_periods=0).mean()
    df_ta["sma200"] = df_ta["close"].rolling(200, min_periods=0).mean()
    df_ta["bull"] = df_ta.sma50 > df_ta.sma200
    return df_ta

for sp500_ticker in sp500_tickers:
    if sp500_ticker not in eodhd_tickers:
        print("Missing: ", sp500_ticker)
    else:
        print("Found: ", sp500_ticker)

        # Fetch the latest 200 daily candles for this ticker
        df_ticker_history = api.get_historical_data(
            f"{sp500_ticker}.US", "d", results=200
        )
        df_ticker_history_ta = calculate_technical_indicators(df_ticker_history)
        df_ticker_history_ta["date"] = df_ticker_history_ta.index

        df_result = df_ticker_history_ta[["symbol", "date", "close", "bull"]].copy()
        df_result.columns = ["Stock Symbol", "Date", "Closing Price", "Market State"]

        # Percentage change versus the previous close; the first row has no
        # previous close, so its Return is set to 0
        df_result["Return"] = (
            (df_result["Closing Price"] - df_result["Closing Price"].shift(1))
            / df_result["Closing Price"].shift(1)
        ) * 100
        df_result["Return"] = df_result["Return"].fillna(0)

        # Replace the boolean flag with a readable label
        df_result["Market State"] = df_result["Market State"].apply(
            lambda x: "Bullish" if x else "Bearish"
        )

        # regex=False so that the "." in ".US" is matched literally
        df_result["Stock Symbol"] = df_result["Stock Symbol"].str.replace(
            ".US", "", regex=False
        )

        # Append this ticker's rows to the CSV (to_csv returns None when given a
        # path, so there is nothing useful to print)
        df_result.to_csv("sp500.csv", mode="a", index=False, header=False)

Upon completion, we will have a “sp500.csv” file formatted as follows…

Stock Symbol,Date,Closing Price,Market State,Return
MMM,2023-06-02,102.53,Bearish,0.0
MMM,2023-06-05,97.98,Bearish,-4.437725543743292
MMM,2023-06-06,98.29,Bearish,0.3163911002245379
MMM,2023-06-07,101.0,Bearish,2.7571472174178386
...

Our file contains 68,558 lines, header included.

Correlation Matrix

To craft our stock market heatmap, we’ll first import the data into a Pandas DataFrame and construct a pivot table. This table will serve as the foundation for our correlation matrix.

df = pd.read_csv("sp500.csv")
print(df)

pivoted_df = df.pivot(index="Date", columns="Stock Symbol", values="Closing Price")
print(pivoted_df)

corr_matrix = pivoted_df.corr()
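To see what the pivot and the correlation step do, here is a minimal sketch on a few invented rows laid out like “sp500.csv”. The tickers AAA, BBB and CCC and their prices are made up for illustration.

```python
import pandas as pd

# Invented long-format rows mimicking the layout of sp500.csv.
long_df = pd.DataFrame({
    "Stock Symbol": ["AAA"] * 4 + ["BBB"] * 4 + ["CCC"] * 4,
    "Date": ["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"] * 3,
    "Closing Price": [10.0, 11.0, 12.0, 13.0,   # AAA rises
                      20.0, 22.0, 24.0, 26.0,   # BBB rises in step with AAA
                      9.0, 8.0, 7.0, 6.0],      # CCC falls
})

# One row per date, one column per ticker.
wide = long_df.pivot(index="Date", columns="Stock Symbol", values="Closing Price")

# Pairwise Pearson correlation between the ticker columns.
corr = wide.corr()
print(corr.round(2))
```

AAA and BBB move together, so their correlation is 1, while CCC moves in the opposite direction and correlates at -1 with both. The real matrix has the same shape, just with one row and one column per S&P 500 ticker.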

Stock Market Heatmaps

Before proceeding, a word of caution: a correlation matrix of this magnitude will produce a rather unwieldy plot. Straight “out of the box”, it’ll likely be challenging to interpret. Nonetheless, it’ll serve as our baseline. We’ll start by creating a basic stock market heatmap. Ensure you’ve installed the “seaborn” Python library beforehand if it’s not already in your toolkit.

import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(16, 16))
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
plt.title("Heatmap of Stock Price Correlations")
# Apply the layout before saving so the saved image matches what is shown
plt.tight_layout()
plt.savefig("heatmap-full.png")
plt.show()

There are several strategies to enhance the heatmap’s readability. A primary approach involves setting a correlation threshold, such as 0.8, though experimentation may be necessary to discover the optimal balance.

threshold = 0.8
# Keep only strong correlations; everything else (including the diagonal) becomes NaN
high_corr = corr_matrix[(corr_matrix >= threshold) & (corr_matrix < 1)]

plt.figure(figsize=(16, 16))
sns.heatmap(high_corr, annot=True, cmap="coolwarm")
plt.title("Heatmap of Stock Price Correlations")
plt.tight_layout()
plt.savefig("heatmap-0.8.png")
plt.show()

From the image, it’s challenging to discern, but there is a slight improvement. However, the enhancement isn’t significant enough to justify further exploration of this method. While adjusting the threshold might yield additional improvements, it’s advisable to consider the next option for better results.
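If you do want to squeeze a little more out of the threshold approach before moving on, one option is to drop every row and column that has no value left after filtering, which shrinks the plot to the tickers that actually have a strongly correlated partner. A minimal sketch on an invented 3×3 matrix (the tickers and values are made up):

```python
import pandas as pd

# Invented correlation matrix, for illustration only.
demo_corr = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.4],
     [0.2, 0.4, 1.0]],
    index=["AAA", "BBB", "CCC"],
    columns=["AAA", "BBB", "CCC"],
)

threshold = 0.8
high = demo_corr[(demo_corr >= threshold) & (demo_corr < 1)]

# Drop tickers that have no partner above the threshold.
trimmed = high.dropna(how="all", axis=0).dropna(how="all", axis=1)
print(trimmed)
```

Here CCC disappears because none of its correlations reach 0.8, leaving only the AAA/BBB pair to plot.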

Clustermap

A cluster map offers an alternative to the heat map and visually appears much more appealing.

# clustermap creates its own figure, so pass figsize directly instead of calling plt.figure()
g = sns.clustermap(corr_matrix, cmap="coolwarm", figsize=(16, 16))
g.fig.suptitle("Cluster Map of Stock Price Correlations")
g.savefig("clustermap.png")
plt.show()

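This displays markets with high correlation in warm red tones and those with lower correlation in cooler blue hues. The cluster map also reorders the rows and columns using hierarchical clustering, so tickers that behave alike end up next to each other.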

Top n Stock Market Heatmap

An alternative heatmap approach is to plot only the 50 most strongly correlated pairs of tickers.

def get_top_n_correlations(corr_matrix, n):
    # Flatten the matrix into (ticker, ticker) -> |correlation| entries
    c = corr_matrix.abs().stack()
    # Drop self-correlations, then remove the mirrored (A, B) / (B, A) values
    return c[c < 1].sort_values(ascending=False).drop_duplicates().head(n)

top_n = get_top_n_correlations(corr_matrix, 50)

top_pairs = pd.DataFrame(list(top_n.index), columns=["Stock 1", "Stock 2"])
top_pairs["Correlation"] = top_n.values

top_pairs_pivot = top_pairs.pivot(
    index="Stock 1", columns="Stock 2", values="Correlation"
)

plt.figure(figsize=(16, 16))
sns.heatmap(top_pairs_pivot, annot=True, cmap="coolwarm")
plt.title("Heatmap of the Top 50 Stock Price Correlations")
plt.tight_layout()
plt.savefig("top_n_heatmap.png")
plt.show()

This representation proves to be the most straightforward to read and interpret.
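One caveat with the function above: `drop_duplicates()` removes entries by value, so it relies on each mirrored pair having exactly the same correlation and would also discard two different pairs that happen to share a value. A possible alternative, sketched below under the assumption that the matrix is symmetric, is to keep only the entries above the diagonal. The name `top_pairs_upper_triangle` and the demo matrix are made up for illustration.

```python
import numpy as np
import pandas as pd

def top_pairs_upper_triangle(corr_matrix, n):
    """Return the n largest absolute correlations, one row per unordered pair."""
    # Boolean mask that is True strictly above the diagonal.
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    pairs = (
        corr_matrix.abs()
        .where(mask)  # entries on or below the diagonal become NaN
        .rename_axis(index="Stock 1", columns="Stock 2")
        .stack()
        .dropna()     # explicit, in case stack() keeps the NaN entries
        .sort_values(ascending=False)
        .head(n)
    )
    return pairs.rename("Correlation").reset_index()

# Invented correlation matrix, for illustration only.
demo_corr = pd.DataFrame(
    [[1.0, 0.9, -0.2],
     [0.9, 1.0, 0.5],
     [-0.2, 0.5, 1.0]],
    index=["AAA", "BBB", "CCC"],
    columns=["AAA", "BBB", "CCC"],
)

top2 = top_pairs_upper_triangle(demo_corr, 2)
print(top2)
```

The result already has the “Stock 1”, “Stock 2” and “Correlation” columns, so it can be pivoted and plotted in the same way as `top_pairs`.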

Stock Market Heatmap by Sector

When we delve into identifying correlations across numerous stock market tickers, the process can quickly become overwhelming. Our favoured strategy involves organising the markets by Sector to analyse correlations within each Sector more effectively.

On the Wikipedia page we mentioned earlier, we utilised the “Symbol” to compile a list of stock market tickers in the S&P 500. You might have observed two additional columns, “GICS Sector” and “GICS Sub-Industry”. The Python code below will generate a mapping dictionary to associate a “Symbol” with its “Sector”.

def sp500_sector_mapping():
    url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
    html = requests.get(url).text
    data = StringIO(html)
    df = pd.read_html(data, header=0)[0]
    ticker_meta = df[[
        "Symbol",
        "GICS Sector"
        # "GICS Sub-Industry"
    ]].to_dict(orient='records')

    mapping_dict = {}
    for ticker in ticker_meta:
        mapping_dict[ticker["Symbol"]] = ticker["GICS Sector"]

    return mapping_dict

sector_mapping = sp500_sector_mapping()

Using this mapping, we can then average the correlations within and between sectors and present the result as a stock market heatmap.

mapped_sectors = corr_matrix.columns.map(sector_mapping)
# Average the rows by sector, then transpose and average the other axis the same
# way (groupby(..., axis=1) is deprecated in recent pandas versions)
mean_corr_by_sector = (
    corr_matrix.groupby(mapped_sectors).mean().T.groupby(mapped_sectors).mean().T
)

plt.figure(figsize=(16, 16))
sns.heatmap(mean_corr_by_sector, annot=True, cmap="coolwarm")
plt.title("Heatmap of Stock Price Sector Correlations")
plt.tight_layout()
plt.savefig("heatmap_sector.png")
plt.show()
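To check what the sector averaging actually computes, here is a minimal sketch on an invented 3×3 correlation matrix with two made-up sectors. The tickers, values and sector labels are assumptions for illustration only.

```python
import pandas as pd

# Invented correlation matrix and sector labels, for illustration only.
demo_corr = pd.DataFrame(
    [[1.0, 0.8, 0.2],
     [0.8, 1.0, 0.4],
     [0.2, 0.4, 1.0]],
    index=["AAA", "BBB", "CCC"],
    columns=["AAA", "BBB", "CCC"],
)
demo_sectors = {"AAA": "Tech", "BBB": "Tech", "CCC": "Energy"}

labels = demo_corr.columns.map(demo_sectors)

# Average rows by sector, then transpose and average the other axis the same way.
by_sector = demo_corr.groupby(labels).mean().T.groupby(labels).mean().T
print(by_sector)
```

The Tech/Energy cell is the mean of 0.2 and 0.4, i.e. 0.3. Note that the Tech/Tech cell comes out at 0.9 rather than 0.8, because the average includes each stock’s correlation of 1 with itself; the diagonal of the sector heatmap is therefore somewhat inflated, more so for sectors with few tickers.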

Conclusion

We’ve presented various methods to illustrate stock market heatmaps in Python, including an introduction to website data scraping and cluster maps.

For those eager to delve deeper into such insightful analyses and broaden their understanding of Python’s application in financial markets, we invite you to follow our account. Stay tuned for more valuable articles that aim to enhance your data science skills and market analysis capabilities.

Written by EODHD APIs

eodhd.com — stock market fundamental and historical prices API for stocks, ETFs, mutual funds and bonds all over the world.