Exploring Volatility Patterns Using GARCH Analysis
Analyze volatility patterns in financial data, enhancing predictions and risk management through advanced time series techniques.
This article delves into the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, a vital tool for identifying volatility clustering in time series data. Using NVIDIA stock returns as an example, we examine the role of volatility — the pulse of financial markets — in enhancing forecasting accuracy. The discussion begins with an introduction to the GARCH model’s structure and principles. We then walk through its implementation, from acquiring data to interpreting outcomes, offering a comprehensive guide to understanding and applying GARCH to dynamic financial time series analysis.
Data Preparation for GARCH Analysis
To retrieve the necessary data through the EODHD APIs, start by downloading and importing the official EODHD Python package. Authenticate using your personal API key, which is best stored in an environment variable for security. For this analysis, we’ll utilize the Historical Data API, a feature included in various subscription plans (time depth ranges from 1 to 30 years, depending on the plan).
# Install the official EODHD Python package (run once in a shell):
# pip install eodhd

import os

from eodhd import APIClient

# Load the API key from an environment variable for security
api_key = os.environ['API_EOD']
api = APIClient(api_key)
This setup allows seamless access to the historical data needed for implementing the GARCH model.
Understanding the Basics of the GARCH Model
The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, developed by Tim Bollerslev in 1986, is a powerful tool for analyzing financial time series that exhibit heteroskedasticity — variances that change over time. It builds upon the foundational Autoregressive Conditional Heteroskedasticity (ARCH) model introduced by Robert Engle in 1982.
GARCH(p, q) Model Equation
The conditional variance at time $t$ is modeled as:

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$$

Expanded, it appears as:

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_p \varepsilon_{t-p}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_q \sigma_{t-q}^2$$

Component breakdown:
- $\sigma_t^2$: the conditional variance at time $t$
- $\omega > 0$: a constant term
- $\alpha_i$: coefficients on the past squared residuals $\varepsilon_{t-i}^2$ (the ARCH terms)
- $\beta_j$: coefficients on the past conditional variances $\sigma_{t-j}^2$ (the GARCH terms)

Note that the ordering of $p$ and $q$ here follows the arch Python package used later in this article, where $p$ counts the ARCH terms and $q$ the GARCH terms.
The GARCH model suggests that the conditional variance at any given moment is determined by past squared residuals and previous conditional variances.
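As a quick illustration with hypothetical parameter values (chosen for this example, not estimated from any dataset), one step of the GARCH(1,1) recursion can be computed as follows:

# Hypothetical GARCH(1,1) parameters, for illustration only
omega, alpha, beta = 1e-5, 0.10, 0.85

eps_prev = 0.02        # yesterday's shock: a 2% surprise return
sigma2_prev = 0.0004   # yesterday's conditional variance (2% volatility)

# One step of the GARCH(1,1) recursion
sigma2_today = omega + alpha * eps_prev**2 + beta * sigma2_prev
print(sigma2_today)        # 0.00039
print(sigma2_today**0.5)   # ~0.0197, a conditional volatility of about 2%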
To analyze market volatility fluctuations, researchers use maximum likelihood estimation (MLE) to determine GARCH model parameters. This method identifies the parameters that most accurately align the model with observed market data, often assuming market returns follow a normal distribution. When estimated correctly, these parameters make the GARCH model a powerful tool for forecasting future market volatility, enabling market participants to anticipate risk changes and make more informed decisions.
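For reference, under the normality assumption the log-likelihood that MLE maximizes is the standard Gaussian one:

$$\ln L(\omega, \alpha, \beta) = -\frac{1}{2} \sum_{t=1}^{T} \left( \ln(2\pi) + \ln \sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2} \right)$$

where each $\sigma_t^2$ is computed recursively from the GARCH equation above.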
Application
Step 1 — Checking for ARMA Structure
Deciding whether to directly implement a GARCH model or first test for an ARMA model depends on the data’s specific traits and your analysis goals. Testing for an ARMA model might be advantageous if the time series exhibits clear patterns in the mean that should be captured. For a more detailed explanation, refer to our previous article on this topic.
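As a minimal sketch of this check (assuming a stationary series stored in a variable named series, which is an illustrative placeholder), a candidate ARMA fit and its coefficient significance can be inspected with statsmodels:

from statsmodels.tsa.arima.model import ARIMA

# Fit a candidate ARMA(1,1) mean model; the order here is purely illustrative
candidate = ARIMA(series, order=(1, 0, 1)).fit()
print(candidate.summary())

If the ARMA coefficients turn out insignificant, the mean model can be skipped and the GARCH analysis applied to the series directly.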
Step 2 — Testing for ARCH Effects
Depending on whether an ARMA model is needed, you can either analyze the residuals of the ARMA regression or work directly with the data, ensuring it is stationary.
- With ARMA residuals: Square the residuals to test for conditional heteroskedasticity, which indicates time-varying volatility in the series.
- Without ARMA residuals: Simply square the data and proceed.
Use the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) for initial insights, as sketched below. Note that in ARCH effect testing, the orders of p and q are reversed compared to ARMA modeling: here, p is identified using the ACF and q with the PACF. While not the most rigorous test, this method helps determine the lag order for each term. For financial data, a simple GARCH(1,1) model (with p=1 and q=1) often suffices.
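A minimal plotting sketch, assuming squared_resid holds the squared ARMA residuals (or the squared stationary series):

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Inspect the autocorrelation structure of the squared series
plot_acf(squared_resid, lags=20)
plot_pacf(squared_resid, lags=20)
plt.show()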
ARCH-LM Test
A more robust method to detect ARCH effects is the ARCH-LM Test, which can be implemented in Python as shown below:
from arch import arch_model

# data2: squared residuals from the ARMA fit, or the squared series itself
model = arch_model(data2)
result = model.fit()
arch_test = result.arch_lm_test(lags=12)
print(arch_test)
This approach offers a more reliable way to identify ARCH effects. To illustrate the utility of the ARCH-LM test, we’ll apply it to a real-world dataset in the following example.
Step 3 — Implementing the GARCH Model
After identifying ARCH effects and determining the appropriate lag orders, the next step is to build the GARCH model. To proceed, define the mean as zero, constant (allowing it to be estimated from the data), or autoregressive if the series follows an AR process. Then, specify the orders of p and q to construct the Generalized Autoregressive Conditional Heteroskedasticity model.
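As a minimal sketch of the three mean specifications using the arch package (assuming a stationary returns series named returns):

from arch import arch_model

# Zero mean: treat the series as pure shocks and model only the volatility
zero_mean = arch_model(returns, mean='Zero', vol='Garch', p=1, q=1)

# Constant mean: estimate a single mean level from the data
const_mean = arch_model(returns, mean='Constant', vol='Garch', p=1, q=1)

# Autoregressive mean: use when the series itself follows an AR process
ar_mean = arch_model(returns, mean='AR', lags=1, vol='Garch', p=1, q=1)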
Step 4 — Analyzing the Results
The final step in implementing the GARCH model involves examining the residuals from the regression to determine if the model has successfully captured all the relevant information in the time series, leaving only white noise. This involves repeating the diagnostic checks outlined in Step 2. If the residuals are white noise, the ACF and PACF plots should exhibit no significant lags.
To formally test for the absence of autocorrelation, the Ljung-Box test can be utilized. Below is the Python function to perform this test:
from statsmodels.stats.diagnostic import acorr_ljungbox

def Ljung(series):
    # Test lags 1 through 20 and report the test statistic and p-value at each lag
    result_df = acorr_ljungbox(series, lags=list(range(1, 21)), return_df=True)
    print(result_df)
This function evaluates the residuals for autocorrelation and provides a detailed summary of the results.
GARCH Example: NVIDIA Stock
This analysis builds upon the previous example provided in the ARIMA Analysis on Stock Returns article.
Data Retrieval
To conduct the analysis, we utilize the official EODHD Python API library to gather historical stock data for NVIDIA, dating back to March 2014. Stock data is retrieved using the get_historical_data function:
start = "2014-03-01"
prices = api.get_historical_data("NVDA", interval="d", iso8601_start=start,
iso8601_end="2024-03-01")["adjusted_close"].astype(float)
prices.name = "NVDA"
returns = np.log(prices[1:].values) - np.log(prices[0:-1].values)
Analyzing Residuals
Once the AR(1) model is estimated, the next step is to test for ARCH effects on the squared residuals from the regression. It’s essential to identify p using the ACF and q using the PACF, unlike the approach used in ARMA models. In financial data, ARCH models often require a large number of significant lags for p, highlighting the importance of the “Generalized” component in the GARCH model.
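A minimal sketch of this step, reusing the returns computed above and the AR(1) specification from the earlier ARIMA article (the statsmodels fit below is an assumed implementation of that model):

from statsmodels.tsa.arima.model import ARIMA

# Fit the AR(1) mean model and square its residuals for ARCH testing
ar1 = ARIMA(returns, order=(1, 0, 0)).fit()
squared_resid = ar1.resid ** 2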
ACF and PACF Analysis
Analyzing the ACF and PACF of the squared residuals:
- ACF: displays geometric decay, with relevant significance up to lag 4, indicating autocorrelation in the squared residuals.
- PACF: shows significant lags up to lag 3.
Since we are working with financial returns, a GARCH(1,1) model is commonly chosen as it strikes a balance between parsimony and model fit.
ARCH-LM Test
To perform a more rigorous test, we apply the ARCH-LM code shown earlier to the squared residuals.
The null hypothesis states that the residuals are homoskedastic, meaning they have constant variance. In contrast, the alternative hypothesis suggests that the residuals exhibit volatility clustering, indicating heteroskedasticity.
With a p-value of effectively zero, the null hypothesis is rejected at any significance level (the test statistic follows a Chi-squared distribution, making this a one-tailed test). This confirms that the residuals display ARCH effects, consistent with the ACF and PACF analysis.
AR(1)-GARCH(1,1) Model
To construct a model from our observations, use the following code snippet:
from arch import arch_model
import matplotlib.pyplot as plt

# AR(1) mean with GARCH(1,1) volatility
ARCH_model = arch_model(returns, mean='AR', lags=1, vol='Garch', p=1, q=1)
result = ARCH_model.fit()
print(result.summary())

# Plot the standardized residuals and the conditional volatility
result.plot()
plt.show()
Here, the actual returns are used as input, no longer the squared residuals, since the objective is now to model the observations rather than to test for ARCH effects. Running the regression yields the estimation summary discussed below.
Regression Insights
From the output, we observe that all coefficients are statistically significant, leading to an AR(1)-GARCH(1,1) model of the form:

$$r_t = \mu + \phi\, r_{t-1} + \varepsilon_t$$

$$\sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2$$

with the estimated values of $\mu$, $\phi$, $\omega$, $\alpha$, and $\beta$ taken from the regression summary.
Analyzing Results
In addition to examining significant coefficients (and a high R-squared in the case of ARMA models), it is essential to analyze the residuals when modeling a time series. This ensures that all relevant information has been captured and that the residuals contain no additional significant insights.
The plot of the residuals from the AR(1) model reveals clear volatility clustering at a glance: periods of high volatility tend to follow other high-volatility periods, while periods of low volatility exhibit a similar persistence. The standardized residuals from the AR(1)-GARCH(1,1) model, shown by result.plot(), successfully capture this clustering, resulting in far more homoskedastic residuals. To complete our analysis, we apply the Ljung-Box test to verify that the residuals from the final model are white noise.
The Ljung-Box test is employed to evaluate whether there is significant autocorrelation in a time series’ residuals, an essential step in verifying the adequacy of a fitted model. The null hypothesis posits no autocorrelation, implying that the residuals are independently and identically distributed (i.i.d.). Conversely, the alternative hypothesis suggests the presence of autocorrelation, indicating that the model may require further refinement to address remaining temporal dependencies in the data.
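Applying the Ljung function defined earlier to the standardized residuals of the fitted model (a minimal usage sketch, assuming result from the code above):

import numpy as np

# Drop the leading NaN created by the AR(1) lag before testing
std_resid = result.std_resid
Ljung(std_resid[~np.isnan(std_resid)])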
In the test results, all p-values exceed common significance levels, meaning we fail to reject the null hypothesis. This outcome indicates that the residuals behave as white noise, signifying a well-fitted model.
Final Thoughts
This article has outlined the key steps to applying the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model for analyzing and modeling volatility dynamics in financial time series data. The GARCH model effectively captures time-varying volatility in financial returns by assessing the influence of past squared errors on current volatility. This approach enhances the understanding and forecasting of volatility clustering, a critical element in risk management and financial modeling.