The previous notebook showed deteriorating performance in the example pair GLD-GDX over time. In this notebook we aim to mitigate the problem of deteriorating performance by exploring a more dynamic approach to identifying and selecting pairs.
We follow a 4-step process:

1. Filter the universe of ETFs to liquid securities.
2. Form all possible pairs and test each pair for cointegration.
3. Run in-sample backtests on the cointegrating pairs and select the best performers by Sharpe ratio.
4. Run an out-of-sample backtest on the portfolio of best pairs.
Setting the following variables appropriately will allow the remainder of the notebook to be adapted to different universes, liquidity filters, or date ranges.
DB = "usstock-1d"
UNIVERSE = "us-etf"
MIN_DOLLAR_VOLUME = 80e6 # $80M USD
COINTEGRATION_CONFIDENCE_LEVEL = 90 # require cointegration at 90%, 95%, or 99% confidence
COINTEGRATION_START_DATE = "2011-01-01" # cointegration test starts here...
COINTEGRATION_END_DATE = "2011-12-31" # ...and ends here
IN_SAMPLE_END_DATE = "2015-12-31" # in-sample backtest starts at the cointegration end date and ends here
First we filter the universe of ETFs to include only securities with average dollar volume of at least $80M USD during the cointegration test period.
from quantrocket import get_prices
prices = get_prices(DB, universes=UNIVERSE, start_date=COINTEGRATION_START_DATE, end_date=COINTEGRATION_END_DATE, fields=["Close", "Volume"])
closes = prices.loc["Close"]
volumes = prices.loc["Volume"]
dollar_volumes = (closes * volumes).mean()
adequate_dollar_volumes = dollar_volumes[dollar_volumes >= MIN_DOLLAR_VOLUME]
print(f"{len(adequate_dollar_volumes.index)} of {len(closes.columns)} have dollar volume > {MIN_DOLLAR_VOLUME}")
90 of 1473 have dollar volume > 80000000.0
90 ETFs meet our threshold.
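As a quick sanity check (an optional inspection step, not required for what follows), we could peek at the most liquid ETFs that passed the filter:

# Show the five highest average-dollar-volume survivors
adequate_dollar_volumes.sort_values(ascending=False).head()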
Next, we combine the liquid ETFs into all possible pairs:
import itertools
all_pairs = list(itertools.combinations(adequate_dollar_volumes.index, 2))
print(f"formed {len(all_pairs)} total pairs")
formed 4005 total pairs
This results in 4,005 combinations: with 90 liquid ETFs, itertools.combinations yields "90 choose 2" = (90 × 89) / 2 = 4,005 unique pairs.
Then we use the coint_johansen function from the statsmodels library to identify pairs that cointegrate with at least 90% confidence:
In the get_hedge_ratio method of the Moonshot strategy, we used the Johansen test to obtain hedge ratios but ignored the test statistics and critical values (that is, we don't actually test for cointegration in the pairs backtest itself). Here we do the opposite: we don't need the hedge ratios, but we check the test statistics against the critical values to determine whether each pair cointegrates.
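For reference, here is a minimal sketch (not part of this notebook's pipeline) of how the same Johansen result object can also yield hedge ratios, as get_hedge_ratio does in the Moonshot strategy: the eigenvector associated with the largest eigenvalue gives the cointegrating weights. The normalization step is our own illustrative choice.

from statsmodels.tsa.vector_ar.vecm import coint_johansen

def sketch_hedge_ratio(pair_closes):
    result = coint_johansen(pair_closes, 0, 1)
    # evec columns are the cointegrating eigenvectors; the first column
    # corresponds to the largest eigenvalue
    weights = result.evec[:, 0]
    # normalize so the first leg has weight 1; the second element is then
    # the hedge ratio for the second leg
    return weights / weights[0]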
from statsmodels.tsa.vector_ar.vecm import coint_johansen
import numpy as np
from IPython.display import clear_output
# the 90%, 95%, and 99% confidence levels for the trace statistic and maximum
# eigenvalue statistic are stored in the first, second, and third columns of
# cvt and cvm, respectively
confidence_level_cols = {
    90: 0,
    95: 1,
    99: 2
}
confidence_level_col = confidence_level_cols[COINTEGRATION_CONFIDENCE_LEVEL]

cointegrating_pairs = []

for i, (sid_1, sid_2) in enumerate(all_pairs):

    # Display progress
    clear_output(wait=True)
    print(f"Running Johansen test on pair {i} of {len(all_pairs)}")

    pair_closes = closes[[sid_1, sid_2]].dropna()

    # Skip pairs with fewer than 90 non-null observations
    if len(pair_closes) < 90:
        continue

    # The second and third parameters indicate a constant term and a lag of 1.
    # See Chan, Algorithmic Trading, chapter 2.
    result = coint_johansen(pair_closes, 0, 1)

    trace_crit_value = result.cvt[:, confidence_level_col]
    eigen_crit_value = result.cvm[:, confidence_level_col]

    # The trace statistic and maximum eigenvalue statistic are stored in lr1
    # and lr2; check whether they exceed the critical values at the chosen
    # confidence level
    if np.all(result.lr1 >= trace_crit_value) and np.all(result.lr2 >= eigen_crit_value):
        cointegrating_pairs.append(dict(
            sid_1=sid_1,
            sid_2=sid_2
        ))

clear_output()
We find there are 81 cointegrating pairs:
len(cointegrating_pairs)
81
Having identified all cointegrating pairs, the next step is to run an in-sample backtest on each cointegrating pair. The in-sample backtest period is subsequent to the cointegration test period, but we still consider it in-sample because we will use the performance results from the in-sample backtests to select a portfolio of pairs for out-of-sample testing.
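To make these windows concrete, here is a quick sketch that prints the timeline implied by the date variables defined at the top of the notebook:

print(f"cointegration test:     {COINTEGRATION_START_DATE} to {COINTEGRATION_END_DATE}")
print(f"in-sample backtests:    {COINTEGRATION_END_DATE} to {IN_SAMPLE_END_DATE}")
print(f"out-of-sample backtest: {IN_SAMPLE_END_DATE} onward")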
First, we download symbols and names from the securities master database to help us know what we're testing:
from quantrocket.master import download_master_file
import io
import pandas as pd
f = io.StringIO()
download_master_file(f, sids=list(dollar_volumes.index), fields=["Symbol","Name"])
securities = pd.read_csv(f, index_col="Sid")
# Convert to dict of {<sid>: {Symbol: <symbol>, Name: <name>}}
securities = securities.to_dict(orient="index")
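Each entry maps a sid to a dict of symbol and name; for example, looking up XLK's sid (taken from the results shown later):

securities["FIBBG000BJ7007"]  # {'Symbol': 'XLK', 'Name': 'TECHNOLOGY SELECT SECT SPDR'}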
To run backtests on so many pairs, we take advantage of Moonshot's ability to set strategy parameters dynamically using the params argument. We use Moonchart to calculate the Sharpe ratio of each pair strategy:
from quantrocket.moonshot import backtest
from moonchart import DailyPerformance
all_results = []

for i, pair in enumerate(cointegrating_pairs):

    sid_1 = pair["sid_1"]
    sid_2 = pair["sid_2"]
    security_1 = securities[sid_1]
    security_2 = securities[sid_2]
    symbol_1 = security_1["Symbol"]
    symbol_2 = security_2["Symbol"]
    name_1 = security_1["Name"]
    name_2 = security_2["Name"]

    # Display progress
    clear_output(wait=True)
    print(f"Backtesting pair {i+1} of {len(cointegrating_pairs)}: {symbol_1}/{symbol_2} ({name_1} and {name_2})")

    f = io.StringIO()

    # Run backtest
    backtest("pairs", start_date=COINTEGRATION_END_DATE,
             end_date=IN_SAMPLE_END_DATE,
             params={
                 "DB": DB,
                 "SIDS": [sid_1, sid_2]},
             filepath_or_buffer=f)

    # Get Sharpe and CAGR
    perf = DailyPerformance.from_moonshot_csv(f)
    sharpe = perf.sharpe.iloc[0]
    cagr = perf.cagr.iloc[0]

    all_results.append({
        "sid_1": sid_1,
        "sid_2": sid_2,
        "symbol_1": symbol_1,
        "symbol_2": symbol_2,
        "name_1": name_1,
        "name_2": name_2,
        "sharpe": sharpe,
        "cagr": cagr,
    })

clear_output()

results = pd.DataFrame(all_results)
We sort by Sharpe ratio and show the 5 best performers:
results = results.sort_values("sharpe", ascending=False)
best_pairs = results.head(5)
best_pairs
| | sid_1 | sid_2 | symbol_1 | symbol_2 | name_1 | name_2 | sharpe | cagr |
|---|---|---|---|---|---|---|---|---|
| 31 | FIBBG000BJ7007 | FIBBG000CSVPZ6 | XLK | UCO | TECHNOLOGY SELECT SECT SPDR | PROSHARES ULTRA BLOOMBERG CR | 1.087876 | 0.358299 |
| 76 | FIBBG000PLNQN7 | FIBBG000PT7GJ5 | GDX | QID | VANECK GOLD MINERS | PROSHARES ULTRASHORT QQQ | 0.784192 | 0.184588 |
| 38 | FIBBG000BK42M9 | FIBBG000KJ4073 | EWW | MOO | ISHARES MSCI MEXICO ETF | VANECK AGRIBUSINESS | 0.760353 | 0.075929 |
| 34 | FIBBG000BJ7007 | FIBBG000PLNQN7 | XLK | GDX | TECHNOLOGY SELECT SECT SPDR | VANECK GOLD MINERS | 0.746456 | 0.123332 |
| 36 | FIBBG000BJ7007 | FIBBG000QXGKF0 | XLK | SKF | TECHNOLOGY SELECT SECT SPDR | PROSHARES ULTSHRT FINANCIALS | 0.687180 | 0.107301 |
Having found the 5 best performing pairs, we create a Moonshot strategy for each pair and run an out-of-sample backtest on the portfolio of pairs.
To avoid lots of typing, we use the code below to print out the Moonshot subclasses, which we can then copy and paste into pairs.py:
# Save the strategy codes to use in the subsequent backtest
best_pairs_codes = []

# Print the subclass definitions
for i, pair in best_pairs.iterrows():
    strategy_code = f"pair-{pair.symbol_1.lower()}-{pair.symbol_2.lower()}"
    subclass_code = f"""
class {pair.symbol_1}_{pair.symbol_2}_Pair(PairsStrategy):

    CODE = "{strategy_code}"
    DB = "{DB}"
    SIDS = [
        "{pair.sid_1}", # {pair.symbol_1}
        "{pair.sid_2}" # {pair.symbol_2}
    ]"""
    print(subclass_code)
    best_pairs_codes.append(strategy_code)
class XLK_UCO_Pair(PairsStrategy):

    CODE = "pair-xlk-uco"
    DB = "usstock-1d"
    SIDS = [
        "FIBBG000BJ7007", # XLK
        "FIBBG000CSVPZ6" # UCO
    ]

class GDX_QID_Pair(PairsStrategy):

    CODE = "pair-gdx-qid"
    DB = "usstock-1d"
    SIDS = [
        "FIBBG000PLNQN7", # GDX
        "FIBBG000PT7GJ5" # QID
    ]

class EWW_MOO_Pair(PairsStrategy):

    CODE = "pair-eww-moo"
    DB = "usstock-1d"
    SIDS = [
        "FIBBG000BK42M9", # EWW
        "FIBBG000KJ4073" # MOO
    ]

class XLK_GDX_Pair(PairsStrategy):

    CODE = "pair-xlk-gdx"
    DB = "usstock-1d"
    SIDS = [
        "FIBBG000BJ7007", # XLK
        "FIBBG000PLNQN7" # GDX
    ]

class XLK_SKF_Pair(PairsStrategy):

    CODE = "pair-xlk-skf"
    DB = "usstock-1d"
    SIDS = [
        "FIBBG000BJ7007", # XLK
        "FIBBG000QXGKF0" # SKF
    ]
Having copied the above code into pairs.py, we are ready to run the out-of-sample backtest:
backtest(best_pairs_codes, start_date=IN_SAMPLE_END_DATE, filepath_or_buffer="best_pairs_results.csv")
The tear sheet shows good out-of-sample performance in the first two years, followed by deteriorating performance, perhaps indicating that we need to re-run the cointegration tests and in-sample backtests on a walk-forward basis and update our portfolio of best pairs every year or two.
from moonchart import Tearsheet
Tearsheet.from_moonshot_csv("best_pairs_results.csv")
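As a rough illustration of that refresh schedule, the skeleton below sketches a yearly walk-forward loop. It is hypothetical: find_cointegrating_pairs and rank_pairs_by_sharpe stand in for the cointegration-test and in-sample backtest steps above, and the yearly cadence and two-year in-sample window are assumptions, not recommendations.

for year in range(2011, 2019):  # assumption: yearly refresh cadence
    coint_start = f"{year}-01-01"
    coint_end = f"{year}-12-31"
    in_sample_end = f"{year + 2}-12-31"  # assumption: two-year in-sample window
    # hypothetical wrappers around the cointegration-test loop and the
    # in-sample backtest loop shown earlier in this notebook
    pairs = find_cointegrating_pairs(coint_start, coint_end)
    best = rank_pairs_by_sharpe(pairs, coint_end, in_sample_end).head(5)
    # trade `best` out of sample until the next refresh date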