IMPROVEMENT OF STATISTICAL ARBITRAGE TRADING MODELS UNDER THE ADAPTIVE MARKETS HYPOTHESIS: THE CASE OF THE RUSSIAN AND ITALIAN BOND FUTURES MARKETS
Ivan Proskuryakov
Abstract
In this study we applied statistical arbitrage to the government bond futures market. We developed a statistical arbitrage trading model, PR, and optimized its parameters with the Differential Evolution method for different pairs of Italian BTP and Russian OFZ government bond futures. Our scoring system showed that the PR trading model dominates the benchmark model of Quinn et al. The proposed trading model was developed with the Adaptive Markets Hypothesis (AMH) in mind, and the results are consistent with the AMH.
Keywords: Fixed income arbitrage, pairs trading, statistical arbitrage, bond futures, Adaptive Markets Hypothesis
Arbitrage is a profitable line of business for hedge funds worldwide. There are two types of arbitrage in financial markets: pure arbitrage and statistical arbitrage. In theory, pure arbitrage extracts a riskless profit: an opportunity arises when two fungible assets trade at different prices.
Pure arbitrage opportunities are rare; statistical arbitrage, however, offers many attractive opportunities to an arbitrageur who can accept some risk. The idea is that the arbitrageur finds two assets whose prices are historically related, opens opposite positions in the pair when this relationship is temporarily disrupted, and realizes the profit when the relationship is restored.
Statistical arbitrage is a form of active portfolio management that aims to generate positive alpha systematically. However, under the Efficient Market Hypothesis (EMH), which was the prevailing financial theory for around fifty years, such an approach is futile, because it is impossible to beat an efficient market. [6]
This fundamental theory of financial markets was challenged in 2004 by Andrew Lo, who proposed the Adaptive Markets Hypothesis (AMH). [5] The AMH views a financial market as an ecosystem in which groups of investors and individual investors compete for scarce resources, i.e. profits, in a dynamically changing environment. The environment includes factors such as technology, demography, policy and the path of the market. In this way the AMH applies the principles of evolution (competition, adaptation and natural selection) to financial markets.
The evolutionary dynamics assumed by the AMH imply that market efficiency varies over time: the market is often inefficient, behavioral biases abound, and they can be profitably exploited. [4]
Our research is based on the following implications of the AMH:
1. Inefficiencies exist in the market, and abnormally profitable trading models are possible.
2. More complex trading models provide a competitive advantage and have a longer life cycle.
3. Innovation and adaptivity to changing market conditions are the key principles for engineering statistical arbitrage trading models with higher performance.
We created an improved statistical arbitrage trading model for the bond futures markets of Russia and Italy that outperformed the comparable trading model published in 2018 by Quinn et al. for the same type of financial instruments. [8] We obtained these results using our own scoring system for the comparative performance measurement of trading models.
Before measuring the performance of the statistical arbitrage trading models on the two markets, we conducted a comparative analysis of the applicability of arbitrage in each market, using macro-level parameters, each consisting of 1 to 8 indicators (Table 1).
Table 1 – Comparative analysis of two markets (November 2018)
Parameter | Indicator | Russia | Italy
Risk | Volatility of stock market, % | 25,77 | 28,73
Risk | Volatility of bond market, % | 4,27 | 5,12
Risk | Credit rating (S&P) | BBB- | BBB
Risk | Sovereign CDS, b.p. | 149,45 | 279,70
Risk | Mean corporate CDS, b.p. | 245,50 | 322,60
Risk | Volatility of national currency, % | 12,96 | 6,94
Risk | Inflation, % | 3,50 | 1,60
Risk | Deficit/surplus of gov. budget to GDP, % | -1,50 | -2,30
Liquidity | Bond market size, bn. USD | 464 | 2945
Liquidity | Stock market size, bn. USD | 623,42 | 587,31
Liquidity | Average gov. bond issue size, mn. USD | 999 | 3176
Liquidity | Monetization of economy, % | 43 | 91
Capital of potential investors | GDP per capita, USD | 10950 | 34349
Capital of potential investors | Quantity of dollar millionaires | 189500 | 274200
Quantity of arbitrage capital | Hedge funds, quantity | 48 | 294
Quantity of arbitrage capital | AUM of hedge funds, bn. USD | 3,43 | 3,60
Financial literacy of population | Investigation by S&P, % of adults | 38 | 37
Availability of financial instruments | Quantity of instruments’ types | 11 | 17
Source: Bloomberg, [2,3]
At first sight, the Italian market looks preferable to the Russian market for arbitrage. The Russian market outperforms the Italian market on 5 of the 8 indicators of the “Risk” parameter, and it has a small advantage on the “Financial literacy of population” parameter. However, the remaining 4 parameters (“Liquidity”, “Capital of potential investors”, “Quantity of arbitrage capital” and “Availability of financial instruments”) favor the Italian market.
Nevertheless, we acknowledge that the “Liquidity” parameter does not have a straightforward interpretation for an arbitrageur. On the one hand, high market liquidity reduces transaction costs when dealing with large orders. On the other hand, lower liquidity and less homogeneous liquidity across instruments usually increase statistical arbitrage opportunities, as we will see below.
We applied the trading models to daily data for the Russian and Italian markets from 2011 to 2018. For the Russian market we used pairs of government bond (OFZ) basket futures traded on the Moscow Exchange: the 4-year basket against the 6-year basket (pair 4-6), pair 6-10 and pair 2-15. Roll-adjusted series of futures prices were gathered from the official website of the Moscow Exchange [7]. For the Italian market we used BTP government bond basket futures: pairs 2-5, 5-10 and 2-10. For the 2-year and 10-year basket futures we took the BNP Paribas BTP 2Y Rolling Future Index and the BNP Paribas BTP 10Y Rolling Future Index as price proxies. For the 5-year basket futures we calculated our own roll-adjusted price index from the historical price series of the available futures. The Italian data were obtained from the Bloomberg Terminal.
We adjusted trading returns for the financial leverage inherent in futures contracts, using maintenance margins taken from the official websites of the exchanges and brokerage firms.
As a benchmark we took the trading model of Quinn et al. (2018), as it is, to our knowledge, the only academically published statistical arbitrage trading model for bond futures. [8]
Quinn’s model is based on the spread between the normalized prices of two different bond basket futures, i.e. the normalized price of futures 1 minus the normalized price of futures 2 (hereafter, the spread). The model computes the mean of this spread. The rolling window for calculating the normalized prices and the mean spread is 35 days. The rules of Quinn’s model are as follows:
Trigger levels. When the spread deviates upward (downward) from its mean by n percent [optimizable parameter], we short (long) one contract of futures 1 and long (short) one contract of futures 2.
The stop loss is set 30% below (above) the value of the simple spread (calculated from raw prices) at the time the positions are opened.
Positions are closed either when the spread reaches or crosses its mean or when the simple spread reaches the stop-loss level.
We called our statistical arbitrage trading model the Price Ratio model (PR). The PR model has four optimizable parameters. Its core indicator, the price ratio, is the ratio of the prices of the two futures traded as a pair. The other two indicators of the PR model are calculated from the price ratio: the simple moving average of the price ratio (SMA) over the last n_SMA days [the first optimizable parameter] and the standard deviation of the price ratio (SD) over the last n_SD days [the second optimizable parameter].
The rules of the PR trading model are as follows:
Short (long) the futures in the numerator of the price ratio and at the same time long (short) the futures in the denominator if the price ratio is higher (lower) than SMA plus (minus) SD*n [n is the third optimizable parameter].
Close all positions when the price ratio is inside the band, i.e. no higher than SMA plus SD*n and no lower than SMA minus SD*n.
The β coefficient of a linear regression of the numerator futures on the denominator futures with a lookback period of n_REG days [the fourth optimizable parameter] is used to calculate the position size of each leg of a pair trade.
The PR trading model is summarized in Table 2.
Table 2 – Specification of the PR trading model
Trigger | Instrument | Order | Size
P1/P2 < SMA(n_SMA) - SD(n_SD)*n | F1 | Buy | 1
P1/P2 < SMA(n_SMA) - SD(n_SD)*n | F2 | Sell | 1*β
P1/P2 > SMA(n_SMA) + SD(n_SD)*n | F1 | Sell | 1
P1/P2 > SMA(n_SMA) + SD(n_SD)*n | F2 | Buy | 1*β
SMA(n_SMA) - SD(n_SD)*n ≤ P1/P2 ≤ SMA(n_SMA) + SD(n_SD)*n | F1, F2 | Close all positions | All existing positions
Where:
P1 – price of the F1 futures;
P2 – price of the F2 futures;
SMA(n_SMA) – simple moving average of the price ratio P1/P2 over the last n_SMA days;
SD(n_SD) – standard deviation of the price ratio P1/P2 over the last n_SD days;
β – coefficient of P2 in a rolling OLS regression of P1 on P2 with a lookback period of n_REG days.
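The rules in Table 2 can be sketched in Python as follows. This is a minimal illustration under several assumptions that the text does not settle: window lengths are integers (the fitted values reported later are fractional, and how they map to whole days is not stated), each rolling window includes the current day, SD is the sample standard deviation, and the regression includes an intercept. The function names `ols_slope` and `pr_positions` are hypothetical.

```python
from statistics import mean, stdev


def ols_slope(y, x):
    """Slope of an OLS regression of y on x (with intercept).

    Raises ZeroDivisionError if x is constant over the window.
    """
    mx, my = mean(x), mean(y)
    var_x = sum((xi - mx) ** 2 for xi in x)
    cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cov_xy / var_x


def pr_positions(p1, p2, n_sma, n_sd, n, n_reg):
    """Target positions (F1, F2) for each day; (0.0, 0.0) means flat.

    Assumption: windows are integer day counts and include the current day.
    """
    ratio = [a / b for a, b in zip(p1, p2)]
    warmup = max(n_sma, n_sd, n_reg)
    positions = []
    for t in range(len(ratio)):
        if t + 1 < warmup:  # not enough history for every indicator yet
            positions.append((0.0, 0.0))
            continue
        sma = mean(ratio[t + 1 - n_sma:t + 1])
        sd = stdev(ratio[t + 1 - n_sd:t + 1])
        beta = ols_slope(p1[t + 1 - n_reg:t + 1], p2[t + 1 - n_reg:t + 1])
        if ratio[t] > sma + n * sd:
            positions.append((-1.0, beta))   # sell 1 F1, buy beta F2
        elif ratio[t] < sma - n * sd:
            positions.append((1.0, -beta))   # buy 1 F1, sell beta F2
        else:
            positions.append((0.0, 0.0))     # inside the band: close all
    return positions
```

The sketch returns target positions only; turning them into returns would additionally require the margin-based leverage adjustment described earlier, which is not modelled here.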
We back-tested the PR and Quinn’s trading models on daily prices of government bond futures in the Russian and Italian markets, using the first half of the data for parameter optimization (in-sample test) and the second half for the out-of-sample test.
The parameters of the PR trading model were optimized with the Differential Evolution (DE) method, maximizing the annualized Sharpe ratio. DE belongs to the class of evolutionary algorithms, which, like genetic algorithms, are based on the ideas of natural selection and genetics. Such algorithms are an intelligent application of random search to optimization tasks. [1] We chose DE because other methods of non-linear multidimensional optimization (Nelder-Mead, BFGS, etc.) ran more slowly and produced a lower Sharpe ratio.
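For readers unfamiliar with DE, the following is a minimal pure-Python sketch of the classic DE/rand/1/bin scheme. The paper does not state which DE variant or settings were used, so the variant, the population size, the mutation factor `f`, the crossover rate `cr` and the number of generations are all assumptions chosen for illustration. The routine minimizes, so maximizing a Sharpe ratio would mean passing its negative as the objective.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` over the box `bounds` with DE/rand/1/bin.

    All default settings are illustrative assumptions, not the paper's.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct donors, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for k in range(dim):
                if k == forced or rng.random() < cr:
                    value = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    lo, hi = bounds[k]
                    value = min(max(value, lo), hi)  # keep inside the box
                else:
                    value = pop[i][k]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

For the PR model the search space would be the four parameters (n_SMA, n_SD, n, n_REG) and the objective the negative in-sample annualized Sharpe ratio of the back test; the bounds used for each parameter are not given in the text.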
The parameters of Quinn’s trading model were optimized following the methodology of its authors: the training period (rolling window) for the mean spread and the stop-loss level were fixed (35 days and 30% respectively), whereas the trading model was tested with three trigger levels (10%, 15% and 20%), and the best one (again the one with the highest Sharpe ratio, for consistency) was chosen for the out-of-sample test.
The fitted parameters of the trading models are presented in Table 3 and Table 4.
Table 3 – Fitted parameters of the PR trading model
No. | Parameter | 4-6 (Russia) | 6-10 (Russia) | 2-15 (Russia) | 2-5 (Italy) | 5-10 (Italy) | 2-10 (Italy)
1 | n_SMA | 8,04 | 8,60 | 6,53 | 9,29 | 9,28 | 9,89
2 | n_SD | 22,31 | 13,45 | 29,59 | 23,04 | 29,61 | 9,74
3 | n | 0,13 | 0,30 | 0,58 | 1,69 | 0,20 | 0,17
4 | n_REG | 69,69 | 80,52 | 97,80 | 96,25 | 88,47 | 39,19
Table 4 – Fitted parameters of Quinn’s trading model
Market | Russia | Russia | Russia | Italy | Italy | Italy
Pair | “4-6” | “6-10” | “2-15” | “2-5” | “5-10” | “2-10”
Trigger, % | 15 | 15 | 10 | 15 | 20 | 20
We calculated several widely used performance measures. The out-of-sample values for the PR (our) and Quinn’s (benchmark) trading models are given in Table 5 and Table 6, respectively.
Table 5 – Out-of-sample performance of PR trading model
Market | Russia | Russia | Russia | Italy | Italy | Italy
Pair | “4-6” | “6-10” | “2-15” | “2-5” | “5-10” | “2-10”
Annualized return, % | 126 | 59 | 1 | 80 | 168 | 216
Annualized volatility, % | 37 | 44 | 51 | 79 | 92 | 301
Sharpe ratio | 3,44 | 1,32 | 0,02 | 1,02 | 1,84 | 0,72
Omega ratio (L=0%) | 2,11 | 1,36 | 1,005 | 2,4 | 1,82 | 1,19
Maximum drawdown, % | 13 | 19 | 50 | 25 | 25 | 160
Calmar ratio | 9,69 | 3,11 | 0,02 | 3,20 | 6,72 | 1,35
Skewness | 1,04 | 1,38 | -0,61 | 8,94 | 5,21 | 3,31
Excess kurtosis | 6,62 | 29,86 | 12,66 | 131,48 | 51,6 | 70,43
Skewness/kurtosis ratio | 0,108 | 0,042 | -0,039 | 0,066 | 0,095 | 0,045
Table 6 – Out-of-sample performance of Quinn’s trading model
Market | Russia | Russia | Russia | Italy | Italy | Italy
Pair | “4-6” | “6-10” | “2-15” | “2-5” | “5-10” | “2-10”
Annualized return, % | 17 | 66 | 10 | 2492 | 75 | -75
Annualized volatility, % | 33 | 37 | 37 | 2897 | 180 | 451
Sharpe ratio | 0,52 | 1,77 | 0,27 | 0,86 | 0,41 | -0,16
Omega ratio (L=0%) | 1,12 | 1,51 | 1,08 | 1,25 | 1,08 | 0,969
Maximum drawdown, % | 40 | 10 | 38 | 174 | 118 | 824
Calmar ratio | 0,43 | 6,49 | 0,26 | 14,31 | 0,63 | -0,09
Skewness | 0,12 | 3,82 | 1,01 | 3,64 | -0,076 | -0,049
Excess kurtosis | 10,81 | 47 | 17,43 | 33,97 | 2,70 | 3,10
Skewness/kurtosis ratio | 0,01 | 0,08 | 0,05 | 0,10 | -0,013 | -0,008
We consider the Sharpe ratio the most important risk-adjusted performance measure, as it allows the risk/return profile of a trading model to be evaluated on its own, without reference to any market index. The best result by out-of-sample Sharpe ratio (3,44) across all markets and trading models studied is achieved by the PR trading model on the 4-6 Russian OFZ futures pair (the cumulative return is shown in Chart 1).
Chart 1 – Cumulative return, OFZ4-6, out-of-sample, PR
Quinn’s model shows its best out-of-sample Sharpe ratio (1,77) on the 6-10 Russian OFZ futures pair (the cumulative return is shown in Chart 2). The best trading model by Sharpe ratio (1,84) in the Italian BTP bond futures market is PR on the 5-10 futures pair (the cumulative return is shown in Chart 3).
Chart 2 – Cumulative return, OFZ6-10, out-of-sample, Quinn's model
Chart 3 – Cumulative return, BTP5-10, out-of-sample, PR model
To determine whether our proposed PR model is more successful than its predecessor (Quinn’s model), and to identify which market is preferable for statistical arbitrage, we developed our own scoring system for the comparative evaluation of trading model performance.
We chose 4 ratios (Sharpe ratio, Omega ratio, Calmar ratio and skewness/kurtosis ratio) that, in our view, together give a full picture of risk, return and the properties of the return distribution. For each ratio, a higher value indicates a more beneficial trading model.
The Sharpe ratio is the most widely used risk-adjusted performance measure in financial markets. It is calculated as the ratio of the average return of a trading model to its volatility (standard deviation), where volatility represents risk.
The Omega ratio is the ratio of the probability-weighted upside (the area of the return distribution above a threshold) to the probability-weighted downside (the area below the threshold). We set the threshold to 0. Like the Sharpe ratio, the Omega ratio is a risk-adjusted performance measure, but it takes into account all moments of the distribution, including skewness and kurtosis.
The Calmar ratio is the ratio of return to maximum drawdown, where the maximum drawdown is interpreted as a risk measure.
The skewness/kurtosis ratio rests on the commonly accepted view that positive skewness is a desirable trait of a trading model, since the return distribution then has a long right tail, whereas negative skewness is undesirable, because such a distribution has a long left tail. At the same time, kurtosis reflects risk, because high kurtosis often means fat tails in the distribution of returns.
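The four ratios can be sketched in Python as below. Several details are assumptions rather than statements from the text: annualization uses 252 trading days, the Omega ratio is estimated as the sum of gains over the sum of losses relative to the threshold, and the drawdown is measured on additive (non-compounded) cumulative returns, which is one way to obtain drawdowns above 100% such as those reported in Tables 5 and 6. The tabulated skewness/kurtosis ratios are numerically consistent with dividing skewness by total kurtosis (excess kurtosis plus 3), for example 1,04 / (6,62 + 3) = 0,108, so the sketch uses that form.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised mean return over volatility (no risk-free rate, as in the text)."""
    return mean(returns) / stdev(returns) * sqrt(periods_per_year)


def omega_ratio(returns, threshold=0.0):
    """Sum of gains above the threshold divided by sum of losses below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    return gains / losses  # undefined if there are no losses


def max_drawdown(returns):
    """Largest peak-to-trough fall of the additive cumulative return (assumption)."""
    cumulative = peak = worst = 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


def calmar_ratio(annualized_return, maximum_drawdown):
    """Return divided by maximum drawdown, both in the same units."""
    return annualized_return / maximum_drawdown


def skew_kurt_ratio(skewness, excess_kurtosis):
    """Skewness over total kurtosis (excess kurtosis + 3)."""
    return skewness / (excess_kurtosis + 3.0)
```

As a check against Table 5, `calmar_ratio(126, 13)` gives about 9,69 and `skew_kurt_ratio(1.04, 6.62)` gives about 0,108 for the Russian 4-6 pair.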
To compare the PR and Quinn’s trading models, we calculated, for each trading model, the mean value of each ratio across all pairs.
If a trading model exceeded its competitor on the in-sample Sharpe ratio, it received 1 point and the other received 0. For each of the other three in-sample ratios, the winning trading model received 0,5 points and the other received 0. The out-of-sample ratios are scored in the same way, but the winning model receives twice as many points as for the corresponding in-sample ratio.
The scores for both periods (in-sample and out-of-sample) were then summed, and the trading model with the highest total score is considered the more advantageous.
The methodology for comparing the performance of markets (in our case, Russia and Italy) is the same as that described above for the trading models, except that the mean ratio for a market is calculated over all pairs of both trading models (PR and Quinn’s).
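The scoring rule described above can be written as a short Python function. The dictionary layout and the names `IN_SAMPLE_POINTS` and `score` are hypothetical, and awarding no points on a tie is an assumption, since the text does not say how ties are handled. The example inputs are the mean ratios reported in Table 8.

```python
# Points per ratio for the in-sample period; out-of-sample points are doubled.
IN_SAMPLE_POINTS = {"sharpe": 1.0, "omega": 0.5, "calmar": 0.5, "skew_kurt": 0.5}


def score(means_a, means_b):
    """Return (total_a, total_b) for two competitors.

    Each argument maps "in" and "out" to a dict of mean ratios.
    Assumption: a tie on a ratio awards no points to either side.
    """
    total_a = total_b = 0.0
    for period, multiplier in (("in", 1.0), ("out", 2.0)):
        for ratio, points in IN_SAMPLE_POINTS.items():
            a, b = means_a[period][ratio], means_b[period][ratio]
            if a > b:
                total_a += points * multiplier
            elif b > a:
                total_b += points * multiplier
    return total_a, total_b


# Mean ratios of the two trading models, taken from Table 8.
pr = {
    "in": {"sharpe": 2.22, "omega": 2.88, "calmar": 18.24, "skew_kurt": 0.067},
    "out": {"sharpe": 1.39, "omega": 1.64, "calmar": 4.01, "skew_kurt": 0.053},
}
quinn = {
    "in": {"sharpe": 0.56, "omega": 1.16, "calmar": 1.89, "skew_kurt": 0.018},
    "out": {"sharpe": 0.61, "omega": 1.16, "calmar": 3.70, "skew_kurt": 0.036},
}
print(score(pr, quinn))  # (7.5, 0.0)
```

Applying the same function to the market means in Table 7 reproduces the 4,5:3 score in favor of Russia.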
The results of the evaluation with our proposed scoring system are presented in Table 7 and Table 8. Table 7 shows that the Russian government bond futures market delivers higher statistical arbitrage performance than the Italian market, by a score of 4,5:3. We attribute this result to the lower liquidity of the Russian market, which also produces heterogeneous liquidity levels across instruments. As a result, the difference between the liquidity levels of the futures in a traded pair generates relative value statistical arbitrage opportunities.
Table 7 – Comparison of markets’ performance by scoring system
Period | In sample | In sample | In sample | In sample | Out of sample | Out of sample | Out of sample | Out of sample
Market | Russia | Russia | Italy | Italy | Russia | Russia | Italy | Italy
Measure | ratio | score | score | ratio | ratio | score | score | ratio
Mean Sharpe ratio | 1,87 | 1 | 0 | 0,43 | 1,22 | 2 | 0 | 0,78
Mean Omega ratio | 1,86 | 0,5 | 0 | 1,11 | 1,36 | 0 | 1 | 1,45
Mean Calmar ratio | 15,42 | 0,5 | 0 | 4,71 | 3,33 | 0 | 1 | 4,38
Mean skewness/kurtosis ratio | 0,049 | 0,5 | 0 | 0,042 | 0,041 | 0 | 1 | 0,113
Score for the period | | 2,5 | 0 | | | 2 | 3 |
Total score | | 4,5 | 3 | | | | |
As Table 8 shows, our PR trading model strongly outperforms Quinn’s trading model, with a final score of 7,5:0.
Table 8 – Comparison of trading models by scoring system
Period | In sample | In sample | In sample | In sample | Out of sample | Out of sample | Out of sample | Out of sample
Model | PR | PR | Quinn | Quinn | PR | PR | Quinn | Quinn
Measure | ratio | score | score | ratio | ratio | score | score | ratio
Mean Sharpe ratio | 2,22 | 1 | 0 | 0,56 | 1,39 | 2 | 0 | 0,61
Mean Omega ratio | 2,88 | 0,5 | 0 | 1,16 | 1,64 | 1 | 0 | 1,16
Mean Calmar ratio | 18,24 | 0,5 | 0 | 1,89 | 4,01 | 1 | 0 | 3,70
Mean skewness/kurtosis ratio | 0,067 | 0,5 | 0 | 0,018 | 0,053 | 1 | 0 | 0,036
Score for the period | | 2,5 | 0 | | | 5 | 0 |
Total score | | 7,5 | 0 | | | | |
We calculated the annualized Sharpe ratio of the MOEX index to check whether the proposed trading model generates an abnormal risk-adjusted return. For this purpose, we used the daily closing prices of the MOEX index over a period matching the out-of-sample period of the back tests and obtained a Sharpe ratio of 0,75. This is about 46% lower than the mean out-of-sample Sharpe ratio (1,39) of the PR statistical arbitrage trading model, i.e. about 54% of it.
Thus, we observe that a trading model generating an abnormal risk-adjusted return is possible, which is consistent with the first implication of the AMH mentioned above.
The proposed PR trading model has better in-sample and out-of-sample back-test results than its predecessor (Quinn’s trading model), which allows us to say that an improvement of the bond futures statistical arbitrage trading model has been achieved.
The dominance of the PR trading model stems from the fact that it is more complex and has more optimizable parameters (4 against 1 in Quinn’s model). It is also due to its more flexible indicators, which adapt signal generation to changing market conditions more accurately: the PR model uses a moving standard deviation for the lower and upper trigger bounds, while Quinn’s model uses a constant percentage above/below the mean spread. Consequently, our results are also consistent with the second and third implications of the AMH.
References:
Economic indicators. Trading Economics data provider [Electronic resource] – 2018. – URL: https://tradingeconomics.com/ (date of access: 25.11.2018)
Quinn B., Hanna A., MacDonald F. Picking up the pennies in front of the bulldozer: The profitability of gilt based trading strategies // Finance Research Letters, 2018, Vol. 27, No. 4, pp. 214-222.