Bayesian Sequential Monte Carlo Methods For Traders

I have always been interested in applying statistical methods to improve my trading results. I believe in using technical analysis, but I also believe technical analysis has limitations. Most of the time, what appears to be a perfect buy/sell signal only becomes clear afterwards, once the market move has completed itself. Can we apply standard statistical time series methods to improve our trading results? This is an important question that I will try to answer in this post. Time series analysis is an important subject, and you can find many books written on it. But can we use it practically in a trading system? We can. There is a simulation method that seems to work well, known as the Sequential Monte Carlo method. In this post I want to discuss the Sequential Monte Carlo method and see whether we can use it to predict currency pairs like GBPUSD and GBPNZD that show highly nonlinear behavior. Did you read the post on Forecasting Bitcoin Crash well before it happened? Below is the screenshot of the Dow Jones Index falling heavily like a stone. Can you forecast such a move in advance?

Bayesian Trading Models

Financial markets are complex adaptive systems that have evolved over many decades. Trading systems and strategies that worked very well in the 1980s and made many millionaires, like the Turtle Traders, don't work any more. Technical indicators that once worked no longer do. Markets have adapted and evolved just like any other complex adaptive system. Markets are just millions of people who buy and sell. Markets are a depiction of the crowd behavior of these millions of people who are daily clicking their mice and opening and closing orders. We need to understand this fact. What these people think gets reflected in their decisions and ultimately in the crowd behavior that shows up in the market price. Financial markets are very fickle. Currencies can move big time when there is breaking news. Read our post on how GBPUSD fell 300 pips when UK Early Elections were announced.

Modern financial markets are very difficult to predict. In the last few years, there have been a number of flash crashes in which price fell rapidly by thousands of points in less than a minute, wiping out trillions of dollars, and then suddenly recovered over the next few hours. Today algorithmic trading systems monitor the markets on timescales measured in microseconds when it comes to opening and closing trades. A few years back, the French President made a speech in which he warned that Brexit would have a hard landing. This statement was read by a rogue algorithm as having a highly bearish sentiment. The rogue algorithm opened a very big sell order during the Asian market hours, and GBPUSD fell around 2000 pips in just 1 minute as there was no one on the other side to take the trade. Within the next few hours GBPUSD had recovered almost all the pips it had lost. Read this post on how to design an algorithmic binary options strategy.

Today risk management is very important for big financial firms, big banks, hedge funds and of course retail traders like us, given the danger of a flash crash happening at any time. We need robust predictive models that can adapt to changing market conditions in real time. In this post I will discuss in depth how to build robust risk models based on Bayesian statistics. Bayesian statistics, unlike traditional statistics, allows us to start with a certain prior belief and then continuously update that belief with market data in real time. We start off with a prior probability distribution that encompasses our belief about how the market will behave. History never repeats itself, so our parametric risk model should be able to encompass uncertainty. This uncertainty is expressed in terms of the parameters of a probability distribution. We use Bayes Theorem to update our prior belief: $$p(\theta|x)\propto p(x|\theta)p(\theta)$$
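To make the update rule concrete, here is a minimal sketch of Bayes Theorem applied to a toy belief over two market regimes. The regime names and likelihood numbers are illustrative assumptions, not estimates from any real data:

```python
# Discrete Bayes update: prior belief over two hypothetical market regimes
# ("trending" vs "ranging"), updated with one observed signal.
# The likelihood numbers below are illustrative assumptions only.

prior = {"trending": 0.5, "ranging": 0.5}

# Assumed probability of observing a strong breakout under each regime
likelihood = {"trending": 0.7, "ranging": 0.2}

# Bayes Theorem: posterior is proportional to likelihood * prior
unnormalized = {r: likelihood[r] * prior[r] for r in prior}
total = sum(unnormalized.values())
posterior = {r: v / total for r, v in unnormalized.items()}

print(posterior)
```

One breakout observation shifts the belief toward the trending regime; feeding in the next observation with the posterior as the new prior is exactly the sequential updating described above.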

Stationary Models Fail for Financial Time Series
You might be wondering why I have suddenly started talking about Bayesian statistics. Most books on time series analysis teach you ARIMA models for predicting time series. And if you have really studied these models well, you must have realized that ARIMA models only work for stationary time series. Here lies the problem. Financial time series are not stationary. What to do now? Make the financial time series stationary by taking the first difference, and if that doesn't work, take the second difference. This is not going to work. Try it and you will find that when you take the difference and then use an ARIMA model for prediction, the predictions are simply wildly off the mark. ARIMA models, or for that matter any time-invariant model, are unable to predict financial time series well. Watch this documentary on a day in the life of a millionaire forex trader.

When we say time invariant, we mean stationary and ergodic. When a time series is stationary, its mean is constant and its variance is constant; neither changes with time. If you have been trading for some time, you will know that for a financial time series like the closing price, the mean is not constant and the variance is not constant either. One way to make the price time series stationary is to convert it into returns. Returns have the useful property of a mean close to zero. This is one of the main reasons why we find return series so useful in financial engineering. It is commonly assumed in quantitative finance that the closing price follows a log-normal distribution, so if we take the logarithm of the closing price time series, we should get a normal distribution.
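As a quick sketch, here is how log returns can be computed from a closing price series with numpy; the price numbers below are made up for illustration:

```python
import numpy as np

# Hypothetical daily closing prices (made-up numbers for illustration)
closes = np.array([1.3050, 1.3082, 1.3021, 1.3105, 1.3098, 1.3150])

# Log returns: first difference of the log prices
log_returns = np.diff(np.log(closes))

print("mean of log returns:", log_returns.mean())
print("std  of log returns:", log_returns.std())
```

Even on this tiny sample the mean of the log returns is close to zero, which is the property that makes return series convenient to model.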

Normal distributions are ubiquitous and used widely in different fields. But when we use the normal distribution in financial time series predictions, we find that the predictions are mostly erroneous. Why? In a normal distribution, 95% of the data lies within 2 standard deviations of the mean. So if returns were normal, big market moves, as well as stock market crashes, would be very rare, which is not true. We have seen markets making big moves and we have seen markets crashing, so we need a probability distribution that can explain these facts. Financial price time series are not normal or log normal. Rather, financial price time series have probability distributions with fat tails. Read this post on how to predict weekly candle high, low and close.
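We can check how thin the normal tails really are using only the Python standard library; `normal_tail_prob` is just a helper name I am using for illustration:

```python
import math

def normal_tail_prob(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2.0))

# Under a normal distribution, moves beyond a few standard deviations
# become astronomically rare -- at odds with how often markets gap.
for k in (2, 3, 4, 5):
    print(f"P(|move| > {k} sigma) = {normal_tail_prob(k):.2e}")
```

A fat-tailed distribution such as the Student's t assigns these large moves far higher probability, which matches market behavior much better.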

When we build trading models that can predict price, we are in fact using probability distributions in our model. As said above, if we use a normal distribution, the mathematics can be easy, but the results will often be wrong, with the market doing totally different things than what our model predicted. Incorrect specification of the probability distribution, plus other parameters, introduces risk into our trading model. This is known as Model Risk. Our model will produce forecasts that are not reliable. Even if we have a good trading model that we consider reliable, there is always a possibility that it gives a poor forecast. So when trading we should always take risk management very seriously. Things can go wrong at any time. Frequently you will find the forecasts made by the trading model breaking down, causing drawdowns in the trading account.

Stationary and Ergodic Time Series Models

When building time series models, we try to make them stationary. As said above, stationary means that the probability distribution that generates the time series does not change with time. This is a nice assumption. If a time series is stationary, it makes life easy for us. The mean of the time series becomes time invariant, and the volatility also becomes constant. So we can choose different samples from the time series with different lengths, and all should have the same long-run mean and volatility, which is just the standard deviation. If the joint distribution of the time series is time invariant, we call it an ergodic time series. Ergodicity is a good thing if you can have it in a time series. For an ergodic time series, the Law of Large Numbers holds, and when we have a large number of observations, we can accurately measure the mean and the volatility. But practically, in real time, we don't find our financial time series to be time invariant and ergodic. If we still assume the financial time series to be ergodic and build a trading model on it, we will get unreliable forecasts, as said above.
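A small simulation can illustrate the difference: window means of a stationary white-noise series stay near zero, while window means of a random walk drift all over the place. The sample size and seed below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

stationary = rng.normal(0.0, 1.0, n)   # white noise: stationary and ergodic
random_walk = np.cumsum(stationary)    # cumulative sum: non-stationary

# Split each series into 4 equal windows and compare the window means
for name, series in [("white noise", stationary), ("random walk", random_walk)]:
    means = [w.mean() for w in np.split(series, 4)]
    print(name, [round(m, 2) for m in means])
```

For the stationary series every window gives roughly the same mean, so any sample tells us about the whole process; for the random walk each window gives a wildly different answer, which is exactly why ergodicity matters.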

When we assume that the financial time series is ergodic, we are in fact assuming that the stochastic process generating the price data is time invariant. Stationarity means the parameters that we have used in our trading model do not change with time. By taking a large number of observations we can accurately measure the trading model parameters and then use the model to make forecasts. So with more data, we have a better model. This is also the assumption behind data mining and machine learning. But as I will show below, this is not a good assumption, and most of the time we will get erroneous forecasts that can give us big losses in trading. Discover my candlestick trading strategy that makes 200 pips with 10 pips.

The stationarity assumption implies that history repeats itself in the future. This is also called mean reversion, and quants love mean reversion trading strategies. As more data gets collected, the parameters of a stationary time series model converge to their true values, like the flipping of a coin. In the beginning, we can get 10 heads in a row, meaning the observed frequency of heads is 1. But over 10,000 coin flips the frequency will be close to 0.5 if the coin is fair, and if the coin is unfair, it will converge to its true value as well. So if the data generating process does not change, the parameters of a stationary time series model also do not change. Download this Pin Bar Trading Strategy indicator.
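Here is a quick simulation of that coin-flip convergence; the seed and number of flips are arbitrary:

```python
import random

random.seed(7)

# Simulate 10,000 fair coin flips (True = heads)
flips = [random.random() < 0.5 for _ in range(10_000)]

# Running relative frequency of heads after more and more flips
for n in (10, 100, 10_000):
    freq = sum(flips[:n]) / n
    print(f"after {n:>6} flips: frequency of heads = {freq:.3f}")
```

The early frequencies can wander far from 0.5, but by 10,000 flips the estimate settles close to the true value, which is what the Law of Large Numbers promises when the data generating process does not change.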

But this is the fundamental problem with financial time series data: it is only observed once. If we have coffee futures price data, the path taken by the coffee futures price is never repeated. We cannot repeat the process, which violates the stationarity and ergodicity conditions. The British Pound went through Brexit. The GBPUSD price path that we have observed in the last few years cannot be repeated. If the GBPUSD price process had been stationary, we would have repeatedly seen the same price path with some noise added. So assuming stationarity is a dangerous thing when building trading models. Assuming stationarity gives us the false belief that by observing more and more price data we can model the market conditions fully, and that tomorrow will be just like yesterday with the slight modification that statisticians call noise.

When a probability distribution is stationary, extreme events like stock market crashes, flash crashes and sudden big moves happen rarely. If the probability distribution is normal, 3 standard deviations from the mean covers 99.7% of the cases. On Black Monday in 1987, the Dow Jones Index crashed, and the price movement was roughly 25 standard deviations from the mean. According to a normal distribution, such an extreme movement should not happen even once in billions of years. Yet the stock market crashed again in 2008, about 20 years later. In 2010 there was another Flash Crash that wiped out trillions of dollars, so we have to be careful when using a stationarity assumption in building a trading model. When a trading model predicts that the probability of an extreme event is only 0.01, while in practice the extreme event happens two or three times a year, it is high time to realize that building stationary trading models will not work.
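We can put rough numbers on this. Under a normal distribution, the expected waiting time for a k-sigma daily move is enormous; the 252 trading days per year figure is the usual convention, and `sigma_event_prob` is a helper name used here for illustration:

```python
import math

def sigma_event_prob(k):
    """Probability of a daily move beyond k standard deviations under normality."""
    return math.erfc(k / math.sqrt(2.0))

trading_days_per_year = 252

for k in (5, 10, 25):
    p = sigma_event_prob(k)
    years = 1.0 / (p * trading_days_per_year)
    print(f"{k}-sigma day: p = {p:.1e}, expected about once every {years:.1e} years")
```

The 25-sigma figure comes out so small that the implied waiting time dwarfs the age of the universe, yet moves of that magnitude have been observed; that gap is the whole argument against normal, stationary models.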

Now if you are a quant working for a big bank, a hedge fund or some big sovereign fund and you build a risk model based on the erroneous stationarity and ergodicity assumptions, most of the time the forecasts will be unreliable, and it will not be long before your boss fires you for building wrong risk models. If you are using regime switching models, there will always be a discontinuity in the risk model: only after market conditions have changed will your model inform you that they have changed. So there is no way to figure out that the market is shifting during the transition from the old to the new market condition.

Bayesian Monte Carlo Simulation Methods

Bayesian statistics was invented in the 18th century, when Thomas Bayes tried to figure out how to update an initial belief with data. Suppose you have an idea about something. Say you want to predict the weather on a daily basis. Weather is very important for all of us, and predicting it is one of the most difficult jobs. But with the application of Bayesian statistics, weather prediction has improved so much that you will be amazed. I have seen a snowfall forecast off by just 2 minutes, so you can imagine how much improvement has taken place in recent years. All this has become possible with the help of computers. In weather prediction we start with an initial belief, say that tomorrow will be sunny. Every few hours we check the weather and update our belief, and after a few observations we will have a fairly accurate prediction of what the weather will be tomorrow.

Let's talk about the currency market. The currency market has no physical location. It exists on a big network of computers that communicate the latest bid/ask prices to each other. In the currency market, we can get a new quote, or what we traders call tick data, every few fractions of a second. So you can see how fast things have become. Currency price data is being recorded constantly. If you are using MT4, you can download the data in a CSV file for any timeframe: 1 minute, 5 minute, 15 minute, 30 minute, 60 minute, 240 minute, daily, weekly and monthly. In Bayesian modeling, as said above, we start with a certain belief and update that belief as new data comes in. All of this happens in real time as we incorporate new price data into our trading model.

What we need is a trading model that can adapt to changing market conditions in real time without any discontinuity. As said above, in Bayesian statistics probability is a subjective thing, a matter of belief that we can update when we observe more data. Bayes Theorem is the formula we use to update the probability. In traditional statistics, also known as classical statistics, we interpret probability in terms of the frequency of an event in a large number of trials. The Law of Large Numbers tells us that when the number of trials becomes very large, the relative frequency should approach the probability. This is the basis of classical statistics. But there are many events to which we find it hard to assign a probability under the frequentist interpretation.

For example, the last US Presidential Election was between Donald Trump and Hillary Clinton. This is an event that cannot be repeated; it is a once in a lifetime event. How do we assign a probability to it? In classical statistics, there is nothing we can do. We cannot assign a probability to this event, as a US Presidential Election between Donald Trump and Hillary Clinton cannot be repeated over and over so that we could calculate the relative frequency of a win for either candidate. However, using Bayesian statistics we can easily assign a probability to this event according to our own beliefs. I might say Trump's chances of winning are 30%. Another person can say his chances of winning are 50%, and so on. Everyone can assign a probability between 0 and 1 to the event.

Just like the event above, every price path in a financial time series is unique and cannot be repeated. So we cannot think of it in terms of classical statistics; rather, we need Bayesian statistics to update our probabilities in real time using market data. We cannot update these probabilities arbitrarily on our own, but we have a unique formula known as Bayes Rule that was discovered a few centuries back. Bayes Rule was largely forgotten for many years until it was rediscovered and popularized in the 20th century.

This is the Bayes Formula: $$p(Model|Market Data)\propto p(Market Data|Model)p(Model)$$. When we talk of the Model, we are in fact talking about the parameters that define that model. For example, if we use the ubiquitous normal distribution in our model, our parameters will be the mean and variance. We will use the market data to calculate the mean and variance of the distribution according to the model. In the Bayes Rule quoted above, $$p(Model)$$ is known as the prior probability, $$p(Market Data|Model)$$ is known as the likelihood or the sampling density, and $$p(Model|Market Data)$$ is the posterior probability. The likelihood gives us the information contained in the market data. As said repeatedly, predicting financial markets is a difficult and unreliable thing. Read this post on how GBPUSD shot up 500 pips on an FOMC Meeting Minutes release and then fell 500 pips.
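As a concrete sketch of this formula in action, here is the classic normal-normal conjugate update of an unknown mean return, assuming (for simplicity) that the observation variance is known. All the numbers are illustrative:

```python
# Sequential normal-normal Bayesian update of an unknown mean.
# Simplifying assumption: the observation variance is known and fixed.

def update(prior_mean, prior_var, x, obs_var):
    """One Bayes update; by conjugacy the posterior is again normal."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + x / obs_var)
    return post_mean, post_var

# Prior belief: mean 0 with a wide variance (weak opinion)
mean, var = 0.0, 1.0
obs_var = 0.25  # assumed known noise in each observation

# Hypothetical stream of observations, processed one at a time
for x in [0.3, 0.5, 0.4, 0.6, 0.5]:
    mean, var = update(mean, var, x, obs_var)
    print(f"posterior mean = {mean:.3f}, posterior var = {var:.4f}")
```

Each observation shrinks the posterior variance and pulls the posterior mean toward the data, with yesterday's posterior serving as today's prior; that is the sequential updating this whole post is about.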

Classical statistics deals with the problem of forecasting a time series by making it stationary, as said above. One method used is the unit root test. If the largest autoregressive root of the time series is less than one in absolute value, the series is stationary; if the root equals one, it is a random walk, which cannot be predicted at all; and if the root is greater than one, the series is explosive and non-stationary. The other method is cointegration: find another time series that, when combined with the first, makes the resulting series stationary. But all of these are artificial constructs that most of the time break down, giving us very unreliable forecasts of the future. If you carefully examine Bayes Rule, you will realize it is a simple formula that lets us update our beliefs sequentially, which means in real time. This is much better than classical statistics.
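A rough way to see where a series sits relative to a unit root is to estimate the AR(1) coefficient by least squares: phi near 1 suggests a random walk, phi well below 1 suggests mean reversion. This is a sketch on simulated data, not a formal unit root test such as Dickey-Fuller:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
noise = rng.normal(0.0, 1.0, n)

# A random walk (unit root) and a mean-reverting AR(1) series with phi = 0.7
random_walk = np.cumsum(noise)
ar1 = np.empty(n)
ar1[0] = 0.0
for t in range(1, n):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

def ar1_coefficient(x):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    return np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])

print("random walk phi ~", round(ar1_coefficient(random_walk), 3))
print("AR(1)       phi ~", round(ar1_coefficient(ar1), 3))
```

The estimate for the random walk comes out very close to 1 and the AR(1) estimate close to 0.7; in practice, financial price series typically sit uncomfortably close to 1, which is why forecasts built on them are so fragile.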

In classical statistics, parameters are fixed while data is a random variable. In Bayesian statistics, it is the other way around: parameters are considered random variables while data is considered fixed and non-random. I personally like this approach. When we have data, it is known and not random. Parameters are always unknown and calculated from the data, so they should be treated as random. We start by specifying a prior probability distribution for the parameters, and the data is used to condition that prior distribution. So the market data we receive is used to condition the information we already have about the trading model in the form of a prior probability distribution. Discover how you can get training from the #2 ranked forex trader in the world.

Bernoulli Sequential Bayesian Trading Model
In trading, market direction is always important. As traders, we want to predict the market direction for today. If we can predict that the market will move up today, we can avoid entering a sell trade when a sell signal appears on a lower timeframe. If the market close is above the market open, we have an UP candle, and if the market close is below the market open, we have a DOWN candle. So we have a binary variable: UP and DOWN. We can use logistic regression to build a trading model that predicts the market direction. We can also use Quadratic Discriminant Analysis (QDA) to do the same. If you have been reading my blog, you know I have done that: I have built trading models using logistic regression as well as Quadratic Discriminant Analysis.
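A minimal sequential version of such a direction model is the Beta-Bernoulli update: a Beta prior on the probability of an UP candle, updated one candle at a time. The candle data below is made up for illustration:

```python
# Beta-Bernoulli sequential update of the probability p that a daily
# candle closes UP. Beta(a, b) is the conjugate prior for a Bernoulli p.

a, b = 1.0, 1.0  # Beta(1, 1) = uniform prior: no opinion on direction

# Hypothetical daily candle directions: 1 = UP, 0 = DOWN
candles = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

for c in candles:
    a += c       # accumulate count of UP days
    b += 1 - c   # accumulate count of DOWN days
    print(f"posterior estimate of P(UP): {a / (a + b):.3f}")
```

After each candle the posterior mean a/(a+b) is the updated belief about the UP probability, with no batch refitting; a richer model would make p depend on features, as logistic regression does.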

I should point out that when we build trading models using classical statistics, we can only do batch processing to make the predictions. When new market data comes in, we discard the previous model and build a new trading model using the new batch of market data. This is inefficient. Batch processing also fails to detect structural changes in the market data in real time. In Bayesian statistics, we incorporate each new data point into the previous trading model by updating its parameters in real time. This helps in detecting the dreaded structural change as it happens. I have written a course on Particle Filtering for Traders, in which I show you how to use Bayesian Sequential Monte Carlo Methods to develop trading models.
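To give a flavor of what that involves, here is a minimal bootstrap particle filter, the simplest Sequential Monte Carlo method, tracking a hidden random-walk "fair value" through noisy observations. The noise levels, particle count and seed are illustrative assumptions, not tuned values:

```python
import numpy as np

# Minimal bootstrap particle filter: the latent state follows a random
# walk and we observe it through additive Gaussian noise.

rng = np.random.default_rng(0)

n_particles = 2_000
process_std = 0.05   # how fast the hidden state can drift (assumed)
obs_std = 0.20       # noise in each observation (assumed)

# Simulate a hidden state path and noisy observations to filter
T = 50
true_state = np.cumsum(rng.normal(0.0, process_std, T))
observations = true_state + rng.normal(0.0, obs_std, T)

particles = rng.normal(0.0, 1.0, n_particles)  # draw from a wide prior

estimates = []
for y in observations:
    # 1. Propagate: each particle follows the random-walk dynamics
    particles = particles + rng.normal(0.0, process_std, n_particles)
    # 2. Weight: Gaussian likelihood of the observation for each particle
    weights = np.exp(-0.5 * ((y - particles) / obs_std) ** 2)
    weights /= weights.sum()
    # 3. Estimate: weighted posterior mean of the state
    estimates.append(np.sum(weights * particles))
    # 4. Resample: redraw particles in proportion to their weights
    particles = rng.choice(particles, size=n_particles, p=weights)

estimates = np.array(estimates)
print("final true state:", round(float(true_state[-1]), 3))
print("final estimate  :", round(float(estimates[-1]), 3))
```

After a short burn-in the filtered estimate tracks the hidden state with noticeably less error than the raw observations, and because the update is one pass per observation, it runs naturally in real time with no batch refitting and no discontinuity.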