When should you release a model into production?

What’s the general process for buying a burger from a restaurant? You look at a menu, you order, and the waiter or waitress returns to your table with a burger, ready for you to eat. They might offer some additional condiments, and then you can enjoy your burger. Next time you go to the restaurant, perhaps they’ve improved the burger, and it tastes even better, closer to what a “perfect” burger tastes like.

 

Alternatively, you could order and wait several months until they’ve “improved” the burger. Even if it does end up tasting better, that’s not exactly practical. If you go to a restaurant, you want a burger then, because you’re hungry, not in several months’ time.

 

When you create a financial model, it’s kind of similar. It’s a question I’ve encountered often in my career, whether at Turnleaf Analytics, which Alexander Denev and I co-founded to forecast economic variables like inflation using machine learning, or in the past, when I was assessing whether or not to deploy a systematic trading strategy into production, whether at a bank or on my personal account. Indeed, it’s a subject Alexander Denev and I discussed at length in The Book of Alternative Data.

 

The first point to note is that you’ll never be able to create a “perfect” financial model with 100% accuracy, and indeed in finance your level of accuracy (or measure of success) can be a lot lower, and yet the model can still be monetised. Being right only around 50% of the time (depending on the return skew) can be a good result. You will instead have the best model you can build at that time. Also, the longer it takes to put something into production, the longer you go without benefiting from the output of any model.
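
To make the point about accuracy and return skew concrete, here is a toy expectancy calculation. The hit rate and win/loss sizes below are made-up numbers, purely for illustration: a model that is right only half the time can still have a positive expected return per trade, provided the average win is larger than the average loss.

```python
# Toy expectancy calculation: a 50% hit rate can still be profitable
# if winners are larger than losers (positive return skew).
# All numbers below are illustrative assumptions, not real strategy statistics.

hit_rate = 0.50      # proportion of winning trades
avg_win = 0.015      # average return on a winning trade (+1.5%)
avg_loss = -0.010    # average return on a losing trade (-1.0%)

expectancy = hit_rate * avg_win + (1 - hit_rate) * avg_loss
print(f"Expected return per trade: {expectancy:.4%}")   # 0.2500%
```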

 

You of course need to judge whether the model is good enough to be released into production. There are a few checklist questions you can ask (and I’m sure there are lots more too):

 

Have you made sure the model is not overfitted when backtesting? This is easier said than done, but you can look at how the model performs:

  • when perturbing the parameters
  • during different market regimes
  • both in-sample and out-of-sample, and using cross validation (of course, you need to be careful with cross validation that you are not training on future data samples; see the sketch after this list)
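
To sketch the cross validation point, one way to avoid training on future data is walk-forward splitting, where each training window ends strictly before the corresponding test window. Below is a minimal example using scikit-learn’s TimeSeriesSplit; the random features, returns and the linear model are placeholders for whatever your actual model uses.

```python
# Walk-forward cross validation: each training window ends strictly before
# the corresponding test window, so the model is never trained on future data.
# The features, returns and model below are placeholders, purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))   # e.g. lagged factors or signals
y = rng.normal(size=500)        # e.g. next-period returns

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    score = model.score(X[test_idx], y[test_idx])
    print(f"fold {fold}: train ends {train_idx[-1]}, "
          f"test {test_idx[0]}-{test_idx[-1]}, R^2 {score:.3f}")
```

Perturbing the parameters can be handled in a similar loop, rerunning the backtest over a grid of nearby parameter values and checking that performance degrades gracefully rather than falling off a cliff.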

 

Has the model behaved in paper trading similarly to how it did in the backtest? Paper trading (or trading in very small size) can be a good check at the end of the backtesting process, for catching things like inaccurate transaction cost assumptions (although, with small size, these may not show up anyway), lookahead bias and so on.
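
One concrete check of this kind is to compare the slippage you actually pay in paper trading with the transaction cost assumed in the backtest. A rough sketch follows, where the assumed cost, mid prices and fill prices are all hypothetical placeholders:

```python
# Compare realised slippage from paper trading with the transaction cost
# assumed in the backtest. All numbers here are hypothetical placeholders.
import pandas as pd

assumed_cost_bp = 2.0   # cost per trade assumed in the backtest (basis points)

paper_trades = pd.DataFrame({
    "side": [1, -1, 1],                  # +1 buy, -1 sell
    "mid":  [1.1000, 1.1010, 1.1005],    # mid price when the order was sent
    "fill": [1.1002, 1.1007, 1.1008],    # price actually filled at
})

# Realised slippage in basis points (positive means you paid a cost)
slippage_bp = (paper_trades["side"] * (paper_trades["fill"] - paper_trades["mid"])
               / paper_trades["mid"] * 1e4)

print(f"Average realised cost: {slippage_bp.mean():.2f}bp "
      f"vs assumed {assumed_cost_bp:.2f}bp")
```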

 

Are there obvious improvements you can make, whether in terms of the model itself or the data you feed it, and how long would they take? Sometimes it is not clear what improvements you can make. Obviously, there are generic improvements you can think of, like “find more data”, but this is very open ended unless you have an idea of which data to use. You might also have a situation where the expected gain in accuracy isn’t worth the extra effort, and that time might be better spent on a totally different model, which could diversify your portfolio more. Or, if the signal you are trying to model has low capacity, waiting too long could see others in the market exhaust the alpha from it.

 

Even once you release the model, you need to continually monitor its live performance. When I started my career, I thought that once you had created a model, that was kind of it. Instead, it’s very much the beginning. Watching a model work in a production environment really is crucial; you learn a huge amount from it, and you can feed all those lessons back into improving the model. Somehow I get many ideas for improving a model while monitoring it, which I did not necessarily think of during the backtesting stage. As I always write, you cannot backtest pain, but you most certainly can feel it in production!
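
As a very simple example of the kind of live monitoring I mean, you can track the rolling Sharpe ratio and drawdown of the live returns against what the backtest led you to expect, and flag the model for investigation when they diverge badly. The thresholds and the random “live” returns below are illustrative assumptions:

```python
# Simple live-monitoring sketch: compare rolling Sharpe ratio and drawdown of
# live returns against backtest-based expectations. The thresholds and the
# random "live" returns are illustrative assumptions.
import numpy as np
import pandas as pd

live_returns = pd.Series(np.random.default_rng(1).normal(0.0002, 0.005, 250))

backtest_sharpe = 1.0                   # annualised Sharpe the backtest suggested
sharpe_floor = 0.5 * backtest_sharpe    # alert if live Sharpe drops below half
max_drawdown_limit = -0.10              # alert on a 10% drawdown

rolling = live_returns.rolling(60)
rolling_sharpe = rolling.mean() / rolling.std() * np.sqrt(252)

equity = (1 + live_returns).cumprod()
drawdown = equity / equity.cummax() - 1

if rolling_sharpe.iloc[-1] < sharpe_floor or drawdown.min() < max_drawdown_limit:
    print("Live performance has drifted from backtest expectations - investigate")
else:
    print("Live performance broadly consistent with the backtest")
```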

 

It isn’t easy to know when you should release a model into production, but I’ve tried to outline a few ideas which can give you a sense of when that point might be. The difficulty is that you can always improve a model somehow. You need to make an educated judgement about whether the current model is good enough to be released at that point in time. If you wait many years to release what you think is the perfect model, then you will have lost any signal a lesser model could have generated in the meantime, which could well have been used to generate returns. You don’t want to order a burger and wait six months!