Backtesting in Python

If I want to know what a burger tastes like in a restaurant, I’ll go there and eat it. However, let’s say I want to know what a burger from a particular restaurant tasted like in 2015? I can’t go back in time. Or let’s say I want to know what the burgers taste like at every burger joint in London today? I could attempt to visit every single burger joint and try it, but it’s going to be time consuming and might also have, shall we say, a suboptimal impact on my cholesterol. The next best thing is to do some sort of historical analysis, getting access to datasets which have reviews of burger joints over the past few years, essentially crowdsourcing the task. We could look at the specific ratings given in the reviews, and also use natural language processing on the reviews themselves to gauge sentiment.


When it comes to financial markets, if we are trying to understand how a trading strategy may perform in the future, we could just run it live, like my burger example. However, in practice, just like our burger example, we can use historical data to do a study, to backtest how our trading strategy would have performed historically. We’ll need market data for the assets we are going to trade, and we might also search for other datasets to generate a signal (similar to our burger review example).


But what type of infrastructure do we need to backtest a trading strategy? At the simplest level, we could use a Bloomberg terminal as our data source, and use Excel to backtest the strategy. Often, we’ll want something a bit more heavy duty than Excel, such as Python. We have several choices when it comes to Python: developing our own backtesting framework, purchasing a backtesting solution or using an open source backtesting library.


For total control, developing your own backtesting library is going to be best, and this is the route that many large quant funds have gone down. You can tailor it to the assets you trade, fit whatever data sources you want, create your own visualization and so on. You can also add execution to your backtester. The difficulty is that it’s very time consuming and hence expensive from a hiring perspective. For one, even before you begin creating a fancy trading signal, you probably need total return indices. From personal experience, a well featured backtester can be very painful to write, particularly for less vanilla assets. Even for assets like FX spot, it is not obvious what the calculations are, and you need to spend ages looking up all the tickers for the various input data. You’ll also need to create your own data feeds. Then you need to create the backtesting framework itself, which can vary from very simple to quite complicated, depending upon the way you put together your portfolio.
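
To give a flavour of why even FX spot needs some care, here is a minimal sketch of a total return index. It is built on my own simplifying assumptions, not any library's implementation: daily spot closes, annualised deposit rates for both currencies, a single Act/360 day count, one calendar day of carry per step, and a long position in the base currency funded by borrowing the terms currency. A production version would need per-currency day counts, weekends, holidays and settlement conventions. The function name and parameters are hypothetical.

```python
def fx_spot_total_return_index(spot, base_rate, terms_rate,
                               day_count=360.0, start=100.0):
    """Sketch of a total (excess) return index for a long base currency position.

    spot       : daily spot closes, quoted as terms per unit of base (e.g. EURUSD)
    base_rate  : annualised base currency deposit rates (0.01 means 1%)
    terms_rate : annualised terms currency deposit rates
    All three lists are assumed to be aligned on the same dates.
    """
    index = [start]
    for t in range(1, len(spot)):
        dt = 1.0 / day_count  # assumption: one day of carry per step
        # Borrow terms currency, convert at yesterday's spot, deposit base currency,
        # then convert back at today's spot and repay the loan with interest.
        proceeds = (spot[t] / spot[t - 1]) * (1.0 + base_rate[t - 1] * dt)
        repayment = 1.0 + terms_rate[t - 1] * dt
        index.append(index[-1] * (1.0 + proceeds - repayment))
    return index
```

With flat spot, a 3.6% base rate and a 0% terms rate, the index earns one basis point of carry per day. When both rates are zero, it simply tracks spot returns.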


The second solution is purchasing a licence to a backtesting library. These will generally be ready to go for you to create your trading strategies, saving a lot of hassle writing the library. They often come integrated with data sources, and sometimes also have their own total return index calculations. Sometimes they will also be hosted on the cloud, so you basically just log in. For very high frequency data, this might seem quite attractive. For lower frequency data this is probably less of an issue. In some cases, these libraries can also be used to execute trades live. The model for paying varies between vendors, for example by how much cloud compute you use, how many users you have internally etc. The downside is that you are creating a dependency, so there’s an element of lock-in, and it won’t be as customisable, as you won’t see all the source code. If you want to change to a different backtesting vendor, it is going to be challenging, as you might need to rewrite a lot of your backtesting code so it doesn’t reference your vendor’s libraries. Of course you can write additional layers of abstraction to reduce this risk, but this is going to take time. For some, the trade-off of having it all nicely integrated with data is worth it. It really depends on the use case.
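
As a rough illustration of what such an abstraction layer might look like, here is a minimal sketch where strategy code only ever talks to a small interface. All the names here (`BacktestEngine`, `InHouseEngine`, `evaluate_strategy`) are hypothetical, and the toy engine stands in for whatever adapter you would write around a vendor's library.

```python
from typing import Protocol, Sequence


class BacktestEngine(Protocol):
    """The only interface our strategy code is allowed to depend on."""

    def run(self, prices: Sequence[float], signal: Sequence[float]) -> float:
        """Return the cumulative return from trading `signal` on `prices`."""
        ...


class InHouseEngine:
    """Toy engine; a vendor adapter would wrap the vendor's calls here instead."""

    def run(self, prices, signal):
        total = 1.0
        for t in range(1, len(prices)):
            # The position decided at t-1 earns the return from t-1 to t
            total *= 1.0 + signal[t - 1] * (prices[t] / prices[t - 1] - 1.0)
        return total - 1.0


def evaluate_strategy(engine: BacktestEngine, prices, signal) -> float:
    # Strategy code calls the interface, never a vendor library directly
    return engine.run(prices, signal)


result = evaluate_strategy(InHouseEngine(), [100.0, 110.0, 99.0], [1, -1, 0])
```

Switching vendor then means writing one new adapter class with the same `run` method, rather than touching every strategy. The cost is that the interface can only expose features every engine has in common.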


Another solution is to use an open source backtesting library. You have all the source code, so you can customise it more easily. It will also save you the time of having to do everything from scratch. I’ve worked in FX for over 15 years, developing trading strategies, and that has shaped what I really want from backtesting software. I’ve spent over 5 years writing my various open source libraries for backtesting trading strategies. First, there’s findatapy, a wrapper which makes it super simple to download data from many sources (including Bloomberg, Quandl, FRED etc.) using a common API. Hence, it’s easy to mix and match many different data sources. It also supports ticker aliases, so you don’t need to remember vendor specific tickers. I’m continually working on making the ticker mapping easier to use. Then there’s chartpy for visualization, using libraries like Plotly and Matplotlib.
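
To show the general idea behind ticker aliases (this is a hypothetical sketch, not findatapy's actual API or configuration format), you can think of it as a lookup from one friendly name to each vendor's own ticker. The vendor tickers below are illustrative and should be checked against each vendor before use.

```python
# Hypothetical alias table: one friendly name maps to a ticker per data source
TICKER_ALIASES = {
    "EURUSD": {"bloomberg": "EURUSD Curncy", "fred": "DEXUSEU"},
}


def vendor_ticker(alias, data_source):
    """Translate a friendly alias into the ticker a given data source expects."""
    try:
        return TICKER_ALIASES[alias][data_source]
    except KeyError:
        # Fail loudly rather than silently requesting the wrong series
        raise KeyError(f"No {data_source} ticker configured for {alias}") from None
```

The strategy code then asks for "EURUSD" everywhere, and only the lookup table knows about vendor naming.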


My finmarketpy library focuses on backtesting trading strategies and other market analysis. There’s an easy to use template class, which you fill in with the specific parameters and signals of your trading strategy. It then sorts out the whole backtest, aligning time series, calculating P&L etc. Also included in finmarketpy are total return index calculations in Python for FX spot, FX forwards and FX vanilla options. These calculations are fairly well known within most FX desks and are not really proprietary anymore, but they can be quite fiddly to write, and I don’t think they are available open source anywhere else! If you’d like to sponsor the addition of new features or new total return indices in finmarketpy, let me know! I also offer bespoke consulting on finmarketpy and Python workshops which include my open source libraries. I’ve also developed tcapy for doing transaction cost analysis. Obviously, finmarketpy isn’t the only open source backtesting library out there, so I would recommend checking out GitHub.
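
To make "aligning time series" and "calculation of P&L" a bit more concrete, here is a minimal generic sketch in plain Python. It is not finmarketpy's template class or its API. The function name `backtest` and its inputs are hypothetical, and it assumes a single asset, no transaction costs, and that a signal observed on a given date only takes effect from the next price date (to avoid lookahead).

```python
def backtest(prices, signals, start=100.0):
    """Sketch of a single-asset backtest.

    prices  : dict of ISO date string -> price
    signals : dict of ISO date string -> position (+1 long, -1 short, 0 flat)
    Returns a dict of price date -> cumulative P&L index.
    """
    signal_dates = sorted(signals)
    next_signal = 0
    position = 0.0
    prev_price = None
    index = start
    out = {}
    for date in sorted(prices):
        if prev_price is not None:
            # P&L for the period uses the position set before this date
            index *= 1.0 + position * (prices[date] / prev_price - 1.0)
        out[date] = index
        # Align: pick up the latest signal dated on or before this price date.
        # ISO date strings sort chronologically, so string comparison is safe.
        while next_signal < len(signal_dates) and signal_dates[next_signal] <= date:
            position = signals[signal_dates[next_signal]]
            next_signal += 1
        prev_price = prices[date]
    return out


pnl = backtest(
    {"2020-01-01": 100.0, "2020-01-02": 110.0, "2020-01-03": 99.0},
    {"2020-01-01": 1, "2020-01-02": -1},
)
```

Here the strategy is long for the first move up and short for the following move down, so it makes money in both periods. A real backtester adds a lot on top of this, such as multiple assets, leverage, transaction costs and total return indices rather than raw prices.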


What solution you choose for backtesting will depend on your specific circumstances. Whatever you choose, it will cost you something, whether it’s purchasing data licences, hiring quant developers or paying for backtesting software.