Using Python to explore FX market microstructure

Like most readers, I’m sure, I’ve lost count of the number of weeks since things were “normal”. When will things get back to normal? Maybe weeks, or more likely months? I suppose that calling Python’s random.random() might be the best forecasting approach here! In the meantime, I’ve been trying to work hard at home, doing client projects and also building out my open source Python libraries.

 

A few weeks ago I open sourced tcapy, a Python library for doing transaction cost analysis (available on GitHub here). I’ve also put up lots of Jupyter notebooks to demonstrate tcapy, which can be run interactively in your browser using Binder (thanks to Thomas Schmelzer for his help here). I’m also currently working on using Docker to make tcapy easier to deploy (not quite there yet!). Whilst the main focus of tcapy is transaction cost analysis, that is, analysing your own trade data against market data as a benchmark, I’ve recently rewritten it so you can also use it to explore high frequency tick data. I’ll discuss that now, elaborating on some short posts I’ve already put on Twitter and LinkedIn.

 

The difficulty with doing calculations on high frequency tick data is that the datasets are huge. As a result, computations can be time consuming, and you have to split the work into batches and combine the results. I’ve written tcapy so that it does its computation in a distributed manner, and there’s lots of smart caching of data to speed it up. You have 20 cores? Great, kick off tcapy with lots of Celery workers, and see your computation take advantage of the compute power at your disposal. I’m also keen to extend the computation engine so it can be used easily with AWS Lambda and similar serverless compute services, if I can find a sponsor for this work.
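
To make the batching idea above a bit more concrete, here is a minimal sketch using only the Python standard library. It is not tcapy’s API or its implementation: load_ticks, daily_summary and average_spread are hypothetical names, the tick data is synthetic, and a thread pool with lru_cache stands in for Celery workers and a proper data cache. The point is only the pattern: split the period into daily batches, reduce each batch to a small partial result, then combine the partials.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from functools import lru_cache
import random


@lru_cache(maxsize=None)
def load_ticks(day):
    """Hypothetical loader: returns synthetic (bid, ask) ticks for one day.

    Cached, so asking for the same day again does not regenerate the data.
    """
    rng = random.Random(day.toordinal())  # deterministic per day
    mid = 1.10
    ticks = []
    for _ in range(1000):
        mid += rng.gauss(0, 0.0001)
        half_spread = rng.uniform(0.00002, 0.00006)
        ticks.append((mid - half_spread, mid + half_spread))
    return tuple(ticks)


def daily_summary(day):
    """Reduce one day's ticks to a small partial result (sum, count)."""
    ticks = load_ticks(day)
    total_spread = sum(ask - bid for bid, ask in ticks)
    return total_spread, len(ticks)


def average_spread(days, max_workers=4):
    """Average spread over many days, computed one daily batch at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(daily_summary, days))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


days = [date(2020, 3, 2) + timedelta(days=i) for i in range(5)]
print(f"Average spread: {average_spread(days):.6f}")
```

Because each batch only returns a sum and a count, the combined average is exact, whichever worker handles which day. For genuinely CPU-bound work you would swap the thread pool for processes or a task queue, which is the role Celery plays in tcapy.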

 

In terms of generating results for market microstructure, I’ve made it easy to use high frequency market data in tcapy to calculate statistics such as the bid/ask spread throughout the day, as well as intraday volatility, and you can add your own computations too. These results can be used by traders when they are thinking about how to execute their trades. I’ve created a Jupyter notebook which uses tcapy to produce results for the past 6 months for EUR/USD and GBP/USD, and you can run it in Binder here (although you might need to reduce the number of months if you want to run it online, given the relatively limited compute in Binder).
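
For readers who want to see what these two statistics look like in code, here is a rough, standard-library-only sketch. It does not use tcapy, and make_ticks, spread_bp_by_hour and vol_by_hour are names I’ve made up for illustration. It assumes quotes sampled once a minute (real tick data is irregular and would need resampling first) and uses synthetic data rather than real EUR/USD quotes. In practice you would more likely do this with pandas.

```python
from collections import defaultdict
from datetime import datetime, timedelta
import math
import random
import statistics


def make_ticks(start, n, seed=0):
    """Generate n synthetic (timestamp, bid, ask) quotes, one per minute."""
    rng = random.Random(seed)
    mid = 1.10
    ticks = []
    for i in range(n):
        mid *= math.exp(rng.gauss(0, 0.0001))
        half_spread = rng.uniform(0.00002, 0.00006)
        ts = start + timedelta(minutes=i)
        ticks.append((ts, mid - half_spread, mid + half_spread))
    return ticks


def spread_bp_by_hour(ticks):
    """Average bid/ask spread, in basis points of mid, for each hour of day."""
    buckets = defaultdict(list)
    for ts, bid, ask in ticks:
        mid = (bid + ask) / 2
        buckets[ts.hour].append((ask - bid) / mid * 1e4)
    return {hour: statistics.mean(v) for hour, v in sorted(buckets.items())}


def vol_by_hour(ticks):
    """Standard deviation of one-minute log mid returns, for each hour of day."""
    buckets = defaultdict(list)
    prev_mid = None
    for ts, bid, ask in ticks:
        mid = (bid + ask) / 2
        if prev_mid is not None:
            buckets[ts.hour].append(math.log(mid / prev_mid))
        prev_mid = mid
    return {hour: statistics.stdev(v)
            for hour, v in sorted(buckets.items()) if len(v) > 1}


# Two days of one-minute quotes, starting at midnight
ticks = make_ticks(datetime(2020, 3, 2), n=2 * 24 * 60)
spreads = spread_bp_by_hour(ticks)
vols = vol_by_hour(ticks)
for hour in (8, 12, 16):
    print(f"{hour:02d}:00  spread={spreads[hour]:.2f}bp  vol={vols[hour]:.6f}")
```

Grouping by hour of day is what lets you compare, say, the 1600 hour against the rest of the day. The synthetic data here has no time-of-day pattern, so the numbers will look flat, whereas real data will not.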

 

So what changes have we observed in market microstructure, based upon the results generated in the Jupyter notebook? Perhaps unsurprisingly, spreads blew out in March at the height of this year’s market panic, and our analysis suggests they have largely come back in since then. Intraday volatility was also elevated in March compared to the months before. The biggest differences in intraday vol in March were at times around the London fix, at around 1600 LDN. If you want to see the charts and play with the code, you can find them all here on Binder.

 

If you’ve got any ideas for other features you’d like me to add to tcapy, please do let me know. Likewise, get in touch if you’d be interested in sponsoring the project or contributing to it more broadly.