Moving from Matlab to Python

20181006 Shadow

What’s your favourite burger joint? This is probably the question I get asked most. Perhaps a more relevant question I get asked is: what are the best ways to move from Matlab to Python? For many financial firms, it is a very pertinent question. I used Matlab extensively during my career, particularly when I was at Lehman Brothers. At the time, it was the best tool to rapidly develop market analytics and systematic trading strategies. Over the years, Python usage has grown and as a result, many folks are now thinking about migrating from Matlab to Python. From a personal perspective, I’ve switched over from Matlab and haven’t used it for many years. It’s true that not everyone necessarily wants (or needs) to switch from Matlab. Below, I’ve outlined reasons for switching and also the process you might use to migrate from Matlab to Python. If you are interested in using Cuemacro to help you migrate to Python, let me know!

 

Why should you now think about moving to Python from Matlab if you work in financial markets?

 

This isn’t an exhaustive list, just a few reasons why folks have told me they are moving to Python.

 

Python is open source. This is probably one of the main reasons people want to swap. In a large organisation the cost of many Matlab licences can quickly build up. By contrast, Python is open source and free, so we can deploy it to as many clients as we want without additional costs.

 

There are many data science libraries in Python. With Matlab, there are toolboxes which provide additional functionality. However, with Python we have access to a much greater array of open source libraries, which replicate a lot of the functionality of Matlab. In particular, there are many Python libraries these days which make it a particularly useful language for market practitioners, such as pandas (for time series), NumPy (for matrix operations) and Matplotlib (for visualisation). There might also be Python libraries which offer functionality that isn’t available in Matlab (it is possible to call Python from Matlab and vice-versa, but if we retain a Matlab dependency it will complicate maintenance in the future and also require us to pay for future Matlab licences). You could argue that there are sometimes too many Python libraries, which can actually complicate things.
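
To give a flavour of what working with time series in pandas looks like, here is a minimal sketch. The prices and dates are made up purely for illustration, and it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical daily closing prices on business days, for illustration only
dates = pd.date_range("2018-10-01", periods=5, freq="B")
prices = pd.Series([100.0, 101.0, 99.0, 102.0, 104.0], index=dates)

# Daily percentage returns (the first value is NaN as there is no prior price)
returns = prices.pct_change()

# 3 day rolling average, indexed by the same dates as the prices
rolling_mean = prices.rolling(window=3).mean()

# Select by date label rather than by integer position
price_on_3rd = prices.loc[pd.Timestamp("2018-10-03")]
```

The point to note is that the dates travel with the data, so we don’t need to keep a separate vector of dates in sync with a vector of prices, which is a common pattern in older Matlab code.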

 

More people learn Python these days. Python is now one of the most popular languages and you are more likely to find Python developers these days than Matlab developers. That said, I do admit that there are many similarities in syntax between Matlab and Python (there are also some notable differences, some of which are explained here). The general approach to working with datasets is also similar, such as trying to write vectorised code. This should help developers who use Matlab a lot to move over to Python too.
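
As a rough illustration of both the similarities and the differences, here is a small NumPy sketch with the equivalent Matlab expressions in the comments. The numbers are made up, and this is only a sketch of a few common gotchas rather than a complete list.

```python
import numpy as np

# Hypothetical prices, for illustration only
prices = np.array([100.0, 101.0, 99.5, 102.0])

# Matlab: prices(1) is the first element; NumPy indexing starts at 0
first_price = prices[0]

# Matlab: prices(2:end) ./ prices(1:end-1) - 1
# NumPy: slices exclude the end point, and / is already element-wise
returns = prices[1:] / prices[:-1] - 1.0

# Matlab: A * B is matrix multiplication and A .* B is element-wise
# NumPy:  A @ B is matrix multiplication and A * B is element-wise
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
matrix_product = a @ b
elementwise_product = a * b
```

The vectorised returns calculation avoids an explicit loop in exactly the same spirit as the Matlab version, so the habit carries over even though the indexing and operators differ.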

 

Python is a general purpose language. Python is not purely a language for doing data science, although there are many libraries for this. It’s a general purpose language, and can be used to do all sorts of things, which can be useful when it comes to exposing your Python code to users, for example in Excel or via a web server (ok, I don’t know whether this is easy to do in Matlab, but I do know it’s straightforward in Python).

 

How can we go about migrating from Matlab to Python?

 

Let’s go through the steps below which you might need to undertake if you want to migrate from Matlab to Python.

 

What’s the problem we want to solve: what’s the situation in a financial firm that uses Matlab?

Let’s take a large investment firm. It is likely they have millions of lines of Matlab already written. How do they go about migrating from Matlab to Python? There are several steps we need to think about before we even write any code, which we discuss below. For smaller firms, the process is likely to be a lot easier, given they have a lot less legacy code in Matlab and it’s much easier to understand where existing Matlab code is being used.

 

Who is responsible for Matlab code at our firm and what Matlab code do we have? 

 

This is the most important step. We need to know what Matlab code we have scattered across our firm! We should be able to create a list of those developers who have Matlab licences. We can then interview them to understand which Matlab code is being maintained and developed by them. Furthermore, we need to know what Matlab dependencies they are using (and also any dependencies in other languages like C). We should also understand who the end users of this code (and its output) are. It could be the case that these end users don’t call the Matlab code directly, but simply rely upon its output. The last thing we want in our migration is to have issues because of dependencies we don’t know about. Typically we might have several levels in our Matlab stack (and this is a massive simplification!) and we describe some below.

 

  • Lower level code – such as loading/storing market data which is (likely) shared across the firm
  • Pricing level code – could be valuation metrics, rate/vol curve interpolation which are also shared
  • Model level code – each team is likely to have their own models built on top of the lower levels, this could be risk engines, economic forecasting models, systematic trading models etc.
  • User level code – this could be in the form of GUI applications with visualisation or reports etc. – this could also be shared code

 

Developing a strategy for converting our Matlab code to Python

 

We need to draw up a list of applications which we want to convert to Python once we have interviewed the relevant Matlab developers in our organisation.

  • Which of these applications contain code which is heavily proprietary and which are likely to be quite generic?
  • Are there applications across our organisation which are basically replicating the same functionality?
    • For example does each team have their own low level code for downloading/storing market data?
    • We should instead focus on building firm-wide solutions for these types of problems which should save the firm both time and money.
    • Let teams concentrate on where they can add real value, rather than reinventing the wheel!
  • Can some of these Matlab applications be replaced (nearly) directly by an existing open source Python library? Over time, lots of functionality which was proprietary in financial firms has rapidly become commoditised. I remember when I was at Lehman, we had several developers maintaining a very nice Java applet for doing charts. Today, this type of functionality is available in libraries like Plotly.
  • We are also likely to need to understand which Python libraries we could use as dependencies. This requires a deep understanding of the Python ecosystem for financial libraries, to understand the various pros and cons of open source libraries for the task at hand.
  • We can’t simply convert all the code at once (or randomly). We need a systematic divide and conquer approach: in a particular application, we convert the smallest, lowest level functions first, make sure they work, then slowly convert the higher level code
  • We should also take the opportunity to assess the software design of existing Matlab applications, and think about whether they should be redesigned to fit an object oriented paradigm:
    • Both Matlab and Python support object oriented programming (although I have to admit I prefer the way it is done in Python).
    • However, in practice lots of Matlab code is unlikely to be object oriented.
    • We might find that redesigning applications could make them more maintainable in the future, even if it makes the migration take longer
    • It is tricky because the more “redesign” we do the more complicated the migration process is going to be (in particular if our redesign impacts how external applications can call it)
  • For some Matlab applications which are rarely used, we might choose to decommission them, or simply not convert them and write Python wrappers if people need to use them (in some cases we might not have much of a choice, if there’s no Python equivalent, and it is too difficult to convert the code)
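
One way to make the “lowest level functions first” idea concrete is to write down which functions call which, and then sort them so that every function appears after the things it depends on. Below is a minimal sketch using graphlib from the Python standard library (Python 3.9+). The function names and the dependency map are hypothetical; in practice we would build this map from the interviews and from inspecting the Matlab code.

```python
from graphlib import TopologicalSorter

# Hypothetical Matlab functions, each mapped to the functions it calls
dependencies = {
    "load_market_data": set(),
    "interpolate_vol_curve": {"load_market_data"},
    "price_option": {"interpolate_vol_curve"},
    "run_backtest": {"load_market_data", "price_option"},
    "plot_report": {"run_backtest"},
}


def conversion_order(deps):
    """Return the functions ordered so each comes after everything it calls."""
    return list(TopologicalSorter(deps).static_order())


order = conversion_order(dependencies)
print(order)
```

If the dependency map contains a cycle (two functions which call each other, say), static_order raises a CycleError, which is itself a useful signal that those functions probably need to be converted together.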

 

How can we do the coding phase?

 

Once we have a strategy in place for which Matlab applications we want to migrate (and have potentially also thought about any software redesign), we can get down to actually doing the coding. Typically, the process is likely to take several years in most firms. I wish there was an *easy* way to do this, but a lot of good planning and a decent strategy should hopefully reduce the problems involved.

  • There are some automated tools for converting Matlab code to Python code. However, I suspect that in practice much of the work will need to be done by hand, if the code is not trivial.
  • We need to write unit tests to make sure that our Python output is the same as our Matlab output (we can use tools like pytest to do this)
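
As a sketch of what such a test might look like, the example below compares a Python port of a (hypothetical) Matlab function against reference values. Here the reference values are hardcoded and made up for illustration. In a real migration we would export them from the Matlab implementation, for example to a CSV file or to a .mat file read with scipy.io.loadmat.

```python
import numpy as np


def simple_returns(prices):
    """Python port of a hypothetical Matlab function computing simple returns."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0


# Illustrative inputs and reference outputs (assumed, not real Matlab output)
INPUT_PRICES = [100.0, 110.0, 99.0]
MATLAB_REFERENCE = [0.10, -0.10]


def test_simple_returns_matches_matlab_reference():
    # Compare within a tolerance, as floating point results can differ
    # slightly between the two implementations
    np.testing.assert_allclose(
        simple_returns(INPUT_PRICES), MATLAB_REFERENCE, rtol=1e-12, atol=1e-12
    )
```

If this is saved in a file whose name starts with test_, running pytest will pick up the test function automatically. Comparing within a tolerance rather than for exact equality is a deliberate choice, and the right tolerance will depend on the calculation.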

 

There is a great paper from Enthought which describes some approaches for converting Matlab code to Python available here. There is a lot of material on the web about this subject, including a great note describing how the large quant firm AHL migrated to Python (from R).

 

How can Cuemacro help you to migrate from Matlab to Python?

 

We have extensive experience working in Python, including developing finmarketpy for backtesting trading strategies (which has nearly 2000 stars on GitHub). We have a lot of experience working with many Python libraries, including finance specific ones such as Bloomberg’s Python API and more general purpose libraries such as Dash. We have also developed libraries in Python for transaction cost analysis. At the same time, we have deep expertise in markets, developing systematic trading strategies both in spot and vol space, hence we understand the problems which are faced by investment firms. The problems which an investment firm is trying to solve differ from those of a pure technology firm.

 

We can work as independent consultants for your firm to help you in your migration. We can independently create a high level strategy report for your firm, interviewing your various Matlab developers to understand what code you have, and which code can be converted. We will also interview end users to understand which applications they use day to day. Our recommendations in our report will also be completely independent. Drop me a message if you are interested in hearing how Cuemacro can help you migrate from Matlab to Python!