There are some things computers can’t really do. They can’t enjoy a good burger. They don’t seem to understand the importance of knafa. Ok, perhaps these aren’t really that important. However, people are still pretty good at many things that computers haven’t been able to do well, at least until recently. Any toddler can recognise simple, common objects like a car or a house. Yet computers weren’t very good at what appears to be a relatively trivial task for a human.
Of course, this has now all changed. Computers have become incredibly accurate when it comes to image recognition. One of the reasons is that we now guide computers to approach the problem in a different way from humans. Typically, when we try to recognise something within an image, we use obvious features. For example, if we’re trying to recognise a rabbit, we’ll look for something with big ears, and so on. Traditionally, we’ve programmed computers to look for features such as edges and corners in an image. In other words, we need to specify and “handcraft” the features.
In recent years, convolutional neural networks (CNNs) and similar models have been used instead. Rather than handcrafting features, we let the algorithm discover higher-order features during the training process, which involves feeding it a large number of photos. It has been shown that image classification accuracy is much higher with this approach than with handcrafted features. It helps that the problem doesn’t change over time: a car always looks like a car! This contrasts with financial data, which tends to be nonstationary (i.e. its characteristics change over time).
Once we can identify objects, and make sense of an image more broadly, in an automated way, it opens up many possibilities for using this type of “alternative data” in investing. We can use satellite imagery of retailer car parks to count the number of cars present, and use that data to estimate the retailer’s revenue at high frequency. We can analyse night lights as a proxy for GDP in countries where economic data is patchy. Indeed, these examples are discussed in The Book of Alternative Data, which Alexander Denev and I are writing.
I’ve been dabbling with a few Python libraries geared towards computer vision, including OpenCV, as well as cvlib, a simple wrapper which sits on top of it. Using these libraries, it takes only a few lines of Python code to run a pre-trained model (based on YOLOv3, which uses a neural network) to identify common objects in an image.
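As a rough sketch of what those few lines look like, the snippet below uses cvlib’s `detect_common_objects` (which runs a YOLOv3 model pre-trained on the COCO dataset) to count cars in an image. The file name `car_park.jpg` is a hypothetical example, and the code assumes OpenCV and cvlib are installed.

```python
import os

def count_label(labels, target):
    """Count how many detections match one class label, e.g. 'car'."""
    return sum(1 for label in labels if label == target)

# Only run the detection if the (hypothetical) sample image is present;
# cvlib downloads the YOLOv3 weights the first time it is used.
if os.path.exists("car_park.jpg"):
    import cv2
    import cvlib as cv
    from cvlib.object_detection import draw_bbox

    image = cv2.imread("car_park.jpg")

    # Returns bounding boxes, class labels and confidence scores
    # for each object found in the image
    bbox, labels, conf = cv.detect_common_objects(image)
    print("cars detected:", count_label(labels, "car"))

    # Draw the boxes on the image and save an annotated copy
    annotated = draw_bbox(image, bbox, labels, conf)
    cv2.imwrite("car_park_annotated.jpg", annotated)
```

For the car-park use case above, the count of `"car"` labels per image, tracked over time, is the series you would feed into a revenue estimate.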
The main drawback of using a neural network for this type of task is that it can be a black box. It can be tricky to explain why the model has given a certain output, given that the features it discovers aren’t always as “obvious” as handcrafted features such as edges. However, I would argue that at least in this case, we can understand how accurate the model is (we can tell whether it has counted a car just by looking at the image). I think in the coming years we’re going to see a lot more image data being used in the trading process, given the greater availability of both the data and the tools to structure it.