Category: Data

Machine learning for data cleaning

Every data scientist wants to do the fun stuff, i.e. building models! You might be forecasting a variable. Perhaps you are classifying a variable into a specific group. Machine learning is usually part of this, whether it’s a simpler linear model or something more complicated that can take into account non-linear relationships, whether that is a…

Machine learning in macro

There are buzzwords and there are buzzwords. The buzziest (if that is indeed a word) buzzword in technology is machine learning, whether it’s used to improve image recognition, natural language processing, etc. Although, I’ve got to admit, there’s still a long way to go… I still repeatedly get advised to…

It’s all about the data

I used to be an avid Formula 1 fan. I do still watch occasionally, but admittedly not as much as I used to. OK, there’s only one driver in the car, so on a superficial level, it doesn’t appear to be a team sport. Each team has two drivers, and in many instances they will race each…

Alpha vs beta datasets

There are some ingredients that are pretty common. Being commonly used does not make them unimportant. If anything, they are the most important when it comes to food. Without common ingredients like flour or oil, cooking is going to be quite limited. By the same token, very rare ingredients are not necessarily the most…

Breaking down quant questions

During the pandemic, the question of where to travel hasn’t been uppermost in most of our minds. After all, there’s little point in thinking of travelling when it’s been so difficult, and when our minds have been occupied elsewhere, particularly if you’ve been impacted most by the coronavirus, through illness or through work, such…