I went on a long walk yesterday. You could argue that it was pointless. I was going nowhere in particular, and ended up where I started. However, on the way I got a bit of exercise, and it was a nice break from being at home. Another good reason is that I often seem to come up with ideas when I walk and it’s a good way to explore.
So in this column I’d like to go with this theme of trying something a bit different: yes, trying Linux for Python development, an idea which I thought up on one of my walks. If you’re reading this post on a computer, it’s likely you’re on Windows. However, pretty much all the internet infrastructure that delivered this page to you has likely used Linux.
So what are the benefits of using Linux if you are coding in Python? First, it’s free! It’s also quite robust for production. Some Python libraries are better supported on Linux (and in some cases the current versions won’t work on Windows). A lot of these might be useful from a financial perspective:
- RAPIDS – Python libraries which include cuDF (GPU accelerated DataFrames)
- datatable – high performance Pandas-like library, similar to the version in R
- celery – distributed task framework
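A quick way to see what your current machine supports is to ask Python itself. The sketch below is only illustrative: the helper name `report_environment` is my own invention, and it assumes the import names `cudf`, `datatable` and `celery` correspond to the libraries listed above.

```python
import importlib.util
import platform


def report_environment(modules=("cudf", "datatable", "celery")):
    """Return the OS name and which of the given modules can be imported."""
    available = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        available[name] = importlib.util.find_spec(name) is not None
    return {"system": platform.system(), "available": available}


if __name__ == "__main__":
    info = report_environment()
    print("Operating system:", info["system"])
    for name, ok in info["available"].items():
        print(f"  {name}: {'installed' if ok else 'not installed'}")
```

Running this on Windows and again on a Linux box gives a simple side-by-side view of which libraries you can actually use in each environment.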
There are many Linux distributions. Typically I use Ubuntu as my Linux of choice, which is a Debian-based distribution. There are also other Debian-based distributions which are particularly well suited to desktop use, such as Mint and Pop!_OS, which are built on Ubuntu. Another group are the RPM-based distributions, which include Red Hat. In general, I’ve found Ubuntu easier to use. So how can you go about using Linux? We’ll go through a few ways below. Of course there are many other ways and distributions you could use, but hopefully this will be a start!
Spin up a Linux box on the cloud
This is one of the easiest ways to use Linux. Services like Amazon’s EC2 allow you to load up a Linux box on the cloud very quickly. You can then log into it remotely over SSH using a program like PuTTY or my current favourite, MobaXterm. You can also remotely debug code using PyCharm. You do need to be careful in monitoring costs: don’t spin up a lot more computation or storage than you need. You can also take a hybrid approach, developing on your own local Linux installation and then running production on a Linux box in the cloud.
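If you would rather script the connection than use a GUI client, the same login can be driven from Python through the standard `ssh` command-line client. This is a minimal sketch, assuming an OpenSSH client is installed on your machine. The host name, user name and key path are placeholders I have made up, not real values.

```python
import subprocess


def build_ssh_command(host, user, key_path, remote_command):
    """Build the argument list for running one command over SSH."""
    return ["ssh", "-i", key_path, f"{user}@{host}", remote_command]


def run_remote(host, user, key_path, remote_command):
    """Run a command on the remote Linux box and return its output."""
    cmd = build_ssh_command(host, user, key_path, remote_command)
    # check=True raises CalledProcessError if ssh or the command fails
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical values: replace with the details of your own instance
cmd = build_ssh_command("my-linux-box.example.com", "ubuntu",
                        "/home/me/.ssh/my-key.pem", "python3 --version")
print(" ".join(cmd))
```

Calling `run_remote` with your real details would then execute the command on the cloud box and hand back whatever it printed.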
Use a virtual machine on your computer and install Linux on that
You can download a program like VirtualBox which lets you install Linux in a sandbox environment, on Windows. It’s completely isolated from your Windows installation. The drawback is that your VM will likely be slower than running Linux natively.
Install Linux on WSL (Windows Subsystem for Linux)
Microsoft have added WSL (and more recently WSL2, which I’ve just switched to) as an optional Windows feature in recent years. It allows you to install Linux inside Windows. Unlike the VM approach, it’ll largely run Linux at native speed. There’s a choice of several different Linux distributions to install, including Ubuntu, which I’ve used on WSL.
Furthermore you can easily access Windows resources through it (although accessing some components like GPU is only in beta at the moment). It’s also easier to use than the VM approach in my view. It’s probably what I would recommend for anyone who has Windows and wants to play around with Linux on a local machine.
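To give a flavour of how Windows files appear from inside WSL, here is a small sketch. It assumes the default WSL setup, where Windows drives are mounted under `/mnt/<drive letter>`, and it relies on the common heuristic that the WSL kernel reports "microsoft" in its release string. Both function names are my own.

```python
import platform
from pathlib import PureWindowsPath


def is_wsl():
    """Heuristic: WSL kernels report 'microsoft' in their release string."""
    return "microsoft" in platform.uname().release.lower()


def windows_to_wsl_path(win_path):
    """Map a Windows drive path onto the default WSL mount under /mnt."""
    p = PureWindowsPath(win_path)
    drive = p.drive.rstrip(":").lower()
    if len(drive) != 1:
        raise ValueError(f"Expected a drive-letter path, got: {win_path!r}")
    # p.parts[0] is the drive root; the remaining parts are path components
    return "/mnt/" + drive + "".join("/" + part for part in p.parts[1:])


print("Running under WSL:", is_wsl())
print(windows_to_wsl_path(r"C:\Users\me\data.csv"))
```

If your WSL installation uses a custom mount point, the `/mnt/` prefix would need adjusting accordingly.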
Install Linux on dual boot on your desktop
If you already have Windows, you can install Linux directly so it’ll dual boot. At startup you can choose whether to boot into Windows or Linux. Despite using Linux for nearly 20 years, I have to admit I had never tried this, till the past week! The first important point is to make sure you back up all your data before attempting to do this. You’ll need to partition your hard drive to give it an area where Linux can install.
I installed Ubuntu via a USB stick. When using Linux as a desktop operating system and using it via the GUI, you might have certain issues in terms of finding drivers for USB devices or other peripherals. I installed it on a secondary machine. The main reason I installed Linux on this machine was to use Python to code on the GPU to use RAPIDS (which includes a GPU style DataFrame library). I’ll be writing about RAPIDS more in the coming weeks!
This meant I had to install the latest NVIDIA driver for my graphics card on Ubuntu. It did end up working but I had to fiddle with my BIOS settings. With Windows this stuff tends to work much quicker. Other things I needed to do were to give Linux access to my Windows partition on the same machine (using ntfs-3g) and also to a Windows share on my main computer. I’ve also created a shell script to write down these various steps in case I need to repeat them.
In practice, I usually log in remotely via SSH to this machine with MobaXterm, or sometimes use Windows’ Remote Desktop Connection if I want to use the desktop Ubuntu GUI (which was fairly easy to set up). Using the desktop GUI isn’t 100% necessary if I just want to log in remotely, but I thought it would be fun to try it! I don’t think I would exclusively use Linux as my main desktop operating system given the issues with peripherals and also given I use a lot of Windows applications including Excel (although it might be possible to run some of these via Wine).
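Once the driver is installed, a useful sanity check is whether it is visible from Python. The sketch below is not the script I mentioned, just an illustration: it assumes the `nvidia-smi` tool that ships with the NVIDIA driver is on your path, and the function name is my own.

```python
import shutil
import subprocess


def nvidia_driver_info():
    """Return a list of 'GPU name, driver version' strings, or None."""
    # If nvidia-smi is not on the PATH, the driver is probably not installed
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        # The tool exists but could not talk to the driver
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


info = nvidia_driver_info()
if info is None:
    print("No working NVIDIA driver found")
else:
    for gpu in info:
        print("Found GPU:", gpu)
```

If this reports no working driver, it is worth fixing that before trying any GPU libraries.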
Use a Mac
Mac OS X is a variant of Unix, so you basically already have something fairly close to Linux ready and waiting.
Whilst using Linux as a desktop operating system might not suit everyone, having access to a Linux box to remote login for Python development is definitely useful. It gives you access to more Python libraries and provides a robust way to run production code.