Ways to distribute your open source Python code

I recently cooked a burger. What was the objective of cooking the burger? Well, to eat it! Cooking a burger without eating it seems like a somewhat pointless exercise and, frankly, a waste of food. I suppose you could argue that you might cook a burger to learn how to cook, but even if that were the case, you’d probably still want to eat it, right?

When coding, there are also times when you code for the sake of it, such as when you’re learning a new tool or language. However, it is all building towards the objective of coding something that folks will actually “consume”, to use the burger metaphor. Let’s say you’ve been coding for a while and have built something pretty cool in Python. It’s now good enough to release into the world. The burger has been cooked to perfection, so to speak, and is ready to be served. How can you distribute your Python code so that people want to consume it? I went through this process with tcapy, my transaction cost analysis library, which I recently open sourced.

If it’s an open source library, GitHub is usually a good place to host your code (if it’s a closed source project, you could potentially use a private repo, if you’re willing to share the code with a small number of clients). Of course, GitHub doesn’t just host Python code; it has code in all sorts of languages. Folks will then be able to clone the GitHub project and play around with the code. It also makes it easy to keep track of changes, because everything is under version control. Developers can also post any issues they find with the code, which you can then address. It’s also important to have good documentation, so developers can understand how to use the library. For my tcapy library, I’ve been adding many Jupyter notebooks to show how to use it.

To make installation easier for developers, each time you create a new release you can upload your Python library to PyPI, so people can simply run pip to install it. You can also get pip to install all of your library’s Python dependencies, so users don’t have to install each one manually. You might also consider creating a conda package, which users can install with conda in much the same way as with pip. After installing via pip, developers can simply import your library and use it within their own projects.
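As a rough sketch of what this looks like from the user’s side, the snippet below uses the standard library’s importlib.metadata (Python 3.8+) to check whether pip has installed a library, and to list the dependencies it declared for pip to install alongside it. The name "mylibrary" is a hypothetical placeholder, not a real package; substitute your own.

```python
from importlib import metadata


def describe_installed(dist_name):
    """Return the installed version and declared dependencies of a
    distribution, or None if it is not installed in this environment."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when the package declares no dependencies
    requirements = metadata.requires(dist_name) or []
    return {"name": dist_name, "version": version, "requires": requirements}


# "mylibrary" is a hypothetical name used purely for illustration
info = describe_installed("mylibrary")
if info is None:
    print("Not installed yet; try: pip install mylibrary")
else:
    print(f"{info['name']} {info['version']} declares "
          f"{len(info['requires'])} dependencies")
```

The declared dependencies are what pip reads when it installs your library, which is why users don’t need to install each one by hand.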

Let’s say, though, that you also want people to use your Python library like an application, because it has a GUI which you’ve developed. In other words, your user is not always another coder who will call the library, but sometimes someone who interacts directly with your GUI application. tcapy is like this. Developers can call tcapy directly and develop their own solutions on top of it. However, end users, like traders, can also use the web GUI and the Excel/xlwings interface, without having to code a single line of Python. Another point to bear in mind is whether it is worth splitting your library into smaller, easier-to-use chunks, which is something I’m currently thinking about for tcapy too.

Your software library might also have many other dependencies that need to be installed, such as a web server and databases, in addition to the actual library and its Python dependencies. One way to deploy your application and get it to install these dependencies is to use Docker. For my tcapy application, there are quite a lot of dependencies beyond the Python libraries it uses.

Docker enables you to create a container to run your application and install all its dependencies, which can also be run in their own containers. The way to think of a container is that it’s like a lightweight VM, but one which runs faster than a VM, because it’s not replicating the whole OS for every application. Because the container is a sandbox, everything is installed without touching your main host OS, whereas installing dependencies directly on the host can often cause issues. As such, it takes a user just a few commands to install the Docker container and run it. Since the application is all Dockerized, it’s much easier to install. If it’s easier to install, users are more likely to use it! All those hours of coding are only going to be worth it if folks use what you’ve built. It’s also worth pointing out that Docker can be used for any project, not just for open source tools. If you cook a burger, it must be eaten!
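To give a flavour of those “few commands”, here is a minimal sketch that assembles a docker build and a docker run command from Python. The image name, directory and ports are hypothetical placeholders, and this is not how tcapy itself is set up. The sketch only prints the commands; you could pass each list to subprocess.run on a machine where Docker is installed.

```python
import shlex


def docker_build_command(image_tag, context_dir="."):
    """Command to build an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", image_tag, context_dir]


def docker_run_command(image_tag, host_port, container_port):
    """Command to run the image, mapping a host port to a container port.
    --rm removes the container once it exits."""
    return ["docker", "run", "--rm",
            "-p", f"{host_port}:{container_port}", image_tag]


# Hypothetical image name and ports, for illustration only
commands = [
    docker_build_command("myapp:latest"),
    docker_run_command("myapp:latest", 8080, 80),
]

for cmd in commands:
    # To actually execute: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments, rather than a single string, avoids shell quoting problems if you do hand it to subprocess.run.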

I’d like to thank Thomas Schmeltzer for helping me create the necessary code to Dockerize tcapy. Thomas also created a Binder version of the Jupyter notebooks, which enables folks to use the tcapy notebooks without having to install anything locally.

Developing your software is only the first part of the project, whether it is open source or not. The next part is distributing it, and that deserves careful thought too. The easier you make it for developers to follow what you’ve done, the more likely they are to use your library. Making it easy to install reduces the barriers to using your software, and should help you attract more users.



by Saeed Amen • July 19, 2020
