Publish a Python Wheel to GCP Artifact Registry with Poetry

Recently, I’ve been building a Python project whose output isn’t a Docker image. Instead, I need a runnable file. Why? Because I need to talk to the machine directly and not to Docker. Because I need to talk to the GPU drivers directly and not through another abstraction layer.

So, I needed to create a runnable file in Python: a wheel file. All of this with Poetry, and integrated into my CI/CD pipeline.

Let’s break down the steps.

Building the Project With Poetry

My first solution was this: when a VM is created in my Compute Instance Group, I clone the git project, install Poetry, sync the project and run it. Just like on my laptop.

But! I need a token to clone the project. I never liked this solution. So, I looked for an alternative.

I wanted a solution close to the Docker workflow: build locally, push to a registry, and download the latest version at startup. I searched for a way to do the same with a Python project. Here come the wheel files.

By default, I only run poetry sync on my project, because I mainly use Docker images and don’t need a runnable file. But Poetry lets me go a little further. Let’s start by configuring the project to build a wheel file.

Here is an extract of the pyproject.toml file:

[tool.poetry]
# ...
packages = [
    { include = "main_dir" },
]

It tells Poetry to include the main_dir directory when building a package, that is, when building a wheel. With this in place, poetry sync no longer needs the --no-root option, and poetry build puts the wheel file in the dist folder.
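A quick way to check that the packages setting did its job is to look inside the wheel, since a wheel is just a zip archive. Here is a minimal sketch using only the standard library. The function names are mine, and main_dir is the directory from the extract above.

```python
import zipfile
from pathlib import Path


def top_level_entries(wheel_path: Path) -> set[str]:
    """Return the top-level names stored in a wheel (a wheel is a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return {name.split("/", 1)[0] for name in wheel.namelist()}


def wheel_contains_package(wheel_path: Path, package_dir: str = "main_dir") -> bool:
    """True if the package directory declared in pyproject.toml is in the wheel."""
    return package_dir in top_level_entries(wheel_path)
```

After a poetry build, calling wheel_contains_package on each .whl file in the dist folder should return True if the include entry is correct.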

Create a Python Registry to Store My Wheel File

The project is hosted in GCP. And GCP has Artifact Registry, which lets me choose which type of files a repository stores: Docker images, Python wheel files, Java JAR files…

Once the Artifact Registry repository is created, I need to configure two things: Poetry to push to the right place, and the VM to pull from the right place.

Let’s start on the Poetry side. Poetry has another configuration file, poetry.toml, which is optional and accepts this kind of configuration:

[repositories.my-registry]
url = "https://europe-west1-python.pkg.dev/my-gcp-project/my-registry/"

Now, after building my Poetry project, I can publish to the registry. How?

poetry publish --build --repository my-registry

But! (Yes, another but) I need to remove the dist folder, which is the default output folder for the wheel file. Otherwise, Poetry will ask whether I want to overwrite the existing files. And so far, I haven’t found an option to force a “yes, overwrite the files, go ahead”.

rm -fr dist/
poetry publish --build --repository my-registry

But! (Let’s continue with the buts.) Each version I publish will carry the version number set in the pyproject.toml file. Overwriting versioned files is never a good practice. So, I must use a dynamic version number in my pyproject.toml file.

sed -i.back 's/^version = "[^"]*"/version = "1.0.post'"$(date +%Y%m%d%H%M)"'"/' pyproject.toml
rm -fr dist/
poetry publish --build --repository my-registry
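If the sed one-liner is hard to read, the same version stamping can be sketched in Python. This is only an illustration of what the substitution does, not a replacement for my pipeline. The function name and the base parameter are mine, and unlike sed it only touches the first matching line.

```python
import re
from datetime import datetime

# Matches a line such as: version = "0.1.0"
VERSION_LINE = re.compile(r'^version = "[^"]*"', flags=re.MULTILINE)


def stamp_version(pyproject_text: str, now: datetime, base: str = "1.0") -> str:
    """Replace the first version line with <base>.post<YYYYMMDDHHMM>."""
    stamp = now.strftime("%Y%m%d%H%M")  # same format as `date +%Y%m%d%H%M`
    new_line = f'version = "{base}.post{stamp}"'
    return VERSION_LINE.sub(new_line, pyproject_text, count=1)
```

The resulting version, for example 1.0.post202401020304, is a post-release in Python's version scheme, so every build gets a unique number that sorts after the previous one.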

That’s all for the publishing side. What about pulling the file?

As I’m in a VM running in GCP, and the Artifact Registry is also in GCP, I must grant the appropriate permissions for the operation.

Once all the IAM permissions are configured, I need to run the following command in the VM (it will help me with the next steps).

gcloud artifacts print-settings python --project=my-gcp-project \
    --repository=my-registry \
    --location=europe-west1

This will print all the configuration I need to put into my VM instance. It will look something like this:

cat > $HOME/.pypirc<< EOF
[distutils]
index-servers =
    my-registry

[my-registry]
repository: https://europe-west1-python.pkg.dev/my-gcp-project/my-registry/
EOF

mkdir -p $HOME/.pip
cat > $HOME/.pip/pip.conf<< EOF
[global]
extra-index-url = https://europe-west1-python.pkg.dev/my-gcp-project/my-registry/simple/
EOF
EOF
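Both files rely on the same URL pattern: the location, the project and the repository name. To keep the values consistent between the .pypirc and the pip.conf, the pattern can be sketched as a small Python helper. The function names are mine, and the pattern is simply the one visible in the URLs above.

```python
def repository_url(location: str, project: str, repository: str) -> str:
    """URL used for publishing (the `repository` entry of .pypirc)."""
    return f"https://{location}-python.pkg.dev/{project}/{repository}/"


def simple_index_url(location: str, project: str, repository: str) -> str:
    """URL used by pip for installing (the `extra-index-url` of pip.conf)."""
    return repository_url(location, project, repository) + "simple/"


def pip_conf(location: str, project: str, repository: str) -> str:
    """Content of a pip.conf that adds the registry as an extra index."""
    url = simple_index_url(location, project, repository)
    return f"[global]\nextra-index-url = {url}\n"
```

With europe-west1, my-gcp-project and my-registry as arguments, pip_conf returns the same two lines as the pip.conf shown above.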

Ok, now both my CI/CD and my VM instance are ready to send a Python wheel file from one to the other.

Download a Python Wheel File and Run It

Once the file is available in the GCP Artifact Registry, I need to pull it and run it. It’s quite simple now.

pip install my-project

This first command installs my project on the machine (no need for Poetry on the production instance). pip looks for the latest version of my-project in both the public PyPI index and the extra index, my-registry.

In my case, I’ve added my-registry as an extra index. Why? Because my project needs a lot of dependencies (Flask, SQLAlchemy, Pandas…), and I don’t want to store all of them in my GCP Artifact Registry (and pay for that). So, I let pip download what’s available in the public index, and add an extra index where it can find the remaining library (which is my project).
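Why does “latest” mean “the most recent build” here? Because the timestamp in the post-release segment grows with every publication. Here is a simplified sketch of that ordering. It only handles versions of the form X.Y.post<timestamp> that my pipeline generates. It is not how pip is implemented, and the function names are mine.

```python
import re

# Only the X.Y.post<digits> form produced by the version stamping step.
POST_VERSION = re.compile(r"^(\d+)\.(\d+)\.post(\d+)$")


def sort_key(version: str) -> tuple[int, int, int]:
    """Turn '1.0.post202401020304' into (1, 0, 202401020304)."""
    match = POST_VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version}")
    major, minor, post = match.groups()
    return int(major), int(minor), int(post)


def latest(versions: list[str]) -> str:
    """Return the highest version, i.e. the most recent build."""
    return max(versions, key=sort_key)
```

Given 1.0.post202312312359 and 1.0.post202401020304, latest returns the second one, which is the build I want on a fresh VM.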

Once the project is downloaded, I can run it simply as a Python project.

python -m my_job

I use this command because my root module is called my_job and it has a __main__.py file, which is the starting point of my application. You must adapt this step to the way you start your application.

Conclusion

Docker images are not the only way to run an application. I can directly run wheel files for Python projects or JAR files for Java projects.

The advantage of Docker images is that I can configure all the needed dependencies inside them. With a wheel file, I must do that manually on the VM and create an image template to start from when running my application.

