Resource Management

At this point, we assume your cluster is running and users have started a couple of projects. Now some users ask you for more resources for their projects, or you want their projects to run on a dedicated subset of all nodes. This page covers the following approaches:

  • Licenses: define an upgrade scheme that your users can apply to one or more of their projects by assigning the license to a project.

  • Quick’n’dirty: directly upgrade a project, without creating a license.

  • Heterogeneous clusters: distribute projects in a heterogeneous cluster, where some nodes are dedicated to specific users or groups.

  • GPU nodes: tell certain projects to run on nodes with GPUs.


Licenses

Creating “Licenses” is a way to define the resource request for a specific project. To do so, open your “Admin” panel and expand the “Licenses” section. Then click “Create license” to see a form for configuring a new license. (If you already have licenses, you can search for them and modify their parameters.)

  • Title: give it a name, which is used to identify the license in the UI. The title and the description help your users understand what this license does.

  • License manager: search for the accounts of one or more users who should see that license. They will be able to select it when assigning it to their projects (otherwise, they will have to know its ID).

  • Run limit: how many actively running projects this license can upgrade at the same time. If this is a course, and the license is distributed via the course management configuration, that limit must be at least the number of students, because each student has their own project.

  • Activates/Expires: define start and end dates – if no end date is given, it’s a perpetual license.

  • Quota: here, you can set resource parameters for the project Pod and other details:

    • Higher CPU and Memory limits. The associated resource requests are computed from the overcommit ratio specified in the global.settings.default_quotas setting.

    • Always Running: this is a neat feature for users, because it keeps some of their precious projects around. If you enable this, the project is restarted whenever it is stopped by the user or was running on a node that has been decommissioned. This is useful for long-running calculations, or just to make the files immediately accessible without having to wait for the project to start up again. In addition, the state of running sessions, such as those of a Jupyter Notebook, is not lost.

  • There are two special quotas for on-prem setups:

  • Please do not use “Upgrades”. This is a legacy feature.
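The relationship between limits and requests mentioned under “Quota” can be illustrated with a small sketch. It assumes that a request is simply the limit divided by an overcommit ratio; the actual computation and the exact shape of global.settings.default_quotas in your deployment may differ. The names `Quota` and `compute_requests` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Limits as they might be entered in a license (hypothetical shape)."""
    cpu_limit: float      # CPU cores
    memory_limit_mb: int  # memory in MB


def compute_requests(quota: Quota, cpu_overcommit: float, mem_overcommit: float) -> dict:
    """Derive requests from limits, ASSUMING request = limit / overcommit ratio."""
    if cpu_overcommit < 1 or mem_overcommit < 1:
        raise ValueError("overcommit ratios are assumed to be >= 1")
    return {
        "limits": {"cpu": quota.cpu_limit, "memory_mb": quota.memory_limit_mb},
        "requests": {
            "cpu": quota.cpu_limit / cpu_overcommit,
            "memory_mb": quota.memory_limit_mb / mem_overcommit,
        },
    }


resources = compute_requests(Quota(cpu_limit=2, memory_limit_mb=8000), 10, 5)
print(resources)
```

Under this assumption, a higher overcommit ratio lets the scheduler place more projects on a node, at the price of more contention when many of them are busy at once.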


  • If licenses are compatible, more than one can be active and the quotas add up. The overall limit is defined in global.settings.max_upgrades.

  • To keep things simple, advise your users to use only one license per project.

  • You can modify an existing license in place, so users do not have to change the licenses they have already applied.

  • To expire a license, change the expiration date to be in the past. This takes effect when the project is restarted.
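The notes above (quotas of compatible licenses add up, the sum is capped, and expired licenses no longer count after a restart) can be sketched as follows. This is only an illustration of the described behavior, not the actual implementation: the `License` class, the quota keys, and the dictionary shape used for the cap are assumptions, and the compatibility check between licenses is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class License:
    title: str
    activates: datetime
    expires: Optional[datetime]  # None means a perpetual license
    quota: dict                  # hypothetical keys, e.g. {"cpu": 1, "memory_mb": 2000}


def is_active(lic: License, now: datetime) -> bool:
    """A license counts if it has started and has not yet expired."""
    return lic.activates <= now and (lic.expires is None or now < lic.expires)


def effective_quota(licenses, max_upgrades: dict, now: datetime) -> dict:
    """Sum the quotas of all active licenses, then cap each value.

    `max_upgrades` stands in for the limit configured in
    global.settings.max_upgrades; its real structure may differ.
    """
    total: dict = {}
    for lic in licenses:
        if not is_active(lic, now):
            continue
        for key, value in lic.quota.items():
            total[key] = total.get(key, 0) + value
    return {key: min(value, max_upgrades.get(key, value)) for key, value in total.items()}


utc = timezone.utc
licenses = [
    License("course", datetime(2024, 1, 1, tzinfo=utc), None, {"cpu": 1, "memory_mb": 2000}),
    License("expired", datetime(2024, 1, 1, tzinfo=utc), datetime(2024, 5, 1, tzinfo=utc), {"cpu": 4}),
    License("research", datetime(2024, 1, 1, tzinfo=utc), datetime(2025, 1, 1, tzinfo=utc), {"cpu": 2, "memory_mb": 4000}),
]
print(effective_quota(licenses, {"cpu": 2, "memory_mb": 16000}, datetime(2024, 6, 1, tzinfo=utc)))
```

In this sketch the evaluation time `now` corresponds to the moment the project starts, which mirrors the remark that an expiration only takes effect on restart.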


Quick’n’dirty

Besides the structured approach of creating and distributing Licenses, you can also use your powers as an Admin to upgrade a project directly. To do so, ask for the project’s UUID and then open https://<your-domain>/projects/<UUID>/settings in your browser (which opens that project’s settings). Then click the “Admin Quotas…” button in the “Project usage and quotas” section.

This reveals a panel where you can set base upgrades for the project. They are complementary to the upgrades given by a license, i.e. they do not add up. Any changes require a restart of the project.

Of particular interest are probably raising the memory (“Shared RAM”), increasing the “Idle Timeout”, or even setting the project to “Always Running”.
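If you handle such requests regularly, a small helper can build the settings URL from a UUID and reject malformed input early. This is a convenience sketch only; the function name `project_settings_url` and the domain `example.com` are placeholders, and only the URL pattern itself comes from the text above.

```python
import uuid


def project_settings_url(domain: str, project_id: str) -> str:
    """Return the settings page URL for a project.

    uuid.UUID raises ValueError if `project_id` is not a valid UUID,
    which catches copy-and-paste mistakes before opening the browser.
    """
    canonical = str(uuid.UUID(project_id))
    return f"https://{domain}/projects/{canonical}/settings"


print(project_settings_url("example.com", "123e4567-e89b-12d3-a456-426614174000"))
```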

Heterogeneous clusters


This section is a work in progress and only describes the basic idea.

Imagine you have several workgroups, and each wants to run its projects on its own dedicated set of nodes. Hence, you have to create a heterogeneous set of project nodes and manage access to them.

The idea is to add taints to each such pool of dedicated project nodes, such that a standard project is not allowed to run on them.

Licenses, as explained above, can modify the project pod with a Patch. To force a project to run only on the dedicated nodes, and to prevent any other projects from running there, two things need to be done:

If you’re using the Prepull service, it needs those taint tolerations as well.
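In plain Kubernetes terms, dedicating a node pool usually combines two pod spec fields: a toleration that allows the pod onto the tainted nodes, and a node selector that keeps it off all other nodes. The sketch below only assembles such a fragment as JSON. The taint key, the label, and the function name are hypothetical, and the exact format the license “Patch” field expects is not described here, so treat this as a starting point rather than something to paste verbatim.

```python
import json


def dedicated_pool_fragment(taint_key: str, taint_value: str,
                            label_key: str, label_value: str) -> dict:
    """Build a pod spec fragment for running on a dedicated, tainted node pool.

    Assumes the nodes were tainted and labeled beforehand, for example with
    `kubectl taint nodes <node> <taint_key>=<taint_value>:NoSchedule` and
    `kubectl label nodes <node> <label_key>=<label_value>`.
    """
    return {
        "spec": {
            # The toleration ALLOWS scheduling onto the tainted nodes.
            "tolerations": [{
                "key": taint_key,
                "operator": "Equal",
                "value": taint_value,
                "effect": "NoSchedule",
            }],
            # The node selector RESTRICTS scheduling to the labeled nodes.
            "nodeSelector": {label_key: label_value},
        }
    }


# Hypothetical pool name "group-a"; replace it with your own taint and label.
fragment = dedicated_pool_fragment("pool", "group-a", "pool", "group-a")
print(json.dumps(fragment, indent=2))
```

The taint alone keeps standard projects away from the pool, while the selector is what pins the licensed project to it; a toleration without a selector would merely permit, not force, scheduling there.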

GPU nodes

If you have nodes with one or more GPUs, you can create special licenses that request a GPU for every project to which such a license is applied.

How to set this up in detail is beyond the scope of this guide. It requires setting up GPU support on these nodes, changing the container runtime, and customizing the project’s software image.

What you need to know is how a project pod must be configured in order to request a GPU in your cluster. That change can be defined via a Patch.
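As a rough illustration of what such a pod configuration looks like, the sketch below builds a fragment that requests one GPU through an extended resource limit. It assumes the NVIDIA device plugin, which exposes GPUs under the resource name `nvidia.com/gpu`; other vendors use different names. The container name `project` and the function name are placeholders, and the exact format the license Patch expects is an assumption to verify against your installation.

```python
import json


def gpu_pod_fragment(gpu_count: int = 1,
                     resource_name: str = "nvidia.com/gpu",
                     container_name: str = "project") -> dict:
    """Build a pod spec fragment that requests GPUs for one container.

    In Kubernetes, GPUs are requested in the `limits` section of a
    container's resources; `container_name` is a hypothetical placeholder.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return {
        "spec": {
            "containers": [{
                "name": container_name,
                "resources": {"limits": {resource_name: gpu_count}},
            }]
        }
    }


print(json.dumps(gpu_pod_fragment(), indent=2))
```

If your GPU nodes are also tainted, the same license would additionally need a matching toleration so that the project pod may be scheduled there.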