At this point, we assume your cluster is running and there are a couple of projects started by users. However, some users ask you for more resources for their projects, or you want their projects to run on a dedicated subset of all nodes.
Licenses: define an upgrade scheme that your users can apply to one or more projects, by attaching a license to their project.
Quick’n’dirty: directly upgrade a project, without creating a license.
Heterogeneous nodes: distribute projects in a heterogeneous cluster, where some nodes are dedicated to specific users or groups.
GPU nodes: tell certain projects to run on nodes with GPUs.
Creating “Licenses” is a way to define the resource request for a specific project. For that, please open your “Admin” panel and expand the “Licenses” section. Then, click “Create license” to see a form for configuring a new license. (If you already have licenses, you can search for them and modify their parameters.)
Title: give it a name; it will be used to identify the license in the UI. Together with the description, it helps your users understand what this license does.
License manager: search for the accounts of one or more users who should see this license. They will be able to select it when assigning it to their projects (otherwise, they'll have to know the ID).
Run limit: how many actively running projects this license can upgrade at the same time. If this is a course, and the license is distributed via the course management configuration, that limit must be at least the number of students, because each student has their own project.
Activates/Expires: define start and end dates – if no end date is given, it’s a perpetual license.
Quota: here, you can set resource parameters for the project Pod and other details:
Higher CPU and Memory limits. The associated resource requests will be computed based on the overcommit ratio specified in
Always Running: this is a neat feature for users, because it keeps their precious projects around. If you enable this, the project will be restarted if it is stopped by the user or was on a node that has been decommissioned. This is useful for long-running calculations, or just to make the files immediately accessible without having to wait for the project to start up again. Also, the state of running sessions, e.g. in a Jupyter Notebook, is not lost.
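To illustrate the relationship between limits, requests, and the overcommit ratio mentioned above: with an overcommit ratio of 2, a pod's resource request is half its limit. A minimal sketch (the numbers are hypothetical, not CoCalc defaults):

```shell
# Illustrative only: the actual overcommit ratio is a site setting;
# the numbers below are made up.
LIMIT_MB=4096      # memory limit granted by the license
OVERCOMMIT=2       # overcommit ratio (limit : request)
REQUEST_MB=$((LIMIT_MB / OVERCOMMIT))
echo "request: ${REQUEST_MB} MB"   # → request: 2048 MB
```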
Additionally, there are three special quotas for on-prem setups:
Please do not use “Upgrades”. This is a legacy feature.
If licenses are compatible, more than one can be active and the quotas add up. The overall limit is defined in
To keep things simple, advise your users to use only one license per project.
You can modify an existing license, which saves users from having to change the license applied to their projects.
To expire a license, change the expiration date to be in the past. This takes effect when the project is restarted.
Besides the structured approach of creating and distributing Licenses, you can also jump in with your powers as an Admin and directly upgrade a project.
For that, ask for the project's UUID and then open https://<your-domain>/projects/<UUID>/settings in your browser (which opens that project's settings).
Then click the “Admin Quotas…” button in the “Project usage and quotas” section.
This reveals a panel where you can set base upgrades for the project. These complement the upgrades given by a license, i.e. they do not add up with them. Any changes require a restart of the project.
Of particular interest is probably raising memory (“Shared RAM”), increasing the “Idle Timeout”, or even setting it to “Always Running”.
This feature was added in version 2.11.0.
Imagine you have several workgroups that want to run their projects on their own dedicated set of nodes. Possible motivations are:
They want to have a certain amount of (possibly very large) resources available at all times,
They want to have a certain type of hardware, e.g. with GPUs,
They pay for the specific hardware and want to make sure that only their projects run on it.
CoCalc Cloud is a single system, but you can partition your cluster in such a way that some machines are dedicated to specific users or groups.
The idea is to add taints and labels, with a specific “name”, to certain nodes in your cluster – as explained below. Then create a license, which encodes the name of these dedicated machines and resource quotas.
Important: the taint and the label must have the same name, and that name must be compatible with the Kubernetes naming schema. Also, don't confuse this name with the node names!
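For reference, a Kubernetes label value must be at most 63 characters long, begin and end with an alphanumeric character, and may contain `-`, `_`, and `.` in between. A quick sketch to check a candidate name before using it (the name `foo` is just an example):

```shell
# Validate a candidate taint/label value against the Kubernetes
# label-value syntax: 63 chars max, alphanumeric at start/end,
# '-', '_', '.' allowed in the middle.
NAME="foo"   # candidate group name; replace with your own
if printf '%s' "$NAME" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?$'; then
  echo "valid"
else
  echo "invalid"
fi
```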
Decide on a name for your group of one or more Dedicated VM(s) – this is the common value used for both the taint and the label below.
All nodes in your Kubernetes cluster that should be part of this group have their own distinct node name.
With that, for each node the following must be set:
kubectl taint nodes [node-name] cocalc-dedicated_vm=[taint_name]:NoSchedule
kubectl label nodes [node-name] cocalc-dedicated_vm=[taint_name]
e.g. if your nodes are vm001 and vm002 and you name that group of Dedicated VMs foo:
kubectl taint nodes vm001 cocalc-dedicated_vm=foo:NoSchedule
kubectl label nodes vm001 cocalc-dedicated_vm=foo
kubectl taint nodes vm002 cocalc-dedicated_vm=foo:NoSchedule
kubectl label nodes vm002 cocalc-dedicated_vm=foo
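To double-check the setup, you can inspect the nodes afterwards (read-only commands; node names and the value foo follow the example above, so these require a live cluster configured as shown):

```shell
# List the nodes carrying the label – should print vm001 and vm002
kubectl get nodes -l cocalc-dedicated_vm=foo

# Show the taints on one of the nodes – should include
# cocalc-dedicated_vm=foo:NoSchedule
kubectl describe node vm001 | grep -A 2 'Taints:'
```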
Finally, to create the corresponding license:
Open Admin → Site Licenses….
Create a new license.
Enter the [taint-name] of the Dedicated VM (e.g. foo) in the text field for the Dedicated VM name.
Configure RAM and CPU quotas as well.
Save the license and double-check the configuration in the shown quota JSON object.
Send the license key to those of your users who should be allowed to run their projects on these machines.
Once they add that license key to their project, it restarts, and the management service outfits the project pod with the corresponding taint toleration and enforces running that pod on a node with a matching label.
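The effect on the pod spec can be sketched as follows. This is a plausible rendering in standard Kubernetes pod syntax, not the literal output of the management service; the value foo follows the example above:

```yaml
# Hypothetical excerpt of a project pod spec after applying the license.
spec:
  tolerations:                    # allows scheduling onto the tainted nodes
    - key: cocalc-dedicated_vm
      operator: Equal
      value: foo
      effect: NoSchedule
  nodeSelector:                   # enforces scheduling onto the labeled nodes
    cocalc-dedicated_vm: foo
```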
This section is work in progress and only describes the basic idea.
If you have nodes with one or more GPUs, you can create special licenses that request a GPU for any project to which such a license is applied.
What you need to know is how a project pod must be configured in order to request a GPU in your cluster. That change can be defined via a Patch.
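As a sketch, assuming your cluster runs the NVIDIA device plugin (which exposes the extended resource nvidia.com/gpu) and that the project container is named project, such a patch could look like:

```yaml
# Hypothetical patch for the project pod; the container name and the
# GPU resource key depend on your cluster setup.
spec:
  containers:
    - name: project
      resources:
        limits:
          nvidia.com/gpu: 1   # for extended resources, the limit implies the request
```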