This document describes the architecture of CoCalc in Kubernetes. It’s helpful to know a bit about the services it is composed of, but it’s not necessary to understand everything in every detail.
Here is a high-level overview:
Database: holds all the information about users, projects, and everything else.
Storage: holds all files edited and published by users via their projects.
Hub services: the entry points for users and projects from the internet. They are responsible for authentication, authorization, and routing.
Project pods: the actual containers running the user projects. These project pods are managed by a set of microservices called
Manage, which are responsible for starting, stopping, and monitoring projects.
In a nutshell, this is what they’re doing:
Some of them will talk to the database → Database,
A few services will dynamically create and delete pods (aka “CoCalc Projects”) inside that cluster → Manage,
and many of these pods will mount directories of a shared filesystem → Storage.
After you have a basic understanding of the architecture, you can continue preparing your cluster.
hub-websocket: clients from the web connect via websockets. This service also handles most interactions with the database, etc.
During normal usage, up to 50 simultaneous connections per pod are possible; around 30 is ideal.
It’s fine to run 5 or more websocket hubs.
hub-proxy: establishes a connection to the projects; requires a client with valid authentication.
During normal usage, at least 50 simultaneous connections are possible.
It’s fine to run 5 or more proxy services.
hub-next: serves dynamic pages like the landing page. This service also renders shared files of users at /share, or even at a custom name at /[user name]/[project nickname]/[share nickname]. There should be at least two hub-next services running.
hub-maintenance-*: single pods that remove and compact data in the database.
hub-stats: single pod that collects statistics about CoCalc itself.
hub-api: serves the API.
Restart an individual hub service, e.g. via k delete pod -l run=hub-next, or restart all hubs via k delete pod -l group=hub (here, k is an alias for kubectl).
manage-action: triggered when a project is told to start, stop, restart, etc. Usually, the user is requesting to start a project, which is recorded in the database as a request to start the project. This microservice uses Postgres’s LISTEN/NOTIFY capability to listen for and react to such requests and actually starts the project. Behind the scenes,
manage-action not only reads the database, but also the project-image ConfigMaps to determine what to do. After compiling all the available information, it sends a Pod configuration to the Kubernetes API server, which then starts the project.
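The flow can be sketched roughly like this – a minimal Python illustration, where the request/config shapes and the build_pod_config helper are invented for illustration (the real manage-action compiles much richer information from Postgres and the ConfigMaps):

```python
# Sketch of manage-action's core step: turn a start request plus the
# project-image configuration into a Pod manifest for the k8s API server.
# All field names below are illustrative, not CoCalc's actual schema.

def build_pod_config(request: dict, image_config: dict) -> dict:
    """Compile a start request and image info into a Pod manifest (sketch)."""
    project_id = request["project_id"]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"project-{project_id}",
            "labels": {"run": "project", "project_tag": image_config["tag"]},
        },
        "spec": {
            "containers": [{
                "name": "project",
                "image": f"{image_config['repo']}:{image_config['tag']}",
                # the project's subdirectory of the shared filesystem
                "volumeMounts": [{"name": "projects",
                                  "subPath": project_id,
                                  "mountPath": "/home/user"}],
            }],
        },
    }

pod = build_pod_config(
    {"project_id": "123e4567-e89b-12d3-a456-426614174000", "action": "start"},
    {"repo": "registry.example.com/project", "tag": "2024-01"},
)
```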
manage-state: this service listens to the k8s API for changes regarding projects and updates the database accordingly. It’s the companion of
manage-action. In particular, once a project pod has started, it updates the database; this tells the user that the project is running, and shortly thereafter the client will connect. The hub services also connect to that project to establish a communication channel.
manage-idle: periodically checks whether projects are idle and stops them. It also starts any stopped “always running” projects that should be running, and stops projects that are stuck in pending (cleanup).
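That decision logic can be sketched as follows; a hedged illustration where the state names, fields, and the idle_action helper are made up, not manage-idle's actual implementation:

```python
def idle_action(project: dict, now: float, idle_timeout: float = 1800.0) -> str:
    """Decide what an idle-checker should do with a project (sketch)."""
    state = project["state"]
    if state == "running":
        if project.get("always_running"):
            return "keep"                      # never stop always-running projects
        if now - project["last_edited"] > idle_timeout:
            return "stop"                      # idle for too long
        return "keep"
    if state == "stopped" and project.get("always_running"):
        return "start"                         # should be running: start it again
    if state == "pending" and now - project["started"] > idle_timeout:
        return "stop"                          # stuck in pending: clean up
    return "keep"
```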
manage-copy: this watches the copy_paths table of the database for requests to copy files, then starts projects if necessary, issues copy operations between projects, and finally writes the status back to the database. These copy operations are rate-limited.
manage-share: similar to
manage-copy, but for shared files.
Restart all manage services via k delete pod -l group=manage, or one by one via k delete pod -l run=manage-action, …
User projects run as pods.
You can think of them as a container running a Linux environment with an unprivileged user inside.
Each project has its own $HOME directory on a shared filesystem.
Overall, these project pods will use up most of the resources,
because the services mentioned above scale much more slowly with the number of users.
Their resource requests and limits are configured via quota settings (only admins can do that), or via “licenses”. This means there could be projects requesting a significant chunk of available CPU or memory resources.
For this CoCalc setup, the “request” is calculated from the limits via
an over-commit ratio. This is set via the global site configuration
global.settings.default_quotas, or that same field in
Admin → Site Settings. The parameter
cpu_oc: 10 means the CPU
over-commit ratio is 1:10 – which is fine for interactive use, because
most of the time projects wait for user input. Similarly, the corresponding
memory parameter set to 5 means the memory over-commit ratio is 1:5.
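As a concrete illustration of the over-commit arithmetic – the function below is a sketch; CoCalc's actual rounding and units may differ:

```python
def request_from_limit(limit: float, oc_ratio: int) -> float:
    """Compute the k8s resource request from the limit and an over-commit ratio.

    A ratio of 10 (cpu_oc: 10) means a 1:10 over-commit, so a project
    with a 2-CPU limit only *requests* 0.2 CPU from the scheduler.
    """
    return limit / oc_ratio

assert request_from_limit(2.0, 10) == 0.2   # cpu: limit of 2 cores -> request 0.2
assert request_from_limit(5.0, 5) == 1.0    # memory: limit of 5 GiB -> request 1 GiB
```

This is why node sizing must be planned against the sum of requests, not limits: the scheduler packs pods by their requests, while actual usage can spike up to the limits.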
You have to plan/adjust the size and number of nodes to match the overall requests for projects.
Users are sensitive to interrupted projects, because they can’t continue working and their intermediate state in e.g. notebooks is lost. Hence you can’t just willy-nilly delete projects.
Users are also sensitive to slow startup times. That’s why the Prepull service exists, pulling the large project images before marking the node ready for running these projects.
You can also partition the cluster heterogeneously, such that some projects run only on specific nodes, while all other projects end up in a common pool of project nodes.
All three aspects mentioned above are using storage in the form of a shared filesystem. Usually, this is accomplished via an NFS server, but there are other options as well. The Kubernetes abstraction for this is a PersistentVolume (PV) with ReadWriteMany access mode.
Projects mount the /projects/[UUID] subdirectory as their home directory.
manage-copy mounts this directory to copy between projects,
while hub-next mounts it for sharing (publishing) files and serves the rendered files at /share.
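The mapping from project UUID to home directory is just a subpath on that shared volume; a small sketch with a hypothetical helper:

```python
import uuid

def project_home(project_id: str, root: str = "/projects") -> str:
    """Return a project's home directory on the shared filesystem,
    validating that the id is a well-formed UUID first (sketch)."""
    uuid.UUID(project_id)          # raises ValueError on malformed ids
    return f"{root}/{project_id}"

print(project_home("123e4567-e89b-12d3-a456-426614174000"))
# -> /projects/123e4567-e89b-12d3-a456-426614174000
```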
An important detail is that the UID/GID is 2001. This is done for security reasons and to be distinct from commonly used default IDs. For example, the AWS EKS setup does not work out of the box and must be configured to use 2001 as the UID/GID.
It’s highly recommended to run all project pods on their dedicated VMs (via node taints), because users – even by accident – could be using a lot of RAM and/or CPU. So, even if containers do their jobs, there might be issues and this cleanly separates the projects from the system services.
To enable this, look into the values.yaml file. Below are the labels and taints for service and project nodes.
Related to the above, there is also a “prepull” service. It solves the issue of users facing a project in a “Pending” state for too long. This happens because the images of the project pods are very large and take some time to load on a new node.
The basic idea is to initially configure new project nodes, via taints, such that they cannot run projects. Prepull loads the large project image first, before any project pod can be scheduled on the new node. Once that succeeds, it does a quick check and changes the taint of the node it runs on, such that project pods can be scheduled there. The prepull pod is then itself removed, because of the taint configuration. Projects will now start quickly, because the large project image is already loaded.
When there is an update to the project image (a new tag in
manage.project.tag), the labels and taints of project nodes are
reset by a post-update deployment hook.
The prepull service will then pull the new project image and, once done, allow projects to schedule.
Projects that were already running before the update are not affected.
You can get a sense of which image they run by checking their
project_tag label (or even delete old projects via
k delete pod -l run=project,project_tag=<old-tag> to get
rid of these pods, which then allows the kubelet to remove the old Docker
images and avoid running into disk-pressure issues).
The prepull service needs cluster-wide permissions,
because it must be able to modify the labels and taints of the nodes.
Feel free to read through
cocalc/charts/manage/prepull.py in case you want to know what it
does – it’s pretty simple, but since it has cluster-wide permissions,
you might want to audit it.
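To give a feel for what the taint flip involves, here is a sketch of building the patch body that drops a node's “not ready for projects” taint. The taint key and the helper name are hypothetical; the actual logic lives in cocalc/charts/manage/prepull.py:

```python
def node_ready_patch(taints: list, taint_key: str = "cocalc-projects-init") -> dict:
    """Build a patch body that removes the init taint from a node's spec,
    allowing project pods to be scheduled on it (sketch; hypothetical key)."""
    remaining = [t for t in taints if t.get("key") != taint_key]
    return {"spec": {"taints": remaining}}

before = [{"key": "cocalc-projects-init", "value": "true", "effect": "NoSchedule"}]
patch = node_ready_patch(before)
print(patch)  # -> {'spec': {'taints': []}}
```

A body like this would be sent to the Kubernetes API (e.g. via a node patch call), which is exactly why the service needs cluster-wide permission to modify nodes.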
To make use of this prepull service, you need two node pools. If you go ahead with the default names, configure the pools like this:
set the Kubernetes label to
set the Kubernetes label to
and the initial Kubernetes taint (this is
key=value → effect) to:
This service just holds the static files which make up the front-end
application – it’s served at
/app, and clients connect to the backend
via a websocket served by
hub-websocket. Overall, this is probably
the service that will be updated most often.
If enabled (via
global.ssh_gateway.enabled), this service runs an
SSH server as a gateway to access projects. Users can add their public
SSH key to a project or their account. With that, they are allowed
to ssh into a running project.
Use cases are:
automating recurring tasks, like periodic checks, etc.
up- or downloading files via
accessing scripts/software hosted on CoCalc from within another headless system, e.g. a cluster
Network setup: the connection to the outside world works by exposing the service’s endpoint. This is a “global” setup of your cluster, hence it is outside the scope of CoCalc’s HELM chart.
For the NGINX ingress controller, the TCP service of
ssh-gateway must be added to the tcp-services configmap.
If you’re using its Helm chart, see
/ingress-nginx/values.yaml for a working example.
If you had already set up a
LoadBalancer and then update, it might
not pick up the new configuration to include port
22. The easiest
way to fix this is to delete the
LoadBalancer and let it be recreated.
If enabled (via
global.datastore.enabled), in the Project Settings a
configuration panel “Cloud storage & remote filesystems” appears. This
allows users to mount remote filesystems into the particular project.
This supports SSHFS, AWS S3 and Google Cloud Storage.
Under the hood, “Datastore” is a sidecar of the project, which mounts
these filesystems according to their configuration (where
name is the name of the datastore). This mountpoint is
propagated to the project container from the host. If the file
~/data is not taken, the project will automatically create a symlink
to that global directory. Therefore, collaborators of the project can
use and see this filesystem, but they do not know the secret, don’t see
the raw configuration files, and also cannot interact with the actual
process doing the FUSE mount. The “secret” is hidden in the user
interface, it’s not sent to the web client.
The “read-only” mode enables the
ro mount option for the FUSE mount.
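Schematically, the sidecar's job boils down to picking a FUSE mount command per datastore type. The tool names below are the common FUSE clients (sshfs, s3fs, gcsfuse), but the config shape and the helper are invented for illustration, not CoCalc's actual implementation:

```python
def mount_command(ds: dict) -> list:
    """Pick a FUSE mount command for a datastore config (illustrative sketch)."""
    target = f"/data/{ds['name']}"                 # mountpoint named after the datastore
    if ds["type"] == "sshfs":
        cmd = ["sshfs", f"{ds['user']}@{ds['host']}:{ds['path']}", target]
    elif ds["type"] == "s3":
        cmd = ["s3fs", ds["bucket"], target]
    elif ds["type"] == "gcs":
        cmd = ["gcsfuse", ds["bucket"], target]
    else:
        raise ValueError(f"unsupported datastore type: {ds['type']}")
    if ds.get("readonly"):
        cmd += ["-o", "ro"]                        # the "read-only" mode adds the ro option
    return cmd
```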
To make the filesystem perform well, it does a bit of caching, but only with a small timeout. This means that if you give it a few seconds for reads and writes to sync, it’s possible to do a bit of collaboration via the same mounted filesystem – not really recommended, but possible. Also note that CoCalc’s projects do filesystem-level polling of opened directories, which means that remote changes to these files will eventually show up as well and update in an opened editor. Those files are also cached on CoCalc’s side.
Requests to support other remote filesystems are welcome, and if there is a robust tool and a way to easily configure them, we certainly consider adding it.
Pro-tip: if a project is set to “Always Running”, you can use the SSHFS configuration in combination with the SSH Gateway to mount a directory from another project. This is a bit of a hack, but it works.