Backup and Restore

Backup

Warning

Out of the box, CoCalc Cloud does not provide you with any backups. You have to set them up yourself.

Note

There is a feature called TimeTravel, which records all changes to editable files and allows users to restore them. However, this is not a “real” backup, because all such changes are stored in the database and are lost if the database is lost.

There are three aspects to take care of:

  1. Back up your my-values.yaml file. If you keep it in a Git repository, push it to a private repository.

  2. Database: there are many ways to back up a PostgreSQL database.

    • PostgreSQL Guide: Backup and Restore.

    • A solid way is pgBackRest.

    • If you deploy the database with a Helm chart, the chart might already support backups.

    • Managed SQL services in public clouds also provide backups, e.g. GCP SQL Backup.

  3. Storage: this is a filesystem shared across all projects and some services. You have to back up the underlying disk holding all the data. A modern and robust way is to take periodic snapshots. Managed clouds provide snapshots, e.g. GCP Disk Backup.
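
As an illustration of the database item, here is a minimal sketch of a logical dump with pg_dump, driven from Python. It is only one of the many possible approaches and not necessarily the one best suited to your setup. The host, user, database name and backup directory are hypothetical placeholders, and the sketch assumes that pg_dump is installed and that the password is supplied via PGPASSWORD or ~/.pgpass.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_pg_dump_command(host, user, dbname, backup_dir, now=None):
    """Return (command, target) for a custom-format pg_dump of one database.

    All arguments are placeholders for your own setup; nothing here is
    specific to CoCalc Cloud.
    """
    # Normalize to UTC so the "Z" suffix in the file name is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_dir) / f"{dbname}-{stamp}.dump"
    command = [
        "pg_dump",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--format=custom",  # archive format that pg_restore can read
        f"--file={target}",
    ]
    return command, target


def run_backup(host, user, dbname, backup_dir):
    """Create backup_dir if needed, run pg_dump, and return the dump path."""
    command, target = build_pg_dump_command(host, user, dbname, backup_dir)
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    # Assumption: credentials come from PGPASSWORD or ~/.pgpass.
    subprocess.run(command, check=True)
    return target
```

Calling `run_backup("db.example.internal", "cocalc", "cocalc", "/backups")` (all four values are made up) would write a timestamped dump file into /backups. A dump in this format is restored with pg_restore. For continuous archiving and point-in-time recovery, a dedicated tool such as pgBackRest is likely the better fit.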

Restore

If you have backups of all three aspects above, you should be able to restore everything.

  1. my-values.yaml: get your previous my-values.yaml file.

  2. Database: restore it from a backup. Check that you can sign in using the configured host and user, and that the stored Secret in Kubernetes still works as well. Use the db-shell.sh script – or something similar – to be sure you can access the restored database from within the namespace in your cluster.

  3. Files: this depends on your type of storage, how it was set up, etc. Basically, there are two PersistentVolumeClaims, which will be used by the deployments – see Projects Data. If they were created automatically and no longer match the old ones, you have to copy the files from the old location to the new location. This should be really easy, though:

    • The PVC for projects has two subdirectories: projects and shared. The projects subdirectory has a directory for each project named after its ID. That’s essentially the data your users are most interested in. The shared directory contains published files, essentially a copy of files in specific projects.

    • The other PVC is for the /ext shared directory. It contains global software and data.
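
The copy step for the projects PVC described above could look roughly like the following sketch. It assumes that the old and the new volume are both mounted on a machine you can reach (the mount points in the usage note are hypothetical) and that Python 3.8 or newer is available. Note that shutil.copytree does not preserve file ownership, so for real data a tool such as rsync, run with sufficient privileges, is probably the safer choice.

```python
import shutil
from pathlib import Path

# The two subdirectories of the projects PVC, as described above.
SUBDIRS = ("projects", "shared")


def copy_projects_volume(old_root, new_root):
    """Copy the 'projects' and 'shared' trees from old_root into new_root.

    Returns the sorted names of the project directories found under
    new_root/projects after the copy.
    """
    old_root, new_root = Path(old_root), Path(new_root)

    # Check the expected layout first, so nothing is copied from a wrong path.
    for name in SUBDIRS:
        if not (old_root / name).is_dir():
            raise FileNotFoundError(f"expected subdirectory missing: {old_root / name}")

    for name in SUBDIRS:
        # dirs_exist_ok merges into an already existing target directory;
        # symlinks=True copies symbolic links as links instead of following them.
        shutil.copytree(old_root / name, new_root / name,
                        symlinks=True, dirs_exist_ok=True)

    return sorted(p.name for p in (new_root / "projects").iterdir() if p.is_dir())
```

For example, `copy_projects_volume("/mnt/old-projects", "/mnt/new-projects")` would copy both subdirectories and return the list of project IDs now present on the new volume, which you can compare against the projects you expect to see.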