# Start workspaces
A workspace is a workload or service that runs on a cluster — for example, a Jupyter notebook or an LLM inference server. Unlike reserving entire nodes, a workspace reserves specific resources (including GPUs) from the available cluster capacity. If a node has 2 GPUs, two single-GPU workspaces can run on the same node.
exalsius provides workspace templates — pre-configured blueprints with customizable settings for each workspace type.
## Prerequisites
- A cluster in `READY` status (see deploy clusters)
- Available resources (GPUs, CPU, memory) on the cluster
## Deploy a workspace

```shell
exls workspaces deploy <workspace-type> [CLUSTER-ID-or-NAME] [OPTIONS]
```
Available workspace types:
| Type | Description |
|---|---|
| `jupyter` | Jupyter notebook with GPU access |
| `marimo` | Marimo reactive notebook with GPU access |
| `dev-pod` | SSH-accessible development environment for VS Code, Cursor, or PyCharm |
| `llm-inference` | LLM inference server powered by vLLM with tensor parallelism |
All deploy commands share this behavior:
- Cluster selection — pass a cluster ID or name as an argument. If omitted, exalsius auto-selects your only cluster or prompts you to choose from multiple clusters.
- Configuration editing — before deploying, the CLI asks if you want to review and edit the workspace configuration in your default editor. The defaults work for most use cases.
- Confirmation — the CLI displays a deployment summary and asks for confirmation before proceeding.
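The cluster argument is optional in all of these commands. As a sketch (the cluster name `gpu-cluster-1` is illustrative):

```shell
# Omit the cluster: exalsius auto-selects your only cluster,
# or prompts you to choose when there are several.
exls workspaces deploy jupyter

# Or target a cluster explicitly by name (or ID):
exls workspaces deploy jupyter gpu-cluster-1
```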
### Jupyter

```shell
exls workspaces deploy jupyter [CLUSTER-ID-or-NAME] [OPTIONS]
```
| Flag | Short | Default | Description |
|---|---|---|---|
| `--password` | `-p` | prompted | Password for the Jupyter notebook (min. 6 characters) |
| `--num-gpus` | `-g` | `1` | Number of GPUs to allocate |
| `--name` | `-n` | auto-generated | Workspace name |
| `--wait-for-ready` | `-w` | `false` | Wait until the workspace is ready before returning |
The CLI prompts for the password if `--password` is not provided. Access the notebook via the URL shown in the `Access` field.
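Putting the flags together, a Jupyter deployment might look like this (the cluster name, password, and workspace name are illustrative):

```shell
# Two GPUs, an explicit workspace name, and block until ready:
exls workspaces deploy jupyter gpu-cluster-1 \
  --password "s3cret-pass" \
  --num-gpus 2 \
  --name jupyter-demo \
  --wait-for-ready
```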
### Marimo

```shell
exls workspaces deploy marimo [CLUSTER-ID-or-NAME] [OPTIONS]
```
| Flag | Short | Default | Description |
|---|---|---|---|
| `--password` | `-p` | prompted | Password for the Marimo notebook (min. 6 characters) |
| `--num-gpus` | `-g` | `1` | Number of GPUs to allocate |
| `--name` | `-n` | auto-generated | Workspace name |
| `--wait-for-ready` | `-w` | `false` | Wait until the workspace is ready before returning |
Same as Jupyter — the CLI prompts for a password if not provided. Access via the URL in the `Access` field.
### Dev pod

```shell
exls workspaces deploy dev-pod [CLUSTER-ID-or-NAME] [OPTIONS]
```
| Flag | Short | Default | Description |
|---|---|---|---|
| `--ssh-password` | `-p` | — | Password for SSH access (min. 6 characters) |
| `--ssh-public-key` | `-k` | — | Path to a public key file for SSH access |
| `--num-gpus` | `-g` | `1` | Number of GPUs to allocate |
| `--name` | `-n` | auto-generated | Workspace name |
| `--wait-for-ready` | `-w` | `false` | Wait until the workspace is ready before returning |
You must provide at least one of `--ssh-password` or `--ssh-public-key`. Both can be used together. Connect via SSH using the address in the `Access` field.
Compatible with VS Code, Cursor, and PyCharm remote development.
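As a sketch, deploying a dev pod with key-based SSH and then connecting might look like this (the cluster and workspace names are illustrative, and the user, host, and port come from the `Access` field of your deployment):

```shell
# Deploy with a public key instead of a password:
exls workspaces deploy dev-pod gpu-cluster-1 \
  --ssh-public-key ~/.ssh/id_ed25519.pub \
  --name dev-pod-demo

# Connect with the matching private key, substituting the
# user, host, and port shown in the Access field:
ssh -i ~/.ssh/id_ed25519 -p <PORT> <USER>@<HOST>
```

The same host/port/key details can be placed in an `~/.ssh/config` entry so that VS Code, Cursor, or PyCharm can pick up the connection by name.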
#### Pod storage
Dev pods use two kinds of storage:
- Ephemeral storage for the running pod. By default, this is 50 GB. If the pod exceeds this limit, Kubernetes may evict or restart it, and data stored there can be lost.
- Persistent storage mounted at `/workspaces`. Data stored there survives pod eviction.
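Inside the pod, keeping anything you care about under the persistent mount is the safest habit — for example (the project directory name is illustrative):

```shell
# Work in the persistent volume so data survives pod eviction:
mkdir -p /workspaces/my-project
cd /workspaces/my-project

# Check remaining space on the persistent mount:
df -h /workspaces
```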
### LLM inference

```shell
exls workspaces deploy llm-inference [CLUSTER-ID-or-NAME] [OPTIONS]
```
| Flag | Short | Default | Env var | Description |
|---|---|---|---|---|
| `--huggingface-token` | `-t` | prompted | `HUGGINGFACE_TOKEN`, `HF_TOKEN` | HuggingFace API token |
| `--model-name` | `-m` | prompted | — | HuggingFace model in `<repo>/<model>` format (e.g., `Qwen/Qwen3-1.7B`) |
| `--num-gpus` | `-g` | `1` | — | Number of GPUs (sets vLLM tensor parallelism) |
| `--name` | `-n` | auto-generated | — | Workspace name |
| `--wait-for-ready` | `-w` | `false` | — | Wait until the workspace is ready before returning |
The HuggingFace token can also be provided as `--hf-token`. Both `--huggingface-token` and `--model-name` are prompted if not provided.
> **LLM inference environment required**
>
> The cluster must have the LLM inference environment enabled. Enable it during cluster deployment with `--prepare-llm-inference-environment` or via the interactive deployment flow.
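With the token exported in the environment, a deployment might look like this (the cluster name and token value are illustrative; the model is the example from the table above):

```shell
# Token picked up from the environment instead of --huggingface-token:
export HF_TOKEN=hf_xxxxxxxxxxxxxxxx

exls workspaces deploy llm-inference gpu-cluster-1 \
  --model-name Qwen/Qwen3-1.7B \
  --num-gpus 1 \
  --wait-for-ready
```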
## Access a workspace
How you connect depends on the workspace type:
- Jupyter, Marimo — open the URL from the `Access` field in your browser and enter the password you set during deployment.
- Dev pod — connect via SSH using the address from the `Access` field.
- LLM inference — send requests to the endpoint shown in the `Access` field.
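Since the inference server is powered by vLLM, the endpoint typically speaks an OpenAI-compatible API; a hedged sketch of a chat completion request (the host, port, and exact path come from the `Access` field and may differ in your deployment):

```shell
# Query the deployed model via the OpenAI-compatible API:
curl http://<HOST>:<PORT>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-1.7B",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```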
## Manage workspaces

List all workspaces, optionally filtered by cluster:

```shell
exls workspaces list [CLUSTER-ID-or-NAME]
```

Get details and access information for a specific workspace:

```shell
exls workspaces get <WORKSPACE-ID-or-NAME>
```

Delete a workspace:

```shell
exls workspaces delete <WORKSPACE-ID-or-NAME>
```
> **Warning**
>
> - Deleting a workspace permanently removes it and terminates all running processes.
> - Any unsaved work or data will be lost.
> - Save or export important data before deleting.
> **Inbound port access**
>
> Your nodes must allow inbound TCP connections on the port assigned to your workspace. See firewall configuration below.
## Firewall configuration
Configure firewall rules based on your deployment scenario. If your nodes are cloud VMs, use the provider's firewall settings to open the required ports.
exalsius uses WireGuard for encrypted node-to-node communication:
| Port | Protocol | Purpose |
|---|---|---|
| 51871 | UDP | WireGuard VPN for peer-to-peer connections |
> **Required port**
>
> UDP port 51871 must be open on all nodes for cluster communication.
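On Linux nodes that happen to use `ufw`, for example, the WireGuard port and a workspace's assigned TCP port could be opened like this (`ufw` is an assumption about your firewall, and the TCP port number is illustrative — use the one shown for your workspace):

```shell
# WireGuard for node-to-node communication (required on all nodes):
sudo ufw allow 51871/udp

# Inbound TCP for the port assigned to a workspace (example port):
sudo ufw allow 30080/tcp
```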
## Next steps
- Serving a large language model — step-by-step LLM inference tutorial
- Cluster observability — monitor metrics, logs, and traces