Deploy clusters

A cluster in exalsius is a managed Kubernetes cluster composed of nodes from your node pool. Each cluster has a control plane managed by the exalsius backend and a set of worker nodes that run AI workloads.

Deploy a cluster

Run exls clusters deploy without arguments to start the guided flow:

exls clusters deploy

The CLI walks you through:

  1. Cluster name — enter a name or accept the generated default.
  2. Worker nodes — select one or more nodes from your pool.
  3. LLM inference environment — optionally prepare the cluster for LLM inference deployments based on llm-d.

After reviewing a summary, confirm to start the deployment.

Pass all options as flags for scripted or CI/CD workflows:

exls clusters deploy \
  --name "my-exalsius-cluster" \
  --worker-nodes "a959a49e-..." "aec275e9-..." \
  --prepare-llm-inference-environment \
  --follow

The --follow flag streams deployment logs in real time.
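In a CI job, the flags above can be wrapped in a small shell function. This is a minimal sketch: the function name deploy_cluster is hypothetical, and it uses only the flags shown above. Add --prepare-llm-inference-environment if you need the LLM inference setup.

```shell
# Hypothetical helper: deploy a cluster without the guided flow.
# Usage: deploy_cluster <cluster-name> <node-id> [<node-id> ...]
deploy_cluster() {
  if [ "$#" -lt 2 ]; then
    echo "usage: deploy_cluster <cluster-name> <node-id> [<node-id> ...]" >&2
    return 2
  fi
  name="$1"
  shift
  # "$@" now holds the worker node IDs, passed as separate arguments
  # after --worker-nodes, as in the command above.
  exls clusters deploy \
    --name "$name" \
    --worker-nodes "$@" \
    --follow
}
```

Calling it with no node IDs returns exit code 2 without contacting exalsius, so a misconfigured pipeline fails early.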

Deployment lifecycle

A cluster moves through the following states. It enters DELETING only after you delete it:

  1. PENDING — the request has been received and is queued.
  2. DEPLOYING — exalsius is provisioning Kubernetes, configuring GPU drivers, and installing dependencies.
  3. READY — the cluster is ready to use.
  4. DELETING — cluster resources are being removed.

Deployment typically takes 10–15 minutes, depending on network speed.

Check cluster status:

exls clusters list
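In a script, you may want to block until the cluster reaches READY. The sketch below polls exls clusters list. It assumes that the command prints each cluster's name and state on the same line, which is not confirmed here, so adjust the match to the real output. The function name wait_for_ready and its timing parameters are hypothetical.

```shell
# Hypothetical helper: poll until the named cluster reports READY.
# Usage: wait_for_ready <cluster-name> [max-attempts] [seconds-between-attempts]
# ASSUMPTION: `exls clusters list` prints the cluster name and its state
# (for example READY) on one line. The name is matched as a substring.
wait_for_ready() {
  cluster="$1"
  attempts="${2:-60}"
  interval="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if exls clusters list | grep -F -- "$cluster" | grep -w "READY" >/dev/null; then
      echo "cluster $cluster is READY"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for cluster $cluster" >&2
  return 1
}
```

With the defaults, the function gives up after 60 attempts spaced 30 seconds apart, which leaves headroom over the typical deployment time.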

During deployment, exalsius automatically:

  • Installs a comprehensive observability stack that collects metrics, logs, and traces. See the observability tutorial for details.
  • Sets up a peer-to-peer WireGuard VPN between all cluster nodes for secure, encrypted communication across networks.

Required firewall port

All nodes must have UDP port 51871 open for WireGuard VPN connections. Cluster deployment will fail if nodes cannot communicate on this port. See firewall configuration for details.
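How you open the port depends on each node's firewall tooling. As a sketch, the hypothetical helper below assumes the node uses either ufw or firewalld and that you have sudo rights. Cloud security groups and other firewalls need their own rules.

```shell
# Hypothetical helper: allow UDP traffic on the WireGuard port.
# ASSUMPTION: the node uses ufw or firewalld. Run it on every node.
open_wireguard_port() {
  port="${1:-51871}"
  if command -v ufw >/dev/null 2>&1; then
    sudo ufw allow "${port}/udp"
  elif command -v firewall-cmd >/dev/null 2>&1; then
    # Persist the rule, then reload so it takes effect immediately.
    sudo firewall-cmd --permanent --add-port="${port}/udp" &&
      sudo firewall-cmd --reload
  else
    echo "no supported firewall tool found; open UDP ${port} manually" >&2
    return 1
  fi
}
```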

Manage cluster nodes

Add nodes to a cluster

exls clusters add-nodes <CLUSTER-ID-or-NAME> --nodes <NODE-ID-or-NAME> [<NODE-ID-or-NAME> ...]

Use --interactive to select nodes from a list instead of specifying IDs or names.

Remove nodes from a cluster

exls clusters remove-nodes <CLUSTER-ID-or-NAME> --nodes <NODE-ID-or-NAME> [<NODE-ID-or-NAME> ...]

Use --interactive to select worker nodes from a list. If --nodes is omitted, interactive mode starts automatically.

View cluster resources

Check the available GPU, CPU, and memory resources on a cluster:

exls clusters show-available-resources <CLUSTER-ID-or-NAME>

Stream cluster logs

Stream real-time Kubernetes events for a cluster:

exls clusters logs <CLUSTER-ID-or-NAME>

Use --json to output raw NDJSON (one JSON object per line) instead of the formatted display.
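Because NDJSON puts one JSON object on each line, the stream is easy to archive and post-process line by line. A minimal sketch follows. The helper name capture_cluster_logs and the output file are hypothetical, and no particular event fields are assumed.

```shell
# Hypothetical helper: show the raw NDJSON event stream and append it to a file.
# Usage: capture_cluster_logs <cluster-id-or-name> <output-file>
capture_cluster_logs() {
  cluster="$1"
  outfile="$2"
  # tee -a prints each event line and appends it to the file.
  exls clusters logs --json "$cluster" | tee -a "$outfile"
}
```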

Access the monitoring dashboards

Retrieve the URL of the built-in monitoring dashboards:

exls management get-dashboard-url

Add --open to open it directly in your browser.

Export kubeconfig

Clusters are standard Kubernetes clusters. Export the kubeconfig to use kubectl or other Kubernetes tooling:

exls clusters import-kubeconfig <CLUSTER-ID-or-NAME>

By default, the kubeconfig is written to ~/.kube/config. Specify a custom path with --kubeconfig-path:

exls clusters import-kubeconfig --kubeconfig-path ~/.kube/exalsius-config <CLUSTER-ID-or-NAME>
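Once written, the file works with any standard Kubernetes client. As a sketch, the hypothetical helper below writes the kubeconfig to a dedicated path and then lists the cluster's nodes, assuming kubectl is installed locally.

```shell
# Hypothetical helper: write a cluster's kubeconfig and verify access.
# Usage: connect_cluster <cluster-id-or-name> [kubeconfig-path]
connect_cluster() {
  cluster="$1"
  kubeconfig="${2:-$HOME/.kube/exalsius-config}"
  exls clusters import-kubeconfig --kubeconfig-path "$kubeconfig" "$cluster" || return 1
  # --kubeconfig makes kubectl read this file instead of ~/.kube/config.
  kubectl --kubeconfig "$kubeconfig" get nodes
}
```

Keeping each cluster in its own file avoids touching the default ~/.kube/config.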

Delete a cluster

exls clusters delete <CLUSTER-ID-or-NAME>

The CLI prompts for confirmation. Use --yes to skip the prompt, for example in scripts.

Deletion may take some time. The cluster remains in the DELETING state until all components are removed.

Warning

  • Deleting a cluster permanently removes it.
  • All workloads, pods, and workspaces running on it will be terminated.
  • Back up any data you need before deleting.
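What to back up depends on your workloads. As one possible sketch, the hypothetical helper below saves the cluster's resource manifests with kubectl before deleting. It assumes you already have a kubeconfig file for the cluster, and it does not copy data stored in volumes.

```shell
# Hypothetical helper: save resource manifests, then delete the cluster.
# Usage: backup_and_delete <cluster-id-or-name> <kubeconfig-path> <backup-file>
# NOTE: this saves Kubernetes manifests only, not data stored in volumes.
backup_and_delete() {
  cluster="$1"
  kubeconfig="$2"
  backup="$3"
  # Abort before deleting if the manifests cannot be saved.
  kubectl --kubeconfig "$kubeconfig" get all --all-namespaces -o yaml > "$backup" || return 1
  exls clusters delete --yes "$cluster"
}
```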

Next steps

Once your cluster is ready, deploy workspaces — Jupyter, Marimo, dev pods, or LLM inference.