
Considerations for deployment on shared clusters

When deploying Adaptive Engine in a shared cluster where other workloads are running, there are a few best practices you can implement to enforce resource isolation:

Deploy Adaptive in a separate namespace

When installing the Adaptive Helm chart, you can do so in a separate namespace by passing the --namespace option. Example:
helm install adaptive \
  adaptive/adaptive \
  --values ./values.yaml \
  --namespace adaptive-engine
You can also pass the --create-namespace flag if the namespace does not exist yet.
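As a sketch, both flags can be combined in a single command. The function name install_adaptive is hypothetical and only wraps the helm invocation from the example above; the release name, chart reference, and values path are taken from that example and may differ in your setup.

```shell
# Hypothetical wrapper: installs the chart into the given namespace,
# creating the namespace first if it does not exist yet.
install_adaptive() {
  helm install adaptive \
    adaptive/adaptive \
    --values ./values.yaml \
    --namespace "$1" \
    --create-namespace
}

# Usage:
# install_adaptive adaptive-engine
```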

Use node selectors to schedule Adaptive on specific GPU nodes

You can use the harmony.nodeSelector value in values.yaml to schedule Adaptive Harmony only on a specific node group. For example, if you are deploying Adaptive on an Amazon EKS cluster, you might add:
harmony:
  nodeSelector: 
    eks.amazonaws.com/nodegroup: p5-h100
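Before installing, it can help to confirm that the label in the selector actually matches some nodes. The sketch below assumes the EKS node group label from the example above; nodes_in_group is a hypothetical helper name.

```shell
# Hypothetical helper: lists the nodes that carry the node group label
# used in harmony.nodeSelector. An empty result means the selector matches
# no node, in which case the Harmony pods would typically stay Pending.
nodes_in_group() {
  kubectl get nodes -l "eks.amazonaws.com/nodegroup=$1" -o name
}

# Usage:
# nodes_in_group p5-h100
```

After installation, kubectl get pods -n adaptive-engine -o wide shows which node each pod was scheduled on (assuming the adaptive-engine namespace from the earlier example).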

Dedicated GPU node tenancy

Although the Adaptive control plane can run on any node with available CPU and memory, it is recommended that Harmony request and take ownership of all the GPUs available on each GPU-enabled node it runs on. The node selector described in the previous section keeps Adaptive Harmony on a designated GPU node group, but it does not prevent other workloads from being scheduled on those nodes. To dedicate a set of GPU nodes to Adaptive Harmony, you can use a combination of:
  1. Adding a taint to the GPU nodes
  2. Adding a corresponding toleration to Harmony in the values.yaml of the Adaptive Helm Chart
To add a taint to a node, first run kubectl get nodes -o name to list the existing node names, and then taint each GPU node as shown below (replacing node_name):
kubectl taint nodes node_name dedicated=adaptive-engine:NoSchedule
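When a node group contains several nodes, the same taint can be applied in a loop. This is a minimal sketch, assuming the nodes are identified by the EKS node group label used earlier; taint_group is a hypothetical helper name. Note that kubectl get nodes -o name prints names in the form node/<name>, so the prefix is stripped before tainting.

```shell
# Hypothetical helper: taints every node in the given EKS node group.
taint_group() {
  kubectl get nodes -l "eks.amazonaws.com/nodegroup=$1" -o name |
    while read -r node; do
      # "-o name" yields "node/<name>"; drop the "node/" prefix.
      kubectl taint nodes "${node#node/}" dedicated=adaptive-engine:NoSchedule
    done
}

# Usage:
# taint_group p5-h100
```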
You can then add a matching toleration to Harmony in the values.yaml file (harmony.tolerations), which allows it to be scheduled on the tainted nodes:
harmony:
  tolerations:
  - key: dedicated
    operator: Equal
    value: adaptive-engine
    effect: NoSchedule
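A toleration only permits scheduling on tainted nodes; it does not steer pods toward them, so it is usually paired with the node selector. The sketch below writes both settings to a separate override file. The file name harmony-dedicated.yaml is hypothetical, and the label value assumes the EKS example above.

```shell
# Write a hypothetical override file combining the node selector and toleration.
cat > harmony-dedicated.yaml <<'EOF'
harmony:
  nodeSelector:
    eks.amazonaws.com/nodegroup: p5-h100
  tolerations:
    - key: dedicated
      operator: Equal
      value: adaptive-engine
      effect: NoSchedule
EOF

# The file could then be layered over the base values at install time
# (later --values files take precedence in Helm):
# helm install adaptive adaptive/adaptive \
#   --values ./values.yaml \
#   --values ./harmony-dedicated.yaml \
#   --namespace adaptive-engine
```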
You can find more about taints and tolerations in the official Kubernetes documentation.

Advanced configuration

Database SSL/TLS configuration

Adaptive Engine supports secure TLS connections between the database and control plane.

Basic setting

If your PostgreSQL database supports TLS, you can enforce encrypted connections by adding the parameter sslmode=require to your PostgreSQL connection string dbUrl in the Helm chart’s values.yaml file:
  dbUrl: "postgres://<user>:<password>@<host>/<db>?sslmode=require"
Although sslmode=require encrypts the database connection, it does not verify the server’s identity.
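One way to confirm that a connection is actually encrypted is to query PostgreSQL's pg_stat_ssl view through the same connection string. This is a minimal sketch, assuming the psql client is installed and can reach the database; check_db_tls is a hypothetical helper name.

```shell
# Hypothetical helper: prints "t" if the current session uses TLS, "f" otherwise.
check_db_tls() {
  psql -tA -c "SELECT ssl FROM pg_stat_ssl WHERE pid = pg_backend_pid();" "$1"
}

# Usage (fill in the placeholders first):
# check_db_tls "postgres://<user>:<password>@<host>/<db>?sslmode=require"
```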

Server certificate verification

For the application to verify the server certificate, you must set sslmode to verify-ca or verify-full.
  • verify-ca verifies that the server certificate is signed by a trusted certificate authority
  • verify-full performs the same check and also verifies that the server host name matches the name stored in the server certificate
verify-full is the recommended option for maximum security. You will need to provide the application with a root certificate to make server certificate verification possible. You can do so by following these steps:
  1. Download the database server's CA certificate (if you’re using AWS RDS, for example, refer to this page), for instance rds-ca-rsa2048-g1.pem
  2. Upload the PEM file to your Kubernetes cluster. As the certificate is non-critical, public information, it can be uploaded as a ConfigMap:
kubectl create configmap -n <namespace> db-ca --from-file=rds-ca-rsa2048-g1.pem
  3. Mount the file as a volume to the control plane deployment by editing values.yaml:
...

volumes:
  - name: db-ca
    configMap:
      name: db-ca

volumeMounts:
  - name: db-ca
    mountPath: /mnt/db-ca/
    readOnly: true
  4. Use the sslrootcert parameter to refer to the certificate in the PostgreSQL connection URL, specifying mountPath + filename:
  dbUrl: "postgres://<user>:<password>@<host>/<db>?sslmode=verify-full&sslrootcert=/mnt/db-ca/rds-ca-rsa2048-g1.pem"
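The sslrootcert value is the volume's mountPath joined with the certificate file name (the ConfigMap key created by --from-file defaults to the file's base name). The sketch below assembles the URL from those two parts; the variable names are illustrative, and the angle-bracket placeholders are left for you to fill in.

```shell
# Assumed values; these mirror the mountPath and file name used in the steps above.
MOUNT_PATH="/mnt/db-ca/"
CA_FILE="rds-ca-rsa2048-g1.pem"

# Strip any trailing slash so the joined path has exactly one separator.
SSL_ROOT_CERT="${MOUNT_PATH%/}/${CA_FILE}"

# Connection string with certificate verification enabled.
DB_URL="postgres://<user>:<password>@<host>/<db>?sslmode=verify-full&sslrootcert=${SSL_ROOT_CERT}"
printf '%s\n' "$DB_URL"
```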
Refer to the official PostgreSQL documentation on SSL support for more information.