Using Kubernetes Resources for Runs
When it comes to executing AI tasks, whether batch jobs or model serving, it is important to be able to allocate an appropriate set of resources, such as memory, GPUs, and node types.
For this purpose the platform relies on Kubernetes functionalities and resource definitions. More specifically, the run configuration may have specific requirements for
- node selection
- volumes (Persistent Volume Claims and Config Maps)
- hardware resources: CPU and memory requests and limits, and the number of GPUs
- Kubernetes affinity definitions and/or toleration properties
- additional secrets and environment variables to be associated with the execution
How to define resource requirements
In the platform, Kubernetes resource requirements may be defined in two ways:
- by users at run time, by requesting resources to be allocated for a given run
- by administrators at deployment time, by configuring defaults and limits along with pre-configured templates and profiles
Request resources at runtime
The platform lets users request additional Kubernetes resources for a given function's run, either via the Core UI or via the SDK. Please note that it is possible to specify only some properties and leave the rest unset, without constraints. All defaults are managed by the platform in accordance with the underlying Kubernetes deployment.
To define requirements for a single run, developers need to include the resource definition in the run specification, in accordance with the schema.
For example, to request a certain amount of compute resources, the spec must contain a detailed definition such as the following:
resources:
  cpu:
    requests: 8
  mem:
    requests: 32Gi
  gpu:
    limits: "1"
In order to provide such definitions, users can leverage the SDK or the Core UI to programmatically or interactively define their request. Please see the Kubernetes Resources section of the documentation for more information.
Resource templates and profiles
It is possible to rely on a set of preconfigured hardware profiles defined during the platform deployment. Profiles shield platform users from the underlying complexity. Each profile corresponds to a specific resource configuration that defines a combination of requirements. For example, a profile may define a specific type of GPU, together with the memory and CPUs to be used. In this case it is sufficient to specify the corresponding profile name in the run execution configuration to allocate the corresponding resources.
The profile mechanism is described in the administration section of the documentation and is managed by the platform admins. Please see the Resource templates section of the documentation for more information.
Please note that the requirements defined in the template have priority over those defined by the user and are not overwritten.
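As a rough sketch of what selecting a profile might look like in a run spec: both the field name `profile` and the profile name `gpu-large` are assumptions made for illustration. The actual field name is defined by the platform schema, and the available profile names are defined by your administrators.

```yaml
# Hypothetical field name and profile name, shown for illustration only.
# Check the Resource templates documentation for the real ones.
profile: "gpu-large"
```

Because template requirements take priority, any conflicting values the user sets alongside the profile would not override it.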
Available resources
This section lists all the resources available to users for runs.
Volumes
Users can ask for a persistent volume claim (PVC) to be created and mounted on the container being launched by the task.
You need to declare the volume type as persistent_volume_claim, a name for the PVC (e.g., my-pvc), the mount path on the container, and a spec with the size of the PV to be reserved.
The platform will create the volume and bind it to the pod lifecycle.
volumes:
  - volume_type: "persistent_volume_claim"
    name: "my-pvc"
    mount_path: "/data"
    spec:
      size: "10Gi"
Note: the platform can be configured to block the usage of pre-existing volumes for security reasons. Volumes created by the platform for specific runs are provisioned as ReadWriteOnce and are used exclusively by the platform.
Hardware Resources
Users can request a specific amount of hardware resources (cpu, memory, gpu) for a given run by declaring them via the resources spec parameter.
Supported resources are:
- CPU
- RAM memory
- GPU
CPU
To request a specific amount of CPU for the run, declare the resource type as cpu and specify request and/or limit values.
resources:
  cpu:
    requests: "10"
    limits: "12"
RAM memory
To request a specific amount of memory (RAM) for the run, declare the resource type as mem and specify request and/or limit values.
resources:
  mem:
    requests: 32Gi
    limits: 64Gi
GPU
To request GPU resources, specify the resource type gpu and set the requested value as a limit.
resources:
  gpu:
    limits: "1"
Secrets
Users can request that a secret be injected into the run being launched by passing its identifier inside the secrets field.
Secrets must be stored via the platform: externally defined secrets (for example in k8s) are not accessible to users for security reasons.
secrets:
  - my-secret-key
Envs
Users can inject environment variables into the container being launched by passing the variable definitions as key/value pairs inside the envs field.
envs:
  - name: ENV1
    value: VALU123123
  - name: ENV2
    value: VALU123123
Node selection
Users can request a node selector for the run being launched by defining the selector(s) as a key/value list. The platform will add the selectors as-is to k8s resources such as Jobs, Pods, Deployments when appropriate.
node_selector:
  - key: selectorKey
    value: selectValue
See K8s Documentation for reference.
Tolerations
To define tolerations, add the definition inside the tolerations field of the spec, following Kubernetes specifications.
Please see the Kubernetes documentation.
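As an illustration, a toleration that allows the run to be scheduled on nodes tainted for dedicated GPU workloads could look like the sketch below. The fields key, operator, value and effect are the standard Kubernetes toleration fields. The taint key and value are hypothetical and must match the taints actually set on your cluster's nodes.

```yaml
tolerations:
  # Matches a hypothetical node taint "dedicated=gpu:NoSchedule"
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```

A toleration only permits scheduling on tainted nodes. It does not force the run onto them, so it is usually paired with a node selector or an affinity rule.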
Affinity
To define affinity, add the definition inside the affinity field of the spec, following Kubernetes specifications.
Please see the Kubernetes documentation.
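For instance, a node affinity rule that restricts the run to amd64 nodes might be written as in the sketch below. It assumes the affinity field accepts the Kubernetes structure verbatim, with the same key names used in Kubernetes manifests; if the platform schema expects a different key casing, adapt it accordingly. The label kubernetes.io/arch is a well-known Kubernetes node label.

```yaml
affinity:
  nodeAffinity:
    # Hard requirement: the pod is scheduled only on matching nodes
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: "kubernetes.io/arch"
              operator: "In"
              values:
                - "amd64"
```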
FS group
To properly map volumes mounted for runs, users can specify the group id used for mount operations. This step is required when the USER used to run the process does not match the default.
Define the fs_group field by specifying the group id as an integer.
fs_group: 1000
Run as user
The process run inside the container is owned by the USER defined in the container manifest. For security reasons, the platform does not allow containers to be run as root.
Users can ask for a different, specific user id to be used by defining the run_as_user field.
It accepts an integer value.
run_as_user: 1000
Run as group
The process run inside the container is owned by the GROUP defined in the container manifest. For security reasons, the platform does not allow containers to be run as root.
Users can ask for a different, specific group id to be used by defining the run_as_group field.
It accepts an integer value.
run_as_group: 1000
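The user, group and FS group settings are typically used together with a mounted volume, so that the process can read and write the mounted path. The sketch below combines the fields described in this section; it assumes they all sit side by side at the same level of the run spec, and the id 1000 is only a placeholder for a non-root id valid in your container image.

```yaml
# Assumption: all fields are siblings in the same run spec.
volumes:
  - volume_type: "persistent_volume_claim"
    name: "my-pvc"
    mount_path: "/data"
    spec:
      size: "10Gi"
run_as_user: 1000   # non-root user id that owns the process
run_as_group: 1000  # group id that owns the process
fs_group: 1000      # group id used when mounting the volume
```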