
Run

The Run object is created by calling run() on a Function. Run-level parameters are provided alongside task parameters.

Parameters (Shared)

This runtime defines no shared run parameters.

Function Kind-Specific Parameters

KubeAI Text & Speech

Name           Type              Description
env            dict              Environment variables.
args           list[str]         Arguments.
cache_profile  str               Cache profile.
files          list[KubeaiFile]  Files.
scaling        Scaling           Scaling parameters.
processors     int               Number of processors.
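
As a minimal sketch of how these parameters might be assembled before calling run(), the snippet below checks a parameter dict against the types listed in the table. The helper `check_run_params` and all example values are hypothetical and not part of the SDK; `files` and `scaling` are checked only as a plain list and dict, which is an assumption based on the structures shown in the following sections.

```python
from typing import Any

# Expected Python types for each documented run parameter.
# "files" and "scaling" are checked only as list/dict here (assumption).
EXPECTED_TYPES: dict[str, type] = {
    "env": dict,
    "args": list,
    "cache_profile": str,
    "files": list,
    "scaling": dict,
    "processors": int,
}


def check_run_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems found in `params` (empty if none)."""
    problems = []
    for name, value in params.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# All values below are placeholders chosen for illustration.
run_params = {
    "env": {"LOG_LEVEL": "info"},
    "args": ["--example-flag", "value"],
    "processors": 1,
}

print(check_run_params(run_params))

# With an existing Function object, the parameters would then be passed
# alongside the task parameters, for example:
# run = function.run(**run_params)
```

Checking the dict up front surfaces type mistakes locally, before the run is submitted.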

Files

Files is a list of dicts, each with the following keys:

files = [
    {
        "path": "file-path",
        "content": "file-content"
    }
]
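
The list can also be built programmatically. The following is a small sketch, assuming file contents are already available as strings; the helper name `files_from_mapping` and the example path are hypothetical.

```python
def files_from_mapping(contents: dict[str, str]) -> list[dict[str, str]]:
    """Turn a {path: content} mapping into the list-of-dicts shape shown above."""
    return [
        {"path": path, "content": content}
        for path, content in contents.items()
    ]


# Placeholder path and content, for illustration only.
files = files_from_mapping({"/config/prompt.txt": "You are a helpful assistant."})
print(files)
```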

Scaling

Scaling is a Scaling object that represents the scaling parameters for the run. Its structure is as follows:

scaling = {
    "replicas": int,
    "min_replicas": int,
    "max_replicas": int,
    "autoscaling_disabled": bool,
    "target_request": int,
    "scale_down_delay_seconds": int,
    "load_balancing": {
        "strategy": str,  # "LeastLoad" or "PrefixHash"
        "prefix_hash": {
            "mean_load_factor": int,
            "replication": int,
            "prefix_char_length": int
        }
    }
}
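
A concrete instance of this structure might look like the sketch below. All numeric values are placeholders, and the helper `validate_scaling` is hypothetical: it only checks that the strategy is one of the two values named above and, as an assumed sanity rule, that min_replicas does not exceed max_replicas.

```python
from typing import Any

# Strategy names taken from the structure above.
ALLOWED_STRATEGIES = ("LeastLoad", "PrefixHash")


def validate_scaling(scaling: dict[str, Any]) -> None:
    """Raise ValueError if the scaling dict is internally inconsistent."""
    lo = scaling.get("min_replicas")
    hi = scaling.get("max_replicas")
    # Assumed sanity rule: the lower bound must not exceed the upper bound.
    if lo is not None and hi is not None and lo > hi:
        raise ValueError("min_replicas must not exceed max_replicas")
    strategy = scaling.get("load_balancing", {}).get("strategy")
    if strategy is not None and strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown load balancing strategy: {strategy!r}")


# Placeholder values, for illustration only.
scaling = {
    "min_replicas": 1,
    "max_replicas": 3,
    "autoscaling_disabled": False,
    "target_request": 100,
    "scale_down_delay_seconds": 30,
    "load_balancing": {
        "strategy": "PrefixHash",
        "prefix_hash": {
            "mean_load_factor": 125,
            "replication": 256,
            "prefix_char_length": 100,
        },
    },
}

validate_scaling(scaling)  # passes silently for a consistent configuration
```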

Methods

Once the run is created, you can access its attributes and methods through the run object.

invoke

Invoke the served model. By default, it exposes the infer v2 endpoint.

Parameters:

Name        Type  Description                                Default
model_name  str   Name of the model.                         None
method      str   Method of the request.                     'POST'
url         str   URL of the request.                        None
**kwargs    dict  Keyword arguments to pass to the request.  {}

Returns:

Type      Description
Response  Response from the request.
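
A call to invoke could be prepared along the lines of the sketch below. The helper `build_invoke_kwargs` is hypothetical. It assumes that extra keyword arguments such as `json` and `timeout` are forwarded to an underlying requests-style HTTP call, which the parameter table suggests but does not spell out. The request body is a placeholder and must be replaced with whatever the served endpoint expects.

```python
from typing import Any


def build_invoke_kwargs(
    model_name: str,
    payload: dict[str, Any],
    timeout: float = 30.0,
) -> dict[str, Any]:
    """Collect the arguments for run.invoke() in one dict.

    `json` and `timeout` are assumed to be passed through via **kwargs.
    """
    return {
        "model_name": model_name,
        "method": "POST",  # documented default, stated explicitly here
        "json": payload,
        "timeout": timeout,
    }


# Placeholder model name and body, for illustration only.
kwargs = build_invoke_kwargs("my-model", {"example": "request body"})
print(kwargs)

# With a running Run object, the call would then be:
# response = run.invoke(**kwargs)
```

Leaving `url` unset keeps the default endpoint; pass it explicitly only when a different endpoint of the served model is the target.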