Run
The Run object is created by calling run() on a Function. Run-level parameters are provided alongside task parameters.
Parameters (Shared)
This runtime has no shared run parameters.
Function Kind-Specific Parameters
KubeAI Text & Speech
Name | Type | Description |
---|---|---|
env | dict | Environment variables. |
args | list[str] | Arguments. |
cache_profile | str | Cache profile. |
files | list[KubeaiFile] | Files. |
scaling | Scaling | Scaling parameters. |
processors | int | Number of processors. |
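As a rough illustration of the table above, the simple run-level parameters can be collected in a dictionary and type-checked before being passed to run(). This is a minimal sketch: `EXPECTED_TYPES`, `check_run_params`, and all values are hypothetical names and placeholders, and the commented-out `function.run(...)` call (including the action name) is an assumption, not a confirmed signature.

```python
# Expected Python types for the simple run-level parameters in the table above.
# `files` and `scaling` are structured values and are not checked here.
EXPECTED_TYPES = {
    "env": dict,
    "args": list,
    "cache_profile": str,
    "processors": int,
}


def check_run_params(params: dict) -> None:
    """Raise TypeError if a known parameter has an unexpected type."""
    for name, value in params.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(
                f"{name} must be {expected.__name__}, got {type(value).__name__}"
            )
    # `args` is documented as list[str], so every element must be a string.
    if not all(isinstance(arg, str) for arg in params.get("args", [])):
        raise TypeError("args must contain only strings")


# Placeholder values, for illustration only.
run_params = {
    "env": {"LOG_LEVEL": "info"},
    "args": ["--example-flag", "value"],
    "processors": 1,
}
check_run_params(run_params)

# Hypothetical call; `function` and the action name are assumptions:
# run = function.run(action="serve", **run_params)
```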
Files
Files is a list of dictionaries with the following keys:
Scaling
Scaling is a Scaling object that represents the scaling parameters for the run. Its structure is as follows:
scaling = {
    "replicas": int,
    "min_replicas": int,
    "max_replicas": int,
    "autoscaling_disabled": bool,
    "target_request": int,
    "scale_down_delay_seconds": int,
    "load_balancing": {
        "strategy": str,  # "LeastLoad" or "PrefixHash"
        "prefix_hash": {
            "mean_load_factor": int,
            "replication": int,
            "prefix_char_length": int
        }
    }
}
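The structure above lists types rather than values. A concrete instance might look like the following sketch. All numbers are illustrative placeholders, not recommended or default settings, and `ALLOWED_STRATEGIES` and `check_strategy` are hypothetical helpers added here only to show the two documented strategy names being checked.

```python
# The two load-balancing strategies named in the structure above.
ALLOWED_STRATEGIES = {"LeastLoad", "PrefixHash"}

# Illustrative values only; tune them for your own deployment.
scaling = {
    "replicas": 1,
    "min_replicas": 1,
    "max_replicas": 3,
    "autoscaling_disabled": False,
    "target_request": 100,
    "scale_down_delay_seconds": 30,
    "load_balancing": {
        "strategy": "PrefixHash",
        "prefix_hash": {
            "mean_load_factor": 125,
            "replication": 256,
            "prefix_char_length": 100,
        },
    },
}


def check_strategy(scaling: dict) -> str:
    """Return the configured strategy, or raise ValueError if it is unknown."""
    strategy = scaling["load_balancing"]["strategy"]
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown load balancing strategy: {strategy!r}")
    return strategy


check_strategy(scaling)
```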
Methods
Once the run is created, you can access its attributes and methods through the run object.
invoke
Invoke the served model. By default, it calls the infer v2 endpoint.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name | str | Name of the model. | None |
method | str | Method of the request. | 'POST' |
url | str | URL of the request. | None |
**kwargs | dict | Keyword arguments to pass to the request. | {} |
Returns:
Type | Description |
---|---|
Response | Response from the request. |
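The parameters of invoke can be assembled ahead of the call, as in the sketch below. It assumes the extra keyword arguments are forwarded to an HTTP client that accepts a `json` body, which the source does not confirm. `build_invoke_kwargs`, the model name, and the payload are hypothetical, and the payload shape is only an example of a v2-style inference request.

```python
from typing import Any, Optional


def build_invoke_kwargs(
    model_name: str,
    payload: dict,
    url: Optional[str] = None,
    method: str = "POST",
) -> dict:
    """Assemble keyword arguments matching the invoke parameter table."""
    kwargs: dict = {
        "model_name": model_name,
        "method": method,
        # Assumption: `json` is passed through **kwargs to the HTTP request.
        "json": payload,
    }
    # Leave `url` out when not given so the documented default (None) applies.
    if url is not None:
        kwargs["url"] = url
    return kwargs


# Placeholder model name and payload, for illustration only.
payload: dict = {
    "inputs": [
        {"name": "input-0", "shape": [1], "datatype": "BYTES", "data": ["Hello"]}
    ]
}
invoke_kwargs = build_invoke_kwargs("my-model", payload)

# With a running `run` object, the call would then look like:
# response = run.invoke(**invoke_kwargs)
```

Omitting `url` keeps the documented default of None, so the run decides which endpoint to call.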