OpenInference Serve
The serve action deploys an openinference function as an inference endpoint on Kubernetes. A Task is created by calling run() on the Function; task parameters are passed through that call.
Overview
OpenInference functions are specialized Python handlers for model-serving scenarios. They define a model name and tensor schemas directly in the function specification, making the endpoint contract explicit.
Quick example
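import digitalhub as dh  # assumed import alias for the SDK

# Create the function; "my-project" is a placeholder project name.
function = dh.new_function(
    project="my-project",  # required when creating from the library
    name="my-openinference-function",
    kind="openinference",
    code_src="inference.py",
    handler="predict",
    python_version="PYTHON3_10",
    model_name="text-classifier",
    inputs=[{"name": "input-0", "shape": [-1], "datatype": "BYTES"}],
    outputs=[{"name": "output-0", "shape": [-1], "datatype": "FP32"}],
)

# Deploy the function as an inference endpoint.
run = function.run(
    action="serve",
    replicas=1,
    service_type="ClusterIP",
)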
Parameters
Function Parameters
These parameters are set when creating the function. Parameters marked Required must be provided; the others are optional.
| Name | Type | Description |
|---|---|---|
| project | str | Project name. Required only when creating from the library; otherwise MUST NOT be set. |
| name | str | Name that identifies the object. Required. |
| kind | str | Function kind. Must be openinference. Required. |
| uuid | str | Object ID in UUID4 format. |
| description | str | Description of the object. |
| labels | list[str] | List of labels. |
| embedded | bool | Whether the object should be embedded in the project. |
| code_src | str | URI pointing to the source code. |
| code | str | Source code provided as plain text. |
| base64 | str | Source code encoded as base64. |
| handler | str | Function entrypoint. |
| init_function | str | Init function name for remote execution. |
| python_version | str | Python version to use. Required. |
| lang | str | Source code language (informational). |
| image | str | Container image used to execute the function. |
| base_image | str | Base image (name:tag) used to build the execution image. |
| requirements | list | List of pip requirements to install into the execution image. |
| model_name | str | Logical model name exposed by the function. |
| inputs | list[dict] | Tensor definitions for the request payload. |
| outputs | list[dict] | Tensor definitions for the response payload. |
Python Versions
The Python runtime supports versions 3.10, 3.11, 3.12, and 3.13, expressed as the following python_version values:
PYTHON3_10, PYTHON3_11, PYTHON3_12, PYTHON3_13
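As a small illustration, the python_version string can be derived from a dotted version number. The helper `to_python_version` below is hypothetical and not part of the SDK; it only encodes the four values listed above.

```python
SUPPORTED_PYTHON_VERSIONS = ("PYTHON3_10", "PYTHON3_11", "PYTHON3_12", "PYTHON3_13")


def to_python_version(dotted: str) -> str:
    """Convert a dotted version such as "3.11" into the form "PYTHON3_11"."""
    candidate = "PYTHON" + dotted.strip().replace(".", "_")
    if candidate not in SUPPORTED_PYTHON_VERSIONS:
        raise ValueError(f"unsupported Python version: {dotted!r}")
    return candidate
```

The result can then be passed as python_version when creating the function.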
Init Function
The init function is the entrypoint used by the Nuclio init wrapper. Specify the init function name via the init_function parameter.
Base Image
The base image is the image (name:tag) used as the foundation when building the execution image for the function.
Requirements
Requirements are a list of strings representing packages to be installed by pip in the image where the function will be executed.
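The shapes of base_image and requirements described above can be checked locally before creating the function. This is a minimal sketch: `check_build_settings` is a hypothetical helper, and the image name and package pins are illustrative values only.

```python
def check_build_settings(base_image: str, requirements: list) -> None:
    """Raise ValueError if the values do not match the documented shapes."""
    name, sep, tag = base_image.rpartition(":")
    # A "/" after the last colon means the colon belonged to a registry port.
    if not sep or not name or not tag or "/" in tag:
        raise ValueError("base_image must be given as name:tag")
    for requirement in requirements:
        if not isinstance(requirement, str) or not requirement.strip():
            raise ValueError(f"invalid pip requirement: {requirement!r}")


# Illustrative values; pass them as base_image=... and requirements=...
check_build_settings("python:3.10-slim", ["numpy>=1.26", "scikit-learn==1.4.2"])
```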
Tensor Schema
Each item in inputs and outputs is a tensor definition with the following fields:
| Field | Type | Description |
|---|---|---|
| name | str | Tensor name. |
| shape | list[int] | Tensor shape. |
| datatype | str | Tensor datatype. Defaults to FP32. |
Supported tensor datatypes are: BOOL, BYTES, UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64, FP16, FP32, and FP64.
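The tensor rules above can be sketched as a small validator. `normalize_tensor` is a hypothetical helper, not an SDK function, and it assumes that name and shape are mandatory while datatype falls back to the documented default of FP32.

```python
SUPPORTED_DATATYPES = {
    "BOOL", "BYTES", "UINT8", "INT8", "UINT16", "INT16",
    "UINT32", "INT32", "UINT64", "INT64", "FP16", "FP32", "FP64",
}


def normalize_tensor(definition: dict) -> dict:
    """Return a tensor definition with defaults applied, or raise ValueError."""
    name = definition.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("tensor 'name' must be a non-empty string")

    shape = definition.get("shape")
    is_int_list = isinstance(shape, list) and all(
        isinstance(dim, int) and not isinstance(dim, bool) for dim in shape
    )
    if not is_int_list:
        raise ValueError("tensor 'shape' must be a list of integers")

    datatype = definition.get("datatype", "FP32")  # documented default
    if datatype not in SUPPORTED_DATATYPES:
        raise ValueError(f"unsupported datatype: {datatype!r}")

    return {"name": name, "shape": shape, "datatype": datatype}
```

Running each entry of inputs and outputs through such a check before calling new_function catches typos in datatype names early.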
Task Parameters
Can only be specified when calling function.run().
| Name | Type | Description |
|---|---|---|
| action | str | Task action. Required. Must be serve. |
| volumes | list[dict] | List of volumes. |
| resources | dict | Resource limits/requests. |
| envs | list[dict] | Environment variables. |
| secrets | list[str] | List of secret names. |
| profile | str | Profile template. |
| replicas | int | Number of replicas. |
| service_type | str | Kubernetes service type. |
| service_name | str | Name assigned to the created service. |
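The task parameters can be collected in a dictionary and unpacked into the run() call. This is a sketch under assumptions: the service name is a placeholder, and the name/value layout of each envs entry is assumed rather than confirmed by this page.

```python
# Keyword arguments for function.run(); only names from the table above are used.
run_kwargs = {
    "action": "serve",                      # required, must be "serve"
    "replicas": 2,
    "service_type": "NodePort",             # a standard Kubernetes service type
    "service_name": "text-classifier-svc",  # placeholder name
    "envs": [{"name": "LOG_LEVEL", "value": "INFO"}],  # assumed entry layout
}

# With a function object from new_function(), the call would be:
# run = function.run(**run_kwargs)
```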
Run Parameters
Can only be specified when calling function.run().
| Name | Type | Description |
|---|---|---|
| init_parameters | dict | Parameters supplied to the init function. |
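The relationship between init_function and init_parameters can be sketched as follows. The calling convention shown here (a context object plus the init_parameters entries as keyword arguments) is an assumption, as are the parameter names; a SimpleNamespace stands in for the platform-provided context so the snippet runs locally.

```python
from types import SimpleNamespace


def init(context, model_path="model.bin", threshold=0.5):
    """Hypothetical init function: runs once at startup and stores state on the context."""
    # A real implementation would load the model from model_path here.
    context.model_path = model_path
    context.threshold = threshold


# Local stand-in for the context object supplied by the runtime.
context = SimpleNamespace()

# Values that would be passed as init_parameters={...} to function.run().
init_parameters = {"model_path": "/models/classifier.bin", "threshold": 0.7}
init(context, **init_parameters)
```

Under this assumption, the handler can later read the stored values from the same context instead of reloading them on every request.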
Invocation payloads
The request body should follow the tensor schema defined by the function, for example:
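A minimal sketch of a request body for the quick-example function, assuming the endpoint accepts Open Inference Protocol (V2) style JSON in which each input tensor carries name, shape, datatype, and a data array. The data field and the sample text are assumptions; the tensor name and datatype come from the function's inputs definition.

```python
import json

# One BYTES tensor named "input-0", matching the function's declared input.
# The declared shape [-1] is replaced by the concrete size of this request.
request_body = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["this product is great"],  # sample text, illustrative only
        }
    ]
}

# Serialize to JSON before sending it to the service endpoint.
payload = json.dumps(request_body)
```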