OpenInference Serve

The serve action deploys an openinference function as an inference endpoint on Kubernetes. A Task is created by calling run() on the Function; task parameters are passed through that call.

Overview

OpenInference functions are specialized Python handlers for model-serving scenarios. They define a model name and tensor schemas directly in the function specification, making the endpoint contract explicit.

Quick example

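import digitalhub as dh

# Define the function: the handler `predict` in inference.py serves the model
# "text-classifier" with one BYTES input tensor and one FP32 output tensor.
function = dh.new_function(
    project="my-project",  # placeholder name; required when creating from the library
    name="my-openinference-function",
    kind="openinference",
    code_src="inference.py",
    handler="predict",
    python_version="PYTHON3_10",
    model_name="text-classifier",
    inputs=[{"name": "input-0", "shape": [-1], "datatype": "BYTES"}],
    outputs=[{"name": "output-0", "shape": [-1], "datatype": "FP32"}],
)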

run = function.run(
    action="serve",
    replicas=1,
    service_type="ClusterIP"
)

Parameters

Function Parameters

These parameters are set when creating the function.

| Name | Type | Description |
| --- | --- | --- |
| project | str | Project name. Required only when creating from the library; otherwise MUST NOT be set. |
| name | str | Name that identifies the object. Required. |
| kind | str | Function kind. Must be openinference. Required. |
| uuid | str | Object ID in UUID4 format. |
| description | str | Description of the object. |
| labels | list[str] | List of labels. |
| embedded | bool | Whether the object should be embedded in the project. |
| code_src | str | URI pointing to the source code. |
| code | str | Source code provided as plain text. |
| base64 | str | Source code encoded as base64. |
| handler | str | Function entrypoint. |
| init_function | str | Init function name for remote execution. |
| python_version | str | Python version to use. Required. |
| lang | str | Source code language (informational). |
| image | str | Container image used to execute the function. |
| base_image | str | Base image (name:tag) used to build the execution image. |
| requirements | list | List of pip requirements to install into the execution image. |
| model_name | str | Logical model name exposed by the function. |
| inputs | list[dict] | Tensor definitions for the request payload. |
| outputs | list[dict] | Tensor definitions for the response payload. |

Python Versions

The Python runtime supports versions 3.10, 3.11, 3.12, and 3.13. Set python_version to one of the following identifiers:

  • PYTHON3_10
  • PYTHON3_11
  • PYTHON3_12
  • PYTHON3_13
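
The identifiers above follow a fixed pattern, so a dotted version string can be converted mechanically. The following is a minimal sketch; the helper name to_python_version_id and the constant SUPPORTED_PYTHON_VERSIONS are hypothetical and not part of the SDK.

```python
# Hypothetical helper (not part of the SDK): maps "3.10" to "PYTHON3_10".
SUPPORTED_PYTHON_VERSIONS = ("3.10", "3.11", "3.12", "3.13")


def to_python_version_id(version: str) -> str:
    """Return the python_version identifier for a dotted version string."""
    if version not in SUPPORTED_PYTHON_VERSIONS:
        raise ValueError(
            f"Unsupported Python version {version!r}; "
            f"expected one of {', '.join(SUPPORTED_PYTHON_VERSIONS)}"
        )
    major, minor = version.split(".")
    return f"PYTHON{major}_{minor}"


print(to_python_version_id("3.12"))
```

Rejecting unsupported versions up front gives a clearer error than passing an unknown identifier to the function definition.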

Init Function

The init function is the entrypoint used by the Nuclio init wrapper. Specify the init function name via the init_function parameter.

Base Image

The base image is the image (name:tag) used as the foundation when building the execution image for the function.

Requirements

Requirements are a list of strings, each naming a package that pip installs into the image where the function is executed.

Tensor Schema

Each item in inputs and outputs is a tensor definition with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| name | str | Tensor name. |
| shape | list[int] | Tensor shape. |
| datatype | str | Tensor datatype. Defaults to FP32. |

Supported tensor datatypes are: BOOL, BYTES, UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64, FP16, FP32, and FP64.
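
Tensor definitions can be checked locally before they are passed as inputs or outputs. The sketch below is a hypothetical helper, not part of the SDK. It applies the FP32 default and the datatype list given above, and it assumes that name and shape are required, which this page does not state explicitly.

```python
# Datatypes listed in this section.
SUPPORTED_DATATYPES = frozenset({
    "BOOL", "BYTES", "UINT8", "INT8", "UINT16", "INT16",
    "UINT32", "INT32", "UINT64", "INT64", "FP16", "FP32", "FP64",
})
DEFAULT_DATATYPE = "FP32"


def normalize_tensor(definition: dict) -> dict:
    """Validate one tensor definition and fill in the default datatype.

    Assumption: name and shape are treated as required here.
    """
    name = definition.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("tensor 'name' must be a non-empty string")

    shape = definition.get("shape")
    if not isinstance(shape, list) or not all(
        isinstance(dim, int) and not isinstance(dim, bool) for dim in shape
    ):
        raise ValueError(f"tensor {name!r}: 'shape' must be a list of integers")

    datatype = definition.get("datatype", DEFAULT_DATATYPE)
    if datatype not in SUPPORTED_DATATYPES:
        raise ValueError(f"tensor {name!r}: unsupported datatype {datatype!r}")

    return {"name": name, "shape": shape, "datatype": datatype}


# A definition without a datatype falls back to FP32.
print(normalize_tensor({"name": "output-0", "shape": [-1]}))
```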

Task Parameters

These parameters can only be specified when calling function.run().

| Name | Type | Description |
| --- | --- | --- |
| action | str | Task action. Required. Must be serve. |
| volumes | list[dict] | List of volumes. |
| resources | dict | Resource limits/requests. |
| envs | list[dict] | Environment variables. |
| secrets | list[str] | List of secret names. |
| profile | str | Profile template. |
| replicas | int | Number of replicas. |
| service_type | str | Kubernetes service type. |
| service_name | str | Name assigned to the created service. |

Run Parameters

These parameters can only be specified when calling function.run().

| Name | Type | Description |
| --- | --- | --- |
| init_parameters | dict | Parameters supplied to the init function. |
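
To illustrate how init_parameters relates to the init function, the sketch below runs an init function locally. It rests on assumptions this page does not confirm: that the init function receives a context object as its first argument, and that the entries of init_parameters arrive as keyword arguments. The parameter names model_path and threshold are hypothetical, and a SimpleNamespace stands in for the runtime context.

```python
from types import SimpleNamespace


def init(context, model_path="model.bin", threshold=0.5):
    """Hypothetical init function: stores settings on the context.

    Assumption: the runtime calls this once with a context object and the
    init_parameters entries as keyword arguments.
    """
    context.model_path = model_path
    context.threshold = threshold


# The dict that would be passed as init_parameters to function.run().
init_parameters = {"model_path": "models/text-classifier.bin", "threshold": 0.7}

# Local stand-in for the runtime: a bare namespace plays the context.
context = SimpleNamespace()
init(context, **init_parameters)
print(context.model_path, context.threshold)
```

Giving the init function's arguments default values keeps it usable when init_parameters is omitted.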

Invocation payloads

The request body should follow the tensor schema defined by the function. For example:

{
  "inputs": [
    {
      "name": "input-0",
      "shape": [-1],
      "datatype": "BYTES",
      "data": ["hello world"]
    }
  ]
}
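
A request body like the one above can be assembled from the same tensor definitions used to create the function. This is a minimal sketch using only the standard library. The helper build_request is hypothetical, and it copies name, shape, and datatype from the definitions exactly as declared.

```python
import json


def build_request(input_defs, data_by_name):
    """Build a request body from tensor definitions and per-tensor data."""
    inputs = []
    for definition in input_defs:
        name = definition["name"]
        if name not in data_by_name:
            raise KeyError(f"no data supplied for input tensor {name!r}")
        inputs.append({
            "name": name,
            "shape": definition["shape"],
            # FP32 is the documented default datatype.
            "datatype": definition.get("datatype", "FP32"),
            "data": data_by_name[name],
        })
    return {"inputs": inputs}


# Same input definition as in the quick example.
input_defs = [{"name": "input-0", "shape": [-1], "datatype": "BYTES"}]

body = json.dumps(build_request(input_defs, {"input-0": ["hello world"]}), indent=2)
print(body)
```

The resulting JSON string can be used as the body of a request to the served endpoint; the endpoint address depends on the deployment and is not shown here.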