Quick Start

To start with DigitalHub, the first step is to install the platform and all its components. DigitalHub relies on Kubernetes, a state-of-the-art open-source platform for deploying, orchestrating, and running containerized applications. While it is possible to run DigitalHub on any Kubernetes installation, the quickest way is to deploy it on Minikube, a local Kubernetes environment with minimal settings. See here the instructions on how to set up DigitalHub on Minikube.

Once installed, you can access the different platform components and perform a variety of operations: exploratory data science with Jupyter Notebooks, creating projects for data processing or ML tasks, managing the necessary resources (e.g., databases or datalake buckets), creating and running different functions, and more.

Platform Components and Functionality

To access the different components of the platform, start from the landing page, where the components are linked:

  • Use Coder to create interactive workspaces, such as Jupyter Notebooks, to perform exploratory tasks and to access and manage data. See how to use Workspaces for these types of activities.
  • Use DH Core UI to manage your data science and ML projects and perform management activities, such as creating data items and defining and executing different functions and operations. Please note that these tasks may also be performed directly with the DH Core Python SDK from your interactive environment. See how to use DH Console for the management operations.
  • To see and manage the relevant Kubernetes resources (e.g., services, jobs, secrets), as well as custom resources of the platform (e.g., databases, S3 buckets, data services), use Kubernetes Resource Manager. The operations and the functionality of the tool are described in the Resource Management with KRM section of the documentation.
  • Use the Minio browser to navigate your datalake and to upload and manage files. The datalake is based on the S3 protocol and can also be used programmatically. See the Data and Transformations section on how the data abstraction layer is defined and implemented.
  • If you perform ML tasks with the Python runtime, you can prepare data and create and log ML Models using DH Core (see, e.g., Python Runtime if you want to use MLRun operations through DH Core). Alternatively, you can use the MLRun subsystem of the platform; the MLRun UI provides information about the data, models, jobs, and services operated with MLRun. See the MLRun documentation on how to use MLRun directly.
  • Use the Nuclio serverless platform to deploy and expose functions as services within the platform. Nuclio is also used by MLRun to serve its ML Models at runtime. See the Nuclio documentation on how to use Nuclio in different scenarios.
  • It is possible to organize the data and ML operations into complex pipelines. Currently the platform relies on the Kubeflow Pipelines component for this purpose, orchestrating the activities as individual Kubernetes Jobs. See more on this in the corresponding Pipelines section.

Tutorials

Start exploring the platform through a series of tutorials that explain the key usage scenarios of the DigitalHub platform. Specifically