Konduit Serving
Konduit Serving is a serving system and framework focused on deploying machine learning pipelines to production. The core abstraction is an idea called a “pipeline step”. An individual step is meant to perform a task as part of using a machine learning model in a deployment scenario. These steps generally include:
- Pre- or post-processing steps
- One or more machine learning models
- Transformations that turn the output into something humans can understand, such as labels in a classification example.
For instance, if you want to run arbitrary Python code for pre-processing purposes, you can use a PythonStep. To perform inference on TensorFlow, Keras, Deeplearning4j (DL4J) or PMML models, or a mix of them, use ModelStep.
Konduit Serving also contains functionality for other pre-processing tasks, such as DataVec transform processes or image transforms.
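To make the "pipeline step" idea concrete, here is a minimal sketch in plain Python. It does not use the Konduit Serving API: the names run_pipeline, scale_pixels, toy_model and to_label are hypothetical and exist only for illustration. It shows the general pattern of a pre-processing step, a model step and a post-processing step chained so that each one consumes the output of the previous one.

```python
from typing import Any, Callable, List

# Hypothetical names for illustration only; this is not the Konduit Serving API.
Step = Callable[[Any], Any]


def run_pipeline(steps: List[Step], data: Any) -> Any:
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


def scale_pixels(pixels: List[int]) -> List[float]:
    """Pre-processing step: map 0..255 pixel values to 0.0..1.0."""
    return [p / 255.0 for p in pixels]


def toy_model(features: List[float]) -> List[float]:
    """Stand-in for a model step: returns two class scores that sum to 1."""
    brightness = sum(features) / len(features)
    return [1.0 - brightness, brightness]


LABELS = ["dark", "bright"]


def to_label(scores: List[float]) -> str:
    """Post-processing step: turn raw scores into a human-readable label."""
    return LABELS[scores.index(max(scores))]


pipeline = [scale_pixels, toy_model, to_label]
result = run_pipeline(pipeline, [250, 240, 255, 230])
print(result)  # prints "bright"
```

In a real deployment the toy model would be replaced by an actual model step, and the pre- and post-processing functions by whatever transformations the model needs.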
Why Konduit Serving
Like Seldon or MLflow, Konduit Serving provides building blocks for developers to write their own production machine learning pipelines from pre-processing to model serving, exposable as a REST API.
Konduit Serving was built with the goal of providing proper low-level interoperability with native math libraries such as TensorFlow and libnd4j, the core math library of our very own DL4J.
At the core of Konduit Serving are the JavaCPP Presets, Vert.x and DL4J for running Keras models in Java.
Combining JavaCPP’s low-level access to C-like APIs from Java with Java’s robust server-side application development (Vert.x on top of Netty) gives production code access to faster math routines while minimizing the surface area of native code, which tends to bring more security flaws, mainly in server-side networked applications. This allows us to do things like zero-copy memory access to NumPy arrays or Arrow records, which can be consumed straight from the server without copy or serialization overhead.
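As a rough illustration of what "zero-copy" means, the sketch below uses only the Python standard library (array and memoryview) rather than NumPy, Arrow or Konduit Serving itself. It is an assumption-laden stand-in, not how the server is implemented: it only shows that a view shares memory with its source buffer, while a copy does not.

```python
from array import array

# A buffer of 64-bit floats standing in for tensor data held by a server.
buffer = array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview exposes the same memory without copying it.
view = memoryview(buffer)
window = view[1:3]  # slicing a memoryview is also zero-copy

# A list built from the buffer is an independent copy, for comparison.
copied = list(buffer)

# A write to the original buffer is visible through the view, not the copy.
buffer[1] = 20.0
print(window.tolist())  # prints [20.0, 3.0]
print(copied[1:3])      # prints [2.0, 3.0]
```

The view costs nothing to create regardless of buffer size, which is the property that matters when large arrays or records are handed between native code and the serving layer.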
When dealing with deep learning, we can run inference on the GPU and batch large workloads.
Extending that to the Python SDK, we know when to take a raw Arrow record and return it as a pandas DataFrame.
We also strive to provide a Python-first SDK that makes it easy to integrate Konduit Serving into a Python-first workflow.
Optionally, for the Java community, a Vert.x-based model server and pipeline development framework allow a thin abstraction that is embeddable in a Java microservice.
We want to expose modern standards for monitoring everything from your GPU to your inference time. Visualization can happen with applications such as Grafana or anything that integrates with the Prometheus standard for visualizing data.
Finally, we aim to provide integrations with more enterprise platforms typically seen outside the big data space.