Machine Learning Model Serving Tools
- 01. Neptune is a metadata store for MLOps that keeps you in control of your models and experiments by organizing all metadata in a single place.
- 02. BentoML is an open platform that simplifies ML model deployment and enables you to serve your models at production scale in minutes.
- 03. Cortex deploys ML models as realtime HTTP or gRPC APIs and seamlessly scales inference across CPU or GPU instances.
- 04. TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments.
- 05. TorchServe is a flexible, easy-to-use tool for serving PyTorch models.
- 06. KServe enables serverless inferencing on Kubernetes and provides performant, high-abstraction interfaces for common ML frameworks (TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX) to solve production model serving use cases.
- 07. Multi Model Server (MMS) is a tool for serving deep learning models exported from MXNet or the Open Neural Network Exchange (ONNX).
- 08. NVIDIA Triton Inference Server deploys trained AI models from any framework (TensorFlow, NVIDIA TensorRT, PyTorch, ONNX Runtime), from local storage or the cloud, on GPU- or CPU-based infrastructure.
- 09. ForestFlow is a scalable, policy-based, cloud-native model server for easily deploying and managing ML models.
- 10. DeepDetect is a deep learning API and server, along with a web platform for training and managing models.
- 11. Seldon deploys models at scale so they can get to work faster, and minimises risk through interpretable results and transparent model performance.
- 12. An MLflow Model is a standard format for packaging ML models that can be used in a variety of downstream tools, such as real-time serving through a REST API or batch inference on Apache Spark.
- 13. OpenVINO Model Server (OVMS) is a high-performance system for serving machine learning models.
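Several of the servers above (TensorFlow Serving, KServe, Triton) expose similar JSON-over-HTTP prediction endpoints. As a minimal sketch, the helper below builds the URL path and request body for TensorFlow Serving's documented REST predict API (`/v1/models/<name>:predict` with an `"instances"` list); the model name `my_model` and the sample input are illustrative, and no server is actually contacted.

```python
import json


def build_predict_request(model_name, instances):
    """Build the URL path and JSON body for a TensorFlow Serving
    REST predict call. The path shape follows the documented API:
    POST /v1/models/<model_name>:predict with {"instances": [...]}."""
    path = f"/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return path, body


# Hypothetical model name and input batch, for illustration only.
path, body = build_predict_request("my_model", [[1.0, 2.0, 3.0]])
print(path)  # /v1/models/my_model:predict
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```

In practice the body would be POSTed to the server's host and port (8501 by default for TensorFlow Serving's REST interface) with any HTTP client.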