Machine Learning Model Serving Tools
- 01. Neptune is a metadata store for MLOps that keeps you in control of your models and experiments by organizing all metadata in a single place.
- 02. BentoML is an open platform that simplifies ML model deployment and enables you to serve your models at production scale in minutes.
- 03. Cortex deploys ML models as realtime HTTP or gRPC APIs and seamlessly scales inference across CPU or GPU instances.
- 04. TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments.
- 05. TorchServe is a flexible, easy-to-use tool for serving PyTorch models.
- 06. KServe enables serverless inferencing on Kubernetes and provides performant, high-abstraction interfaces for common ML frameworks (TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX) to solve production model serving use cases.
- 07. Multi Model Server (MMS) is a tool for serving deep learning models exported from MXNet or the Open Neural Network Exchange (ONNX).
- 08. NVIDIA Triton Inference Server deploys trained AI models from any framework (TensorFlow, NVIDIA TensorRT, PyTorch, ONNX Runtime), from local storage or the cloud, on GPU- or CPU-based infrastructure.
- 09. ForestFlow is a scalable, policy-based, cloud-native model server for easily deploying and managing ML models.
- 10. DeepDetect is a deep learning API and server, along with a web platform for training and managing models.
- 11. Seldon deploys models at scale so they can get to work faster, and minimises risk through interpretable results and transparent model performance.
- 12. An MLflow Model is a standard format for packaging ML models that can be used in a variety of downstream tools, such as real-time serving through a REST API or batch inference on Apache Spark.
- 13. OpenVINO Model Server (OVMS) is a high-performance system for serving machine learning models.
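Several of the servers above (TensorFlow Serving, KServe, Triton) expose similar JSON-over-HTTP prediction endpoints. As a minimal sketch, the helper below builds the URL path and request body for TensorFlow Serving's documented REST predict API (`/v1/models/<name>:predict` with an `"instances"` list); the model name `my_model` and the sample input are illustrative, and no server is actually contacted.

```python
import json


def build_predict_request(model_name, instances):
    """Build the URL path and JSON body for a TensorFlow Serving
    REST predict call. The path shape follows the documented API:
    POST /v1/models/<model_name>:predict with {"instances": [...]}."""
    path = f"/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return path, body


# Hypothetical model name and input batch, for illustration only.
path, body = build_predict_request("my_model", [[1.0, 2.0, 3.0]])
print(path)  # /v1/models/my_model:predict
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```

In practice the body would be POSTed to the server's host and port (8501 by default for TensorFlow Serving's REST interface) with any HTTP client.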