Releases: pytorch/serve
TorchServe v0.3.0 Release Notes (Beta)
This is the release of TorchServe v0.3.0.
Highlights:
- Native Windows support - Added support for TorchServe on Windows 10 Pro and Windows Server 2019
- KFServing Integration - Added support for v1 KFServing predict and explain APIs with auto-scaling and canary deployments for serving models in Kubeflow/KFServing
- MLflow-TorchServe - New MLflow deployment plugin for serving PyTorch models through the MLflow MLOps lifecycle
- Captum explanations - Added an explain API that uses Captum for model interpretability (see the first sketch after this list)
- AKS Support - Added support for TorchServe deployment on Azure Kubernetes Service
- GKE Support - Added support for TorchServe deployment on Google Kubernetes Engine
- gRPC support - Added support for gRPC-based management and inference APIs (a rough client sketch follows this list)
- Request Envelopes - Added support for request envelopes, which parse requests from model-serving frameworks such as Seldon and KFServing without any modifications to the handler code
- PyTorch 1.7.1 support - TorchServe is now certified to work with torch 1.7.1, torchvision 0.8.2, torchtext 0.8.1, and torchaudio 0.7.2
- TorchServe Profiling - Added end-to-end profiling of inference requests. The time TorchServe spends on different events for an inference request is captured in the TorchServe metrics logs
- Serving SDK - Released TorchServe Serving SDK 0.4.0 on Maven with contracts/interfaces for the Metric Endpoint plugin and Snapshot plugins
- Naked DIR support - Added support for model archives as naked directories via the `--archive-format no-archive` option
- Local file URL support - Added support for registering models through local file (`file:///`) URLs (see the registration sketch after this list)
- Install dependencies - Added a more robust dependency-installation script certified across different OS platforms (Ubuntu 18.04, macOS, Windows 10 Pro, Windows Server 2019)
- Link Checker - Added a link checker to the sanity script to report any broken links in documentation
- Enhanced model description - Added GPU usage info and worker PID to the model description
- FAQ guides - Added answers to the questions most frequently asked by community users
- Troubleshooting guide - Added documentation for troubleshooting common problems related to model serving by TorchServe
- Use case guide - Added reference use cases, i.e., the different ways TorchServe can be deployed to serve different types of PyTorch models
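To illustrate the new explain API alongside the existing inference API, here is a minimal Python sketch against the REST endpoints; the model name mnist and the input file 0.png are assumptions for illustration, and the registered model's handler must be Captum-enabled for explanations to work.

```python
import requests

# Assumptions: TorchServe runs locally with a Captum-enabled model named
# "mnist" registered, and "0.png" is a sample input image.
with open("0.png", "rb") as f:
    data = f.read()

# Standard inference request against the inference API (default port 8080).
pred = requests.post("http://localhost:8080/predictions/mnist", data=data)
print(pred.json())

# Captum-backed explanation for the same input via the new explain API.
exp = requests.post("http://localhost:8080/explanations/mnist", data=data)
print(exp.json())
```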
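Similarly, a minimal sketch of registering a model through a local `file:///` URL via the management API; the archive path and model name below are hypothetical.

```python
import requests

# Hypothetical local archive path; the file:/// scheme is new in this release.
# initial_workers asks TorchServe to spin up workers right after registration.
resp = requests.post(
    "http://localhost:8081/models",  # management API (default port 8081)
    params={
        "url": "file:///home/user/model-store/densenet161.mar",
        "initial_workers": 1,
    },
)
print(resp.status_code, resp.text)
```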
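And a rough sketch of the new gRPC inference API. It assumes Python stubs generated from the inference.proto shipped in the repository; the stub, message, and field names below follow the repo's example gRPC client and may differ from your generated code.

```python
import grpc

# Assumed to be generated from TorchServe's inference.proto (e.g. with
# python -m grpc_tools.protoc); names mirror the repo's example client.
import inference_pb2
import inference_pb2_grpc

with open("kitten.jpg", "rb") as f:  # hypothetical sample input
    data = f.read()

# Assumed default gRPC inference port 7070 (management on 7071).
channel = grpc.insecure_channel("localhost:7070")
stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)
response = stub.Predictions(
    inference_pb2.PredictionsRequest(model_name="densenet161", input={"data": data})
)
print(response.prediction.decode("utf-8"))
```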
Other PRs since v0.2.0
Bug Fixes:
- Fixed unbound variable issue while creating binaries from script #595
- Fixed model latency calculation logic #630
- Treat application/x-www-form-urlencoded as binary data #705
- Fixed an issue where socket.send does not guarantee that all data will be sent #765
- Fixed bug in create_mar.sh script of Text_to_Speech_Synthesizer #704
- Docker fixes #709 #724 #642 #823 #839 #853 #880
- Unit and regression test fixes #774 #775 #827 #845 #858 #852
- Install scripts fixes #798 #837 #844 #836
- Benchmark fixes #768
- Dependency fixes #757 #820
- Temp path fixes #877 #638
- Migrate model urls #697 #696 #695
Others
- Added metrics endpoint to CloudFormation templates and Kubernetes setup #670 #747
- Environment information header in regression and sanity suite #622 #865 #863
- Documentation changes and fixes #754 #470 #816 #584 #872 #871 #879 #739
- FairSeq language translation example #592
- Additional regression tests for KFServing #855
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
Getting Started with TorchServe
You can get started at https://pytorch.org/serve/ with installation instructions, tutorials, and docs.
If you have questions, please post them in the PyTorch discussion forums using the ‘deployment’ tag, or file an issue on GitHub with a way to reproduce.
TorchServe v0.2.0 Release Notes (Beta)
This is the release of TorchServe v0.2.0.
Highlights:
- Kubernetes Support - TorchServe deployment in Kubernetes using Helm charts and a persistent volume
- Prometheus metrics - Added Prometheus as the default metrics framework
- Requirements.txt support - Added support for specifying model-specific dependencies as a requirements file within a `.mar` archive; cleaned up unused parameters and added relevant ones for torch-model-archiver
- PyTorch Scripted Models Support - Scripted model versions added to the model zoo; added testing for scripted models
- Default Handler Refactor (breaking changes) - The default handlers have been refactored for code reuse and enhanced post-processing support. More details in the Backwards Incompatible Changes section below
- Windows Support - Added support for TorchServe on Windows Subsystem for Linux
- AWS Cloud Formation Support - Added support for multi-node AutoScaling Group deployment, behind an Elastic Load Balancer using Elastic File System as the backing store
- Benchmark and Testing Enhancements - Added models to benchmark and sanity tests, support for throughput measurement with batch processing in benchmarking, and Docker support for JMeter and Apache Bench tests
- Regression Suite Enhancements - Added new Postman-based test cases for APIs and pytest-based intrusive test cases
- Docker Improvements - Consolidated dev and codebuild dockerfiles
- Install and Build Script Streamlining - Unified install scripts, added code coverage and sanity script
- Python Linting - More exhaustive Python linting checks across TorchServe and Model Archiver
Backwards Incompatible Changes
- Default Handler Refactor:
- The default handlers have been refactored for code reuse and enhanced post-processing support. The output format for some of the following examples/models has been enhanced to include additional details like score/class probability.
- The following default handlers have been equipped with batch support. With batch support built in, the resnet_152_batch example is no longer a custom handler example.
- image_classifier
- object_detector
- image_segmenter
- The index_to_name.json file used for the class-to-name mapping has been standardized across vision/text related default handlers
- Refactoring and code reuse have reduced boilerplate code across all the `serve/examples`.
- Custom handler documentation has been restructured and enhanced to cover the different ways to build simple or complex custom handlers (a minimal sketch follows this section)
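For orientation, here is a minimal custom-handler sketch in the module-level handle(data, context) style; the file name, model file, and input format are hypothetical, and the handlers under serve/examples remain the authoritative reference.

```python
# my_handler.py -- hypothetical minimal custom handler. TorchServe invokes the
# module-level handle(data, context) entry point with a batch of requests.
import os
import torch

model = None

def handle(data, context):
    global model
    if model is None:
        # Lazy initialization: load the TorchScript model packaged in the archive.
        model_dir = context.system_properties.get("model_dir")
        model = torch.jit.load(os.path.join(model_dir, "model.pt"))
        model.eval()
    if data is None:
        # TorchServe may call the entry point once at load time with no data.
        return None
    # With batch support, `data` holds one entry per request; here we assume
    # each request body is a JSON array of floats.
    batch = torch.stack(
        [torch.as_tensor(row.get("data") or row.get("body"), dtype=torch.float32)
         for row in data]
    )
    with torch.no_grad():
        outputs = model(batch)
    # Return one result per request, as the batching contract requires.
    return outputs.argmax(dim=1).tolist()
```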
Other PRs since v0.1.1
Bug Fixes:
- Fixed `NameError` in the default image_classifier handler #489
- Fixed timeout errors during build #420 and unit tests #493
- Fixed an error when loading on CPU a model that was saved on GPU #444
- Fixed Snapshot not being emitted after unregistering model with no workers #491
- Made the Inference API description conformant to OpenAPI #372
- Removed duplicate snapshot server property #318
- Fixed tag for latest CPU version in README #452
- Added check for no objects detected in object detector #447
- Fixed incorrect set up of default workers per model #513
- Fixed model-archiver to accept handler name or handler_name:entry_pnt_func combinations #472
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
Getting Started with TorchServe
You can get started at https://pytorch.org/serve/ with installation instructions, tutorials, and docs.
If you have questions, please post them in the PyTorch discussion forums using the ‘deployment’ tag, or file an issue on GitHub with a way to reproduce.
TorchServe v0.1.1 Release Notes (Experimental)
This is the release of TorchServe v0.1.1.
Highlights:
- HuggingFace BERT Example - Support for HuggingFace models, demonstrated with examples under the examples/ directory.
- WaveGlow Example - Support for the NVIDIA WaveGlow model, demonstrated with examples under the examples/ directory.
- Model Zoo - Model Zoo with model archives created from popular pre-trained models from PyTorch Model Zoo
- AWS CloudFormation Support - Support added for spinning up a TorchServe model server on an EC2 instance via an AWS CloudFormation template.
- Snakeviz Profiler - Support for profiling TorchServe Python execution via the snakeviz profiler for detailed execution-time reporting.
- Docker improvements - Docker image size optimization, detailed docs for running Docker.
- Regression Test Suite - Detailed regression test suite allowing comprehensive tests of all supported REST APIs. Automating these tests speeds up regression detection.
- Detailed Unit Test Reporting - Detailed breakdown of unit test reports from the Gradle build system.
- Installation Process Streamlining - Easier user onboarding with detailed documentation for installation
- Documentation Clean up - Refactored documentation with clear instructions
- GPU Device Assignment - The object detection model now correctly runs on multiple GPU devices
- Model Store Clean-up - The model store is now cleaned of all artifacts for a deleted model
Other PRs since v0.1.0
Bug Fixes:
- Fixed incorrect version number reporting #360
- Added validation for the correct port range (0-65535) #304
- Fixed Gradle build failures with the new Gradle version 6.4 #352
- Standardized the "Model version not found." response with HTTP status code 404 for all applicable APIs #282
- The `--model-store` option should point to a user-relative directory #248
- Corrected the query parameter name in the OpenAPI description for the registration API #328
- psutil install de-duplication #329
- Made Maven tests output only errors and not info/stack traces #326
- Fixed installation issues for Python VirtualEnv #341
Documentation
- Using GPU in Docker #205
Others
- Github Issue templates #273
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+
Getting Started with TorchServe
You can get started at pytorch.org/serve with installation instructions, tutorials, and docs.
If you have questions, please post them in the PyTorch discussion forums using the ‘deployment’ tag, or file an issue on GitHub with a way to reproduce.
TorchServe v0.1.0 Release Notes (Experimental)
This is the first release of TorchServe (Experimental), a new open-source model serving framework under the PyTorch project (RFC #27610).
Highlights
- Clean APIs - Support for an Inference API for predictions and a Management API for managing the model server.
- Secure Deployment - Includes HTTPS support for secure deployment.
- Robust model management capabilities - Allows full configuration of models, versions, and individual worker threads via the command line interface, config file, or run-time API.
- Model archival - Provides tooling to perform a ‘model archive’, a process of packaging a model, parameters, and supporting files into a single, persistent artifact. Using a simple command-line interface, you can package and export in a single ‘.mar’ file that contains everything you need for serving a PyTorch model. This ‘.mar’ file can be shared and reused. Learn more here.
- Built-in model handlers - Support for model handlers covering the most common use cases (image classification, object detection, text classification, image segmentation). TorchServe also supports custom handlers.
- Logging and Metrics - Support for robust logging and real-time metrics to monitor the inference service and endpoints, performance, resource utilization, and errors. You can also generate custom logs and define custom metrics.
- Model Management - Support for managing multiple models, or multiple versions of the same model, at the same time. You can use model versions to roll back to earlier versions or route traffic to different versions for A/B testing (see the management API sketch after this list).
- Prebuilt Images - Ready-to-go Dockerfiles and Docker images for deploying TorchServe on CPU and NVIDIA GPU based environments. The latest Dockerfiles and images can be found here.
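As a rough illustration of the Management API highlighted above, the sketch below lists registered models, scales workers, and describes a model; densenet161 is a hypothetical model name.

```python
import requests

MGMT = "http://localhost:8081"  # default management API address

# List all registered models.
print(requests.get(f"{MGMT}/models").json())

# Scale the hypothetical densenet161 model to at least two workers.
requests.put(f"{MGMT}/models/densenet161", params={"min_worker": 2})

# Describe the model to confirm worker status and configuration.
print(requests.get(f"{MGMT}/models/densenet161").json())
```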
Platform Support
- Ubuntu 16.04, Ubuntu 18.04, MacOS 10.14+
Known Issues
- The default object detection handler only works on cuda:0 device on GPU machines #104
- For torchtext based models, the sentencepiece dependency fails for MacOS with python 3.8 #232
Getting Started with TorchServe
- You can get started at pytorch.org/serve with installation instructions, tutorials, and docs.
- If you have questions, please post them in the PyTorch discussion forums using the ‘deployment’ tag, or file an issue on GitHub with a way to reproduce.