Stars
Contrib repository for the OpenTelemetry Collector
Context-aware, pluggable, and customizable data protection and de-identification SDK for text and images
Context-aware OpenTelemetry Collector processor
The Triton backend for the ONNX Runtime.
Splits Keras models with TensorFlow backends into two or more submodels.
DSPy: The framework for programming—not prompting—language models
ONNXMLTools enables conversion of models to ONNX
onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime
Triton Model Analyzer is a CLI tool that helps users understand the compute and memory requirements of Triton Inference Server models.
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
Set of tools to assess and improve LLM security.
Master programming by recreating your favorite technologies from scratch.
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
The Triton TensorRT-LLM Backend
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
🐍 A Python lib for (de)serializing Python objects to/from JSON
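The round-trip that such a (de)serialization library automates can be sketched with only the standard library (`dataclasses` + `json`); the `Point` class and both helper functions here are hypothetical illustrations, not part of the library's API:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical example type used only to illustrate the round-trip.
@dataclass
class Point:
    x: int
    y: int

def point_to_json(p: Point) -> str:
    # Serialize: convert the dataclass to a plain dict, then to a JSON string.
    return json.dumps(asdict(p))

def point_from_json(s: str) -> Point:
    # Deserialize: decode the JSON string and unpack the dict into the constructor.
    return Point(**json.loads(s))

p = Point(x=1, y=2)
assert point_from_json(point_to_json(p)) == p
```

A dedicated library generalizes this pattern to nested objects, enums, datetimes, and type-annotation-driven loading, which the stdlib-only version above handles only for flat dataclasses.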
Large Language Model Text Generation Inference
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Transformer related optimization, including BERT, GPT
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Follow along with our tutorials to get you up and running with Snowflake.