Reproduction of DeepSeek-R1

Python 167 17 Updated Mar 24, 2025

Contrib repository for the OpenTelemetry Collector

Go 3,448 2,654 Updated Mar 27, 2025

Context aware, pluggable and customizable data protection and de-identification SDK for text and images

Python 4,318 622 Updated Mar 26, 2025

Context OpenTelemetry Collector processor

Go 3 5 Updated Mar 20, 2025

The Triton backend for the ONNX Runtime.

C++ 140 62 Updated Mar 14, 2025

Splits Keras models with TensorFlow backends into two or more submodels.

Jupyter Notebook 18 3 Updated Feb 20, 2023

DSPy: The framework for programming—not prompting—language models

Python 22,668 1,737 Updated Mar 28, 2025

LLM training in simple, raw C/CUDA

Cuda 26,162 3,005 Updated Oct 2, 2024

ONNXMLTools enables conversion of models to ONNX

Python 1,062 192 Updated Jan 8, 2025

onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime

C++ 370 95 Updated Mar 26, 2025

Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.

Python 465 78 Updated Mar 12, 2025

🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.

Shell 135 4 Updated Jul 25, 2024

Set of tools to assess and improve LLM security.

Python 2,983 501 Updated Feb 14, 2025

Master programming by recreating your favorite technologies from scratch.

Markdown 364,518 33,830 Updated Sep 3, 2024

Microsoft Automatic Mixed Precision Library

Python 586 49 Updated Sep 29, 2024

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 16,114 3,120 Updated Mar 28, 2025

The Triton TensorRT-LLM Backend

Python 812 118 Updated Mar 26, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 10,001 1,265 Updated Mar 28, 2025

🐍 A Python lib for (de)serializing Python objects to/from JSON

Python 292 40 Updated Dec 29, 2023

Large Language Model Text Generation Inference

Python 9,938 1,175 Updated Mar 28, 2025

Simplify your onnx model

C++ 4,021 394 Updated Sep 3, 2024

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

Python 1,681 151 Updated Oct 23, 2024

Transformer related optimization, including BERT, GPT

C++ 6,096 900 Updated Mar 27, 2024

Inference code for Llama models

Python 57,953 9,722 Updated Jan 26, 2025

Inference code for CodeLlama models

Python 16,253 1,901 Updated Aug 12, 2024

Inference Llama 2 in one file of pure C

C 18,236 2,223 Updated Aug 6, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,928 1,802 Updated Mar 27, 2025

Follow along with our tutorials to get you up and running with Snowflake.

Jupyter Notebook 384 731 Updated Mar 27, 2025

A curated list of references for MLOps

13,002 1,932 Updated Nov 21, 2024