TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 10,001 1,265 Updated Mar 28, 2025

ramonhagenaars / jsons

🐍 A Python lib for (de)serializing Python objects to/from JSON

Python 292 40 Updated Dec 29, 2023

huggingface / text-generation-inference

Large Language Model Text Generation Inference

Python 9,938 1,175 Updated Mar 28, 2025

daquexian / onnx-simplifier

Simplify your onnx model

C++ 4,021 394 Updated Sep 3, 2024

ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

Python 1,681 151 Updated Oct 23, 2024

NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT

C++ 6,096 900 Updated Mar 27, 2024

meta-llama / llama

Inference code for Llama models

Python 57,953 9,722 Updated Jan 26, 2025

meta-llama / codellama

Inference code for CodeLlama models

Python 16,253 1,901 Updated Aug 12, 2024

karpathy / llama2.c

Inference Llama 2 in one file of pure C

C 18,236 2,223 Updated Aug 6, 2024

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,928 1,802 Updated Mar 27, 2025

Snowflake-Labs / sfquickstarts

Follow along with our tutorials to get you up and running with Snowflake.

Jupyter Notebook 384 731 Updated Mar 27, 2025

microsoft / DigiFace1M

289 25 Updated Nov 28, 2024

visenger / awesome-mlops

A curated list of references for MLOps

13,002 1,932 Updated Nov 21, 2024

Hung Nguyen jalola

Stars

ByungKwanLee / DeepSick-R1

open-telemetry / opentelemetry-collector-contrib

microsoft / presidio

springernature / o11y-otel-contextprocessor

triton-inference-server / onnxruntime_backend

nrasadi / split-keras-tensorflow-model

stanfordnlp / dspy

karpathy / llm.c

onnx / onnxmltools

microsoft / onnxruntime-extensions

triton-inference-server / model_analyzer

premAI-io / benchmarks

meta-llama / PurpleLlama

codecrafters-io / build-your-own-x

Azure / MS-AMP

microsoft / onnxruntime

triton-inference-server / tensorrtllm_backend

NVIDIA / TensorRT-LLM