Sam Stoelinga samos123

🎯

Focusing

Creator of KubeAI, a K8s operator to serve LLMs in production.

128 followers · 89 following

San Francisco Bay Area
https://www.kubeai.org/
in/samstoelinga

Achievements

x3 x3 x2

Pinned

substratusai/kubeai Public

AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.

Go · 860 stars · 79 forks
websu-io/websu Public

Website Speed and Performance Optimization and monitoring

Go · 108 stars · 10 forks
docker-veth Public

show which veth interface is associated with each container

Shell · 21 stars · 4 forks
docker-drupal Public

Drupal image based on the official PHP image. Database information should be passed via environment variables or a linked container. Drush is included.

PHP · 34 stars · 10 forks
gke-gcs-fuse-unprivileged Public

29 stars · 3 forks
chatgpt-blog Public

Prototype using ChatGPT to answer Golang Stack Overflow questions

HTML · 26 stars · 3 forks

874 contributions in the last year

March 2025

Created 33 commits in 4 repositories

Created 3 repositories

samos123/axlearn-aid Shell
This contribution was made on Mar 29
samos123/Awesome-LLMOps Shell
This contribution was made on Mar 25
samos123/HighPerfLLMs2024 Python
This contribution was made on Mar 7

Opened 17 pull requests in 3 repositories

substratusai/kubeai 3 open 10 merged 2 closed

wip: add kafka doc
This contribution was made on Mar 28
update vLLM GH200 image to 0.8.2
This contribution was made on Mar 28
bump helm chart versions
This contribution was made on Mar 26
add gemma 3 12b and 24b ollama on l4
This contribution was made on Mar 26
update vllm to 0.8.2
This contribution was made on Mar 26
wip update vLLM GH200 image to 0.8.1
This contribution was made on Mar 26
Add model mistral 3.1 small on 1x H100
This contribution was made on Mar 21
update vLLM image for GPU to 0.8.1
This contribution was made on Mar 20
bump helm chart versions
This contribution was made on Mar 16
Infinity engine e2e test
This contribution was made on Mar 13
simplify rollouts and fix old pod not getting deleted
This contribution was made on Mar 8
Support changing URL when cacheProfile is used
This contribution was made on Mar 5
add goatcounter analytics to kubeai.org website
This contribution was made on Mar 4
bump helm chart versions
This contribution was made on Mar 4
Url mutable with cache profile
This contribution was made on Mar 1

substratusai/vllm-docker 1 merged

Add image for Gh200
This contribution was made on Mar 28

tensorchord/Awesome-LLMOps 1 merged

Add KubeAI
This contribution was made on Mar 25

Reviewed 17 pull requests in 3 repositories

substratusai/kubeai 14 pull requests

fix: use correct accessKeyID values in if statement
This contribution was made on Mar 28
fix: support accessing models from private S3 buckets
This contribution was made on Mar 26
Add AKS installation guide
This contribution was made on Mar 25
fix: Include minReplicas even when value is 0
This contribution was made on Mar 25
WIP Proposal: Multi-spec Models
This contribution was made on Mar 24
Strongly typed OpenAI payloads
This contribution was made on Mar 18
E2e test updates
This contribution was made on Mar 18
Ollama load model from PVC
This contribution was made on Mar 13
Add sticky session section to paper
This contribution was made on Mar 5
add goatcounter analytics to kubeai.org website
This contribution was made on Mar 4
Add .spec.files to Model
This contribution was made on Mar 4
Make url mutable when cacheProfile is not used
This contribution was made on Mar 2
Update arch diagram
This contribution was made on Mar 2
Update to readme
This contribution was made on Mar 1

apple/axlearn 2 pull requests

use "true" and "false" instead of 0 and 1
This contribution was made on Mar 29
Add GKE A3 Ultra support
This contribution was made on Mar 7

GoogleCloudPlatform/ml-auto-solutions 1 pull request

Added project_bite_tpu_unittests and associated config to test axlear…
This contribution was made on Mar 5

Created an issue in substratusai/kubeai that received 2 comments

Mar 16

CPU and GPU Hybrid Mode: Hot Ollama with CPU and cold vLLM for GPU

Use case: GPUs are expensive and aren't needed 24/7; however, TTFT needs to stay low for cold starts. Example for Llama 3.3 70B: Ollama with CPU a…

2 comments

Opened 3 other issues in 1 repository

substratusai/kubeai 3 open

vLLM: Disaggregated Serving support
This contribution was made on Mar 21
LoRA Adapters from S3 Compatible Storage Server
This contribution was made on Mar 8
Add doc that explains how rollouts are handled
This contribution was made on Mar 2