Languages: Kotlin · Java · C++ (JNI/NDK) · Python · Rust (exploring)
Android: Jetpack Compose · AOSP · NDK · AIDL · Accessibility Services · Android KeyStore
AI/ML:
- On-device inference: llama.cpp · ONNX Runtime · TensorFlow Lite · Whisper STT · Sherpa-ONNX
- Model formats: GGUF · ONNX · LoRA adapters · OTA weight delivery
- Techniques: GBNF grammar-constrained decoding · Multi-modal · Tool Calling · LoRA fine-tuning · Adapter switching · Model merging (TIES, DARE, SLERP)
- Training: SFT with LoRA · On-device PEFT (experimental) · Federated learning (exploring)
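Of the model-merging methods above, SLERP is the easiest to show in a few lines: it interpolates two weight vectors along the great circle between them rather than linearly. A minimal sketch, using small arrays as stand-ins for full checkpoint tensors (a real merge applies this per-tensor across two models):

```java
import java.util.Arrays;

// SLERP (spherical linear interpolation) between two weight vectors.
// t = 0 returns a, t = 1 returns b; in between, the result keeps the
// interpolation on the unit sphere instead of cutting through it.
public class SlerpMerge {
    static double[] slerp(double[] a, double[] b, double t) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        double cos = dot / (Math.sqrt(na) * Math.sqrt(nb));
        cos = Math.max(-1.0, Math.min(1.0, cos));   // clamp for numeric safety
        double omega = Math.acos(cos);              // angle between the vectors

        double[] out = new double[a.length];
        if (omega < 1e-8) {                         // near-parallel: fall back to lerp
            for (int i = 0; i < a.length; i++) out[i] = (1 - t) * a[i] + t * b[i];
            return out;
        }
        double s = Math.sin(omega);
        double wa = Math.sin((1 - t) * omega) / s;
        double wb = Math.sin(t * omega) / s;
        for (int i = 0; i < a.length; i++) out[i] = wa * a[i] + wb * b[i];
        return out;
    }

    public static void main(String[] args) {
        double[] w1 = {1.0, 0.0}, w2 = {0.0, 1.0};
        System.out.println(Arrays.toString(slerp(w1, w2, 0.5)));
    }
}
```

TIES and DARE add trimming/sign-resolution and random drop-and-rescale on top of this kind of per-tensor arithmetic; the interpolation core is the same idea.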
GPU / NPU Systems:
- Vulkan compute pipelines · ggml-vulkan submission model · timeline semaphores · GPU watchdog behavior
- UMA (Unified Memory Architecture) on mobile SoCs · Heterogeneous compute (CPU + GPU + NPU scheduling)
- GGML kernel paths: MMLA · I8MM · instruction-level runtime specialization
- Adreno GPU scheduling · low-priority compute queues · VK_ERROR_DEVICE_LOST debugging
- Workload shape analysis: CLIP vs UNet dispatch behavior · conditional ramp tuning for mobile
- Smart ML-op offloading across CPU / GPU / NPU
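The offloading idea in the last bullet boils down to a per-op cost model: each backend has a launch overhead and a throughput, and the scheduler places each op where its estimated cost is lowest. A toy sketch with entirely illustrative numbers (real schedulers measure these per device):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy cost-model dispatch for ML-op offloading across CPU / GPU / NPU.
// Cost = fixed launch overhead + flops / throughput; the constants below
// are made up for illustration, not measured on any SoC.
public class OpScheduler {
    enum Backend { CPU, GPU, NPU }

    // Estimated microseconds to run an op of `flops` FLOPs on a backend.
    static double costUs(Backend b, double flops) {
        switch (b) {
            case CPU: return 5   + flops / 20e3;   // negligible overhead, low throughput
            case GPU: return 150 + flops / 400e3;  // dispatch overhead, high throughput
            default:  return 300 + flops / 900e3;  // NPU: highest overhead and throughput
        }
    }

    // Pick the cheapest backend for this op.
    static Backend place(double flops) {
        return Arrays.stream(Backend.values())
                .min(Comparator.comparingDouble(b -> costUs(b, flops)))
                .get();
    }

    public static void main(String[] args) {
        for (double flops : List.of(1e4, 1e8, 1e10))
            System.out.println((long) flops + " FLOPs -> " + place(flops));
    }
}
```

With these constants, tiny ops stay on the CPU (launch overhead dominates), mid-sized ops go to the GPU, and very large ops justify the NPU's setup cost, which is the qualitative behavior the CLIP-vs-UNet dispatch analysis above is about.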
Security & Cryptography: AES-256 · RSA · Hardware-backed Android KeyStore · Secure IPC · Encrypted pipelines
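A minimal AES-256-GCM round trip showing the encryption side of the stack above. On Android the SecretKey would instead be generated inside the hardware-backed KeyStore (via the "AndroidKeyStore" provider) so raw key material never leaves secure hardware; this plain-JVM sketch uses an in-memory key purely for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// AES-256-GCM encrypt/decrypt round trip with the standard JCA API.
public class AesGcmDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);                                  // AES-256
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[12];                      // 96-bit nonce, standard for GCM
        new SecureRandom().nextBytes(iv);

        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal("on-device secret".getBytes(StandardCharsets.UTF_8));

        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        System.out.println(new String(c.doFinal(ct), StandardCharsets.UTF_8));
    }
}
```

GCM authenticates as well as encrypts, so a tampered ciphertext fails at doFinal rather than decrypting to garbage, which is what makes it a sane default for encrypted IPC payloads.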
Architecture: Plugin SDK design · Modular runtime systems · MVVM · Clean Architecture · Dependency Injection
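The plugin-SDK idea above reduces to a small contract: plugins implement one interface, register with a runtime, and the runtime dispatches by capability id. A hypothetical sketch (all names here — InferencePlugin, PluginRuntime, "echo" — are illustrative, not a real SDK surface):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a plugin runtime: plugins self-describe via id() and the
// host dispatches work to whichever plugin claims that capability.
public class PluginRuntime {
    interface InferencePlugin {
        String id();                    // capability this plugin provides
        String run(String input);       // synchronous for the sketch
    }

    private final Map<String, InferencePlugin> plugins = new LinkedHashMap<>();

    void register(InferencePlugin p) { plugins.put(p.id(), p); }

    String dispatch(String id, String input) {
        InferencePlugin p = plugins.get(id);
        if (p == null) throw new IllegalArgumentException("no plugin for: " + id);
        return p.run(input);
    }

    public static void main(String[] args) {
        PluginRuntime rt = new PluginRuntime();
        rt.register(new InferencePlugin() {
            public String id() { return "echo"; }
            public String run(String input) { return "echo:" + input; }
        });
        System.out.println(rt.dispatch("echo", "hi"));   // prints "echo:hi"
    }
}
```

Keeping the host ignorant of concrete plugin classes is the same inversion that MVVM and DI rely on; in a real SDK the interface would be versioned and the registry populated by discovery rather than manual register() calls.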
DevOps: GitHub Actions (CI/CD) · Gradle KTS · JUnit · Espresso · Crashlytics · Linux
Privacy-first, offline AI systems that run entirely on-device — no cloud, no compromise.
Not just apps. Ecosystems: plugin runtimes, inference cores, OTA adapter delivery, agent workflows, and the low-level GPU plumbing that makes it all actually run on mobile hardware.
📧 siddheshsonar2377@gmail.com 💼 LinkedIn
Open to: Startup roles · On-device AI research · SDK/tooling engineering · Cofounder conversations
Building AI that doesn't need the cloud to be intelligent.