SatyamSaxena1/README.md

Hi, I'm Satyam Saxena

AI/ML Researcher • Open‑source Developer • AI Ethics & Bias Mitigation • Advocate for Centrally‑Localized (On‑Device/Self‑Hosted) LLMs • Public Speaker

I build practical, privacy‑respecting AI systems that run locally and make a real impact. I care deeply about accessibility, model efficiency, and responsible AI. My work spans speech recognition, LLM fine‑tuning (LoRA), generative imaging, fuzzy logic control, and data tooling—always with a bias toward clear UX and simple, reproducible installs.



Featured: Accessibility‑First Live Speech Captions

  • Project: Subtitles for Visual Impairment Assistance — local, lightweight live speech transcription
  • Why I built it: After being diagnosed with ear injuries and dealing with intermittent tinnitus, I relied on Chrome’s live captions. When I switched to Firefox and lost that feature, I built a locally hosted, open‑source alternative with a simple install.
  • Highlights:
    • Fully local; total footprint ~150 MB (deps + model)
    • Two modes:
      • Live: big, highlighted words in real time (TikTok/Reels‑style)
      • Comprehensive: continuous paragraph view for long talks where full context matters
    • Designed for accessibility, privacy, and low overhead
  • Repo: https://github.com/SatyamSaxena1/Subtitles-for-Visual-Impairment-Assistance-
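The two display modes described above can be sketched as two views over the same stream of recognized words. This is only an illustration, not the project's actual code: the `CaptionBuffer` class, its method names, and the `live_window` parameter are hypothetical, and uppercase stands in for visual highlighting.

```python
from collections import deque


class CaptionBuffer:
    """Hypothetical buffer holding recognized words for two display modes."""

    def __init__(self, live_window=6):
        # Live mode only needs the last few words; older ones fall off the deque.
        self.recent = deque(maxlen=live_window)
        # Comprehensive mode keeps everything so full context stays visible.
        self.transcript = []

    def add_words(self, words):
        """Feed newly recognized words into both views."""
        for word in words:
            self.recent.append(word)
            self.transcript.append(word)

    def render_live(self):
        """Return the recent words with the newest one emphasized (uppercased)."""
        if not self.recent:
            return ""
        *older, newest = self.recent
        return " ".join(older + [newest.upper()])

    def render_comprehensive(self):
        """Return the full running transcript as one paragraph."""
        return " ".join(self.transcript)
```

In a real captioner the words would come from the speech model in small chunks; here `add_words` can simply be called with any list of strings.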

Projects

  • Intelligent Joining Speed on Highways (Fuzzy Logic)

    • A fuzzy‑logic system that takes lane condition, oncoming traffic, lane density, vehicle distance, and similar inputs, and recommends an optimal merge speed almost instantly. It uses more than 100 rules with Gaussian, logistic, and singleton membership functions.
    • Repo: https://github.com/SatyamSaxena1/fuzzy-logic-highway-proj
  • Reddit Saved‑Posts → Mind‑Map/Notion Board
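For the fuzzy‑logic merge‑speed project above, a minimal sketch of how Gaussian and logistic input memberships can combine with singleton outputs is shown below. It assumes a zero‑order Sugeno‑style weighted average, which may differ from the inference method the repository uses. The four rules, the parameter values, and the function names are all hypothetical and chosen only for illustration; the real system is described as having more than 100 rules.

```python
import math


def gaussian(x, mean, sigma):
    """Gaussian membership: 1.0 at the mean, decaying with distance from it."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def logistic(x, midpoint, steepness):
    """Logistic membership: rises smoothly from 0 to 1 around the midpoint."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def recommend_merge_speed(gap_m, density):
    """Illustrative only. gap_m: gap to the nearest vehicle in metres;
    density: lane density scaled to the range 0..1."""
    # Fuzzify the inputs (all parameters below are made-up assumptions).
    gap_small = gaussian(gap_m, 10.0, 10.0)
    gap_large = logistic(gap_m, 40.0, 0.15)
    dense = logistic(density, 0.6, 10.0)
    sparse = 1.0 - dense

    # Each rule: (firing strength via min as fuzzy AND, singleton speed in km/h).
    rules = [
        (min(gap_large, sparse), 100.0),
        (min(gap_large, dense), 80.0),
        (min(gap_small, sparse), 70.0),
        (min(gap_small, dense), 50.0),
    ]

    # Defuzzify: weighted average of the singleton outputs.
    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return None  # no rule fired; the caller decides on a fallback
    return sum(weight * speed for weight, speed in rules) / total
```

With these assumed parameters, a large gap in light traffic yields a recommendation near the top of the 50 to 100 km/h range, and a small gap in dense traffic yields one near the bottom.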


Top Skills

  • AI/ML: Speech recognition and transcription, NLP/LLMs (prompting + LoRA fine‑tuning), fuzzy logic systems, recommender systems, classical ML
  • Generative AI: Stable Diffusion (SDXL) pipelines, Kohya_ss training, BLIP, safetensors
  • Optimization & Efficiency: LoRA, Adafactor, BF16 precision, small‑footprint local deployments
  • Data: CSV ETL pipelines, structured data modeling, data segmentation, caption augmentation
  • Tooling & Frameworks: PyTorch, Hugging Face Diffusers, OpenCV, Tableau
  • Languages: Python, C++, R
  • Platforms & Hardware: Anaconda, NVIDIA T4 / RTX 3080 Ti; Google Colab, AWS
  • Communication: Public speaking, clear UX thinking, documentation, teamwork/leadership

Tech Stack

Python • PyTorch • Hugging Face Diffusers • OpenCV • Stable Diffusion (SDXL) • R • C++ • Tableau • Anaconda • NVIDIA • Google Colab • AWS • Linux • Git


Education

  • MSc, Artificial Intelligence — Asia Pacific University of Technology and Innovation (APU/APIIT) & De Montfort University (Dual Degree), 2023–2024
    • Activities: Software Engineering Project Showcase, Open Source Contribution, Technology & Innovation Society, Programming Club
  • B.Tech, Computer Science & Engineering — G.D. Goenka University, 2022
    • Specialization in AI & Machine Learning

What I care about

  • Ethical AI and bias mitigation baked into the lifecycle
  • Privacy‑first, centrally‑localized/on‑device LLMs
  • Sustainable, efficient systems that are easy to install and use
  • Sharing knowledge through open source and public speaking

Get in touch

Email only: [email protected]

Pinned

  1. Reddit-scrape-to-zettelkasten-obsidian-workflow (Public)

    From Reddit to Knowledge Graph: a Zettelkasten System from Saved Posts

    Python 10

  2. Subtitles-for-Visual-Impairment-Assistance- (Public)

    Local live captioning using VB‑CABLE and Whisper, with an accessibility‑first UI and offline model support

    Python

  3. reddit-scraper-for-mind-map-project (Public)

    It is what it says. If it looks stuck, it isn't; it's just adhering to the limits set by Reddit.

    Python

  4. music-speech-isolation (Public)

    Speech/music/noise separation for karaoke and ASR improvement. Includes Demucs integration, two-file outputs, and CI smoke tests.

    Python

  5. fuzzy-logic-highway-proj (Public)

    Python 1

  6. Whimper (Public)

    🎤 Live audio transcription using OpenAI Whisper v3: real‑time speech‑to‑text with GPU acceleration and Voice Activity Detection

    Python