A simple web application that demonstrates running the Gemma 3 LLM directly in the browser using ONNX Runtime.
- Run Gemma 3 (1B) directly in your browser - no API keys needed
- Complete privacy - all processing happens locally
- Simple chat interface with markdown support
- Persists conversations between sessions
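
Persisting conversations between sessions could be done with the browser's Web Storage API. The following is a minimal sketch, not necessarily how this demo implements it: the `ChatMessage` shape, the `KeyValueStore` interface, and the `gemma-chat-history` key are all hypothetical names chosen for illustration.

```typescript
// Hypothetical message shape; the demo's actual type may differ.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Minimal subset of the Web Storage API. In a browser, window.localStorage
// satisfies this interface; tests can pass an in-memory stand-in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed storage key, not taken from the project.
const STORAGE_KEY = "gemma-chat-history";

// Serialize the whole conversation as JSON under a single key.
function saveConversation(store: KeyValueStore, messages: ChatMessage[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(messages));
}

// Read the conversation back, falling back to an empty history when
// nothing is stored or the stored value is not a JSON array.
function loadConversation(store: KeyValueStore): ChatMessage[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as ChatMessage[]) : [];
  } catch {
    return []; // corrupted data: start with a fresh conversation
  }
}
```

In a React component, `loadConversation(window.localStorage)` would typically run once on mount and `saveConversation` after each new message. Taking the store as a parameter keeps the functions usable during server-side rendering in Next.js, where `window` is not defined.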
- Install dependencies:

      npm install https://github.com/huggingface/transformers.js/archive/new-model.tar.gz
      cd node_modules/@huggingface/transformers
      npm install
      npm run build
      cd ../../..
      npm install

- Run the development server:

      npm run dev

- Open http://localhost:3000 in your browser
- Modern web browser with WebAssembly support
- Recommended: at least 4 GB of available RAM for model loading
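
The WebAssembly requirement can be checked at runtime before attempting to load the model. This is a small sketch under the assumption that the app wants a simple yes/no answer; `supportsWebAssembly` and `WasmLike` are hypothetical names, and the check only validates the smallest possible module rather than any specific WebAssembly feature.

```typescript
// The one part of the WebAssembly global this sketch relies on.
interface WasmLike {
  validate(bytes: Uint8Array): boolean;
}

// Returns true when the given scope (the global object by default) exposes
// a WebAssembly implementation that accepts a minimal valid module.
function supportsWebAssembly(
  scope: { WebAssembly?: WasmLike } = globalThis as unknown as { WebAssembly?: WasmLike }
): boolean {
  try {
    const wasm = scope.WebAssembly;
    if (typeof wasm !== "object" || wasm === null) return false;
    if (typeof wasm.validate !== "function") return false;
    // Smallest valid module: the "\0asm" magic number followed by version 1.
    const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    return wasm.validate(emptyModule);
  } catch {
    return false; // any failure is treated as "not supported"
  }
}
```

A page could call `supportsWebAssembly()` on load and show an explanatory message instead of the chat interface when it returns `false`. Available memory is harder to detect reliably across browsers, so the RAM recommendation is left as guidance.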
This demo uses Next.js and the Hugging Face Transformers.js library to load and run the quantized ONNX version of Gemma 3 directly in your browser through WebAssembly.
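
The loading and generation flow might look roughly like the sketch below. It is not the demo's actual code: the model ID `onnx-community/gemma-3-1b-it-ONNX`, the `q4` quantization setting, and the 256-token limit are assumptions, and `createChat`, `TextGenerator`, and `PipelineFactory` are hypothetical names. The pipeline function is passed in as a parameter so the sketch stays self-contained; its types are reduced to the call shapes used here, which are assumed to match how Transformers.js handles chat-style input to a text-generation pipeline.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Assumed shape of a chat-capable text-generation pipeline: it takes the
// message history and returns the history extended with the model's reply.
type TextGenerator = (
  messages: ChatMessage[],
  options: { max_new_tokens: number }
) => Promise<Array<{ generated_text: ChatMessage[] }>>;

// Assumed shape of the library's pipeline() factory, narrowed to this use.
type PipelineFactory = (
  task: string,
  model: string,
  options: { dtype: string }
) => Promise<TextGenerator>;

// Assumed model repository and quantization level; adjust to the real ones.
const MODEL_ID = "onnx-community/gemma-3-1b-it-ONNX";
const DTYPE = "q4";

// Loads the model once and returns a function that generates one reply
// for a given conversation history.
async function createChat(pipeline: PipelineFactory) {
  const generator = await pipeline("text-generation", MODEL_ID, { dtype: DTYPE });
  return async function reply(history: ChatMessage[]): Promise<string> {
    const output = await generator(history, { max_new_tokens: 256 });
    const turns = output[0].generated_text;
    // The last turn is expected to be the newly generated assistant message.
    return turns[turns.length - 1].content;
  };
}
```

In the app, the `pipeline` export of `@huggingface/transformers` would be passed to `createChat` (a type cast may be needed because the library's own types are broader). Model loading is the slow step, so the returned `reply` function should be created once and reused for every message.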
- Next.js
- React
- Tailwind CSS
- Hugging Face Transformers.js
- ONNX Runtime Web