Noureddine RAMDI / Voice Satellite: local wake word detection in the browser for Home Assistant voice assistants

Created Mon, 04 May 2026 10:23:02 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

jxlarrea/voice-satellite-card-integration

Voice Satellite takes a different approach to voice assistants by running wake word detection entirely on the client side, inside a web browser. No server is involved in detecting when you start talking: only after the wake word fires does audio stream to Home Assistant’s backend for speech-to-text, conversation, and text-to-speech processing. The design favors privacy and responsiveness by removing server-side wake word processing.

what Voice Satellite does and how it integrates with Home Assistant

Voice Satellite is a custom component for Home Assistant that transforms any modern web browser into a full “assist_satellite” device. The key innovation is the use of microWakeWord — a TensorFlow Lite model running entirely in JavaScript — to perform on-device wake word detection. This lets the browser listen locally for a wake word without streaming audio continuously to the backend.

Under the hood, the architecture is tightly integrated with Home Assistant’s Assist Pipeline, which handles the speech-to-text (STT), conversation agent, and text-to-speech (TTS) tasks. Voice Satellite registers as a proper Home Assistant media_player entity, allowing it to fit naturally into the Home Assistant ecosystem.

The integration supports dual wake words, each routed to separate Assist Pipelines, enabling flexible voice routing scenarios. It exposes automation actions such as announce, start_conversation, ask_question, and wake, which can be triggered in Home Assistant automations.
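
As a rough illustration of triggering one of these actions from outside an automation, the sketch below builds (but does not send) a request to Home Assistant’s REST service endpoint. The `assist_satellite` domain, the `message` field, the entity ID `assist_satellite.kitchen_tablet`, the host, and the token are assumptions for illustration; the integration’s own documentation is the place to confirm exact action names and fields.

```python
import json
import urllib.request


def build_announce_request(
    base_url: str, token: str, entity_id: str, message: str
) -> urllib.request.Request:
    """Build a POST request for Home Assistant's REST service endpoint.

    Assumption: the announce action is exposed as `assist_satellite.announce`.
    """
    payload = {"entity_id": entity_id, "message": message}
    return urllib.request.Request(
        url=f"{base_url}/api/services/assist_satellite/announce",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values; nothing is sent until urllib.request.urlopen(req) is called.
req = build_announce_request(
    "https://homeassistant.local:8123",
    "YOUR_LONG_LIVED_ACCESS_TOKEN",
    "assist_satellite.kitchen_tablet",
    "Dinner is ready",
)
```

Building the request separately from sending it keeps the payload easy to inspect before it touches a live instance.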

The voice pipeline is fully local until the wake word fires:

  • The client browser listens continuously for the configured wake word(s) using microWakeWord.
  • When the wake word triggers, audio streaming starts and sends data to the STT engine configured in Home Assistant.
  • The recognized text is passed to the conversation agent.
  • The response is played back via TTS locally in the browser.
  • Multi-turn conversations are supported with follow-ups handled seamlessly.
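
The flow above can be pictured as a small state machine. This is only a sketch of the sequence described in the list, not the project’s code: the stage and event names are hypothetical, and the assumption that a follow-up turn skips the wake word is mine.

```python
from enum import Enum, auto


class Stage(Enum):
    LISTENING = auto()      # local wake word detection in the browser
    STREAMING_STT = auto()  # audio streamed to Home Assistant for speech-to-text
    CONVERSATION = auto()   # recognized text handled by the conversation agent
    SPEAKING = auto()       # TTS response played back in the browser


def next_stage(stage: Stage, event: str) -> Stage:
    """Return the next stage for an event; unknown events leave the stage unchanged."""
    transitions = {
        (Stage.LISTENING, "wake_word"): Stage.STREAMING_STT,
        (Stage.STREAMING_STT, "text_recognized"): Stage.CONVERSATION,
        (Stage.CONVERSATION, "response_ready"): Stage.SPEAKING,
        # Assumption: a follow-up turn goes straight back to capturing speech.
        (Stage.SPEAKING, "follow_up"): Stage.STREAMING_STT,
        (Stage.SPEAKING, "done"): Stage.LISTENING,
    }
    return transitions.get((stage, event), stage)


# Walk through one single-turn interaction.
stage = Stage.LISTENING
for event in ["wake_word", "text_recognized", "response_ready", "done"]:
    stage = next_stage(stage, event)
print(stage.name)
```

Only the first transition happens without any backend involvement; every later stage depends on the Assist Pipeline configured in Home Assistant.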

On top of this core voice functionality, Voice Satellite includes 8 skinnable user interfaces, a screensaver feature with camera or folder image support, voice timers, and experimental large language model (LLM) tools for web, image, and weather search.

The stack is JavaScript-heavy on the frontend running in the browser, with Home Assistant managing the backend pipelines. It requires Home Assistant 2025.2.1 or later and a configured Assist Pipeline with STT, conversation, and TTS components.

technical strengths and architectural tradeoffs

The standout technical feature here is the on-device wake word detection using microWakeWord’s TFLite models in pure JavaScript. Running wake word detection entirely in the browser is unusual. Most voice assistants rely on server-side wake word detection or native apps with dedicated libraries. This approach offers a strong privacy advantage — no audio leaves the device until you say the wake word.
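
To make the idea of local detection concrete, here is a minimal sketch of one common way to turn per-frame model scores into a single trigger: average the most recent probabilities and fire when the mean crosses a threshold. This is not taken from microWakeWord or Voice Satellite; the class name, window size, and threshold are illustrative assumptions.

```python
from collections import deque


class WakeWordGate:
    """Fire when the mean of the last `window` probabilities reaches `threshold`."""

    def __init__(self, window: int = 5, threshold: float = 0.9) -> None:
        self.scores: deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def update(self, probability: float) -> bool:
        """Feed one model output; return True when the wake word should fire."""
        self.scores.append(probability)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough frames collected yet
        if sum(self.scores) / len(self.scores) >= self.threshold:
            self.scores.clear()  # reset so one utterance fires only once
            return True
        return False


gate = WakeWordGate(window=3, threshold=0.8)
fired = [gate.update(p) for p in [0.1, 0.95, 0.95, 0.95, 0.2]]
```

Averaging over a window trades a little latency for fewer false triggers from a single noisy frame, which is the kind of accuracy and latency balance discussed in this section.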

From an engineering perspective, running TensorFlow Lite models in JS in a browser environment is challenging due to resource constraints and timing requirements. The project cleverly auto-discovers custom .tflite wake word models, making it extensible.
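
Auto-discovery of this kind usually comes down to scanning a known location for model files. The sketch below shows the general idea only; the function name, the directory layout, and the model file names are hypothetical and not taken from the project.

```python
import tempfile
from pathlib import Path


def discover_models(models_dir: Path) -> dict[str, Path]:
    """Map each model name (file stem) to its .tflite path, sorted by name."""
    return {path.stem: path for path in sorted(models_dir.glob("*.tflite"))}


# Demonstrate with a throwaway directory containing two models and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("hey_home.tflite", "okay_house.tflite", "notes.txt"):
        (Path(tmp) / name).touch()
    found = discover_models(Path(tmp))
    print(sorted(found))
```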

The integration with Home Assistant’s Assist Pipeline is opinionated but pragmatic. It delegates heavy lifting like STT, conversation, and TTS to Home Assistant, focusing the browser component on wake word detection, audio capture, and playback. The media_player entity pattern fits Home Assistant conventions well, so the device behaves the way Home Assistant users already expect.

Supporting dual wake words routed to separate pipelines is a nice touch, enabling scenarios like distinguishing between different users or commands.
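
Conceptually, this routing is a lookup from the detected wake word to a pipeline. The sketch below is an illustration under assumptions: the wake word keys, the pipeline IDs, and the fallback behavior are hypothetical, and the integration’s actual configuration format may look quite different.

```python
# Hypothetical mapping: one wake word per Assist Pipeline.
WAKE_WORD_PIPELINES = {
    "okay_nabu": "pipeline_local",
    "hey_jarvis": "pipeline_llm",
}


def pipeline_for(wake_word: str, default: str = "pipeline_local") -> str:
    """Return the pipeline ID for a detected wake word, or the default if unmapped."""
    return WAKE_WORD_PIPELINES.get(wake_word, default)


print(pipeline_for("hey_jarvis"))
```

One practical use would be sending everyday device commands to a fast local pipeline while reserving a second wake word for a slower, LLM-backed conversation agent.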

The inclusion of multiple skinnable UIs and a screensaver indicates attention to user experience and deployment environments, such as wall tablets or kiosks.

The tradeoff here is that the wake word detection is limited by what TensorFlow Lite models can achieve in JS on typical browsers. While impressive, it may not match the accuracy or latency of native or server-based solutions. Also, the need for Home Assistant 2025.2.1+ and a configured Assist Pipeline means this is not plug-and-play for casual users.

installation and setup commands

## Prerequisites

- **Home Assistant 2025.2.1** or later
- An Assist Pipeline with:
  - Speech-to-Text (Whisper, OpenAI, etc.)
  - Conversation agent (Home Assistant, OpenAI, Qwen, etc.)
  - Text-to-Speech (Piper, Kokoro, etc.)
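
For scripted deployments, the version requirement can be checked with a simple tuple comparison. This is a sketch that assumes purely numeric version strings such as `2025.2.1`; beta suffixes like `b1` are not handled, and the function names are illustrative.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2025.2.1' into (2025, 2, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


MIN_VERSION = parse_version("2025.2.1")


def meets_minimum(installed: str) -> bool:
    """Return True if the installed version is at least the required minimum."""
    return parse_version(installed) >= MIN_VERSION


print(meets_minimum("2025.10.0"))
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank `2025.10.0` below `2025.2.1`.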

Voice Satellite requires microphone access, so make sure that:

1. **The browser has microphone permissions granted** - you will be prompted on first use.
2. **The page is served over HTTPS** - required for microphone access in modern browsers.
3. **The screen stays on** - if the device screen turns off completely, the microphone will stop working. Use a screensaver instead of screen-off to keep the mic active.

For kiosk setups like Fully Kiosk Browser, make sure to enable microphone permissions and use the screensaver feature (not screen off) to keep the microphone active while dimming the display.

For the **Home Assistant Companion App**, enable **Autoplay videos** in Settings -> Companion App -> Other settings. Without this, the WebView will block TTS audio playback.

## Installation

### HACS (Recommended)

Voice Satellite is available in HACS. Search for `Voice Satellite` in the HACS default repository.

### Manual

1. Download the latest release ZIP file
2. Copy the `custom_components/voice_satellite` folder to your `config/custom_components/` directory
3. Restart Home Assistant

## Setup

1. Go to **Settings -> Devices & Services -> Add Integration**
2. Search for **Voice Satellite**
3. Enter a name for the device (e.g., "Kitchen Tablet")
4. Repeat for each browser/tablet that will act as a satellite
5. On each browser/tablet, open the **Voice Satellite** sidebar panel
6. Select the satellite entity you created for this device
7. Configure wake word, audio, and appearance settings as needed
8. The engine starts automatically once an entity is assigned - if the browser blocks auto-start due to a missing user gesture, a floating microphone button will appear; tap it

verdict: who should consider Voice Satellite

Voice Satellite is a solid choice if you’re invested in Home Assistant and want a privacy-first voice assistant setup that avoids server-side wake word detection. Its on-device wake word detection in the browser is a clever engineering solution that removes the server round trip for wake word detection and limits data exposure.

That said, the reliance on Home Assistant 2025.2.1+ and a configured Assist Pipeline means it’s best suited for users comfortable with Home Assistant’s ecosystem and voice assistant pipelines. Casual users or those seeking a fully standalone voice assistant may find the setup and dependencies cumbersome.

The accuracy and latency of the in-browser microWakeWord models may not match native or cloud solutions. The tradeoff is clear, though: better privacy and control at the possible cost of some precision.

Overall, if you want tight Home Assistant integration and a browser-based voice satellite device with local wake word detection, this repo is worth exploring. The code quality appears solid, and the multiple UI options and automation actions show a mature project focused on real-world deployment.


→ GitHub Repo: jxlarrea/voice-satellite-card-integration ⭐ 311 · JavaScript