OpenGuider is an AI assistant that runs as a desktop app, designed to watch your screen, listen to your voice, and guide you through tasks with step-by-step instructions enhanced by coordinate-based pointer hints. Unlike many AI helpers locked into a single provider or feature set, OpenGuider embraces modularity and local control, making it a compelling tool for those wanting a hands-free, customizable AI workflow on their machines.
What OpenGuider does and its architecture
OpenGuider is built with Electron, a popular framework for cross-platform desktop apps using web technologies. It integrates multiple large language model (LLM) providers such as Claude, OpenAI, Gemini, Groq, OpenRouter, and Ollama. This multi-provider support allows users to balance cost, latency, and quality according to their preferences or project needs.
The core functionality revolves around producing step-by-step guidance for tasks on the user’s screen. It does so by watching screen content, listening to voice commands, and providing visual hints anchored to screen coordinates. This coordinate-based approach is crucial because it allows precise UI navigation guidance rather than generic instructions.
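To make the coordinate idea concrete, here is a minimal sketch of how a pointer hint might be mapped from a downscaled screenshot back onto the real display. The function name `toDisplayPoint` and the field names are assumptions made for illustration; the article does not describe OpenGuider's actual hint schema.

```javascript
// Hypothetical helper: map a hint given in screenshot pixels to display coordinates.
// The shapes used here ({x, y}, {width, height}) are assumptions, not OpenGuider's schema.
function toDisplayPoint(hint, screenshot, display) {
  const scaleX = display.width / screenshot.width;
  const scaleY = display.height / screenshot.height;

  // Clamp so a slightly out-of-range model answer still lands on the display.
  const clamp = (value, min, max) => Math.min(Math.max(value, min), max);
  const x = clamp(hint.x, 0, screenshot.width);
  const y = clamp(hint.y, 0, screenshot.height);

  // display.x / display.y is the display's origin, which matters on multi-monitor setups.
  return {
    x: Math.round(display.x + x * scaleX),
    y: Math.round(display.y + y * scaleY),
  };
}

const point = toDisplayPoint(
  { x: 640, y: 360 },
  { width: 1280, height: 720 },
  { x: 0, y: 0, width: 2560, height: 1440 }
);
console.log(point); // { x: 1280, y: 720 }
```

Scaling and clamping in one place keeps the overlay code simple: whatever resolution the model saw, the pointer is drawn in display coordinates.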
A key architectural choice is the plugin system. Instead of hardcoding automation features, OpenGuider treats browser automation as just one plugin among potentially many specialized workspaces. The first live plugin, called “browser-use,” can navigate websites, fill forms, and pause for user approval on risky actions. This plugin operates with user safety in mind, requiring explicit approval before executing sensitive steps.
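The approval behavior described above can be sketched as a small gate around each browser action. This is an illustration under stated assumptions, not the plugin's real code: the action kinds in `RISKY_KINDS`, the `runWithApproval` function, and the injected `execute` and `askUser` callbacks are all hypothetical.

```javascript
// Hypothetical action kinds treated as risky; the real plugin's criteria are not documented here.
const RISKY_KINDS = new Set(["submit_form", "purchase", "sign_in"]);

// Run actions in order, pausing for user approval before risky ones.
// `execute` and `askUser` are injected so the sketch does not depend on any browser runtime.
async function runWithApproval(actions, { execute, askUser, autopilot = false }) {
  const log = [];
  for (const action of actions) {
    const needsApproval = !autopilot && RISKY_KINDS.has(action.kind);
    if (needsApproval && !(await askUser(action))) {
      // The user declined: record the skip and move on without executing.
      log.push({ action, status: "skipped" });
      continue;
    }
    await execute(action);
    log.push({ action, status: "done" });
  }
  return log;
}
```

In this sketch, passing `autopilot: true` bypasses the prompt entirely, which mirrors the choice between approval and autopilot modes offered in the plugin settings.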
Privacy and security are central to OpenGuider’s design. It is local-first: user settings and session history are stored locally on disk. API keys for AI providers are stored with keytar, which writes to the operating system's credential store (Keychain on macOS, Credential Vault on Windows, libsecret on Linux), or with an encrypted fallback. The app sends data only to the AI providers you explicitly configure, which limits unintended data exposure.
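As a rough sketch of what "keytar or an encrypted fallback" could look like, the example below encrypts a key with AES-256-GCM from Node's built-in crypto module when no keychain is available. Everything named here (`encryptSecret`, `decryptSecret`, `createKeyStore`, the service name, the in-memory `Map` standing in for a file on disk) is an assumption for illustration; the article does not say which cipher or storage format OpenGuider actually uses.

```javascript
const nodeCrypto = require("node:crypto");

// Encrypt a secret with AES-256-GCM. `key` must be a 32-byte Buffer.
function encryptSecret(plaintext, key) {
  const iv = nodeCrypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store nonce, auth tag, and ciphertext together as one dot-separated string.
  return [iv, tag, data].map((part) => part.toString("base64")).join(".");
}

// Decrypt a payload produced by encryptSecret; throws if the data was tampered with.
function decryptSecret(payload, key) {
  const [iv, tag, data] = payload.split(".").map((part) => Buffer.from(part, "base64"));
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

// `keychain` is any object with async getPassword/setPassword (keytar exposes
// functions with these names); pass null to force the encrypted fallback path.
function createKeyStore(keychain, fallbackKey, file = new Map()) {
  const SERVICE = "openguider-example"; // hypothetical service name
  return {
    async set(provider, apiKey) {
      if (keychain) return keychain.setPassword(SERVICE, provider, apiKey);
      file.set(provider, encryptSecret(apiKey, fallbackKey));
    },
    async get(provider) {
      if (keychain) return keychain.getPassword(SERVICE, provider);
      const payload = file.get(provider);
      return payload ? decryptSecret(payload, fallbackKey) : null;
    },
  };
}
```

A 32-byte key for the fallback could be derived with `nodeCrypto.scryptSync(passphrase, salt, 32)`. Where that passphrase comes from is the hard part of any fallback scheme and is deliberately left out of this sketch.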
Technical strengths and design tradeoffs
What stands out technically is the modular plugin architecture that decouples core AI guidance from specific automation implementations. This design makes OpenGuider flexible and extensible, allowing new specialized plugins to be added without changing the core app. Browser automation as a plugin rather than a baked-in feature reduces complexity and lets users opt into capabilities they need.
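One way to picture this decoupling is a small registry that the core consults without knowing anything about individual plugins. The sketch below is hypothetical: the `PluginRegistry` class and the plugin contract (`id`, `canHandle`, `run`) are invented for illustration and are not taken from OpenGuider's source.

```javascript
// Hypothetical plugin contract: { id, canHandle(task), run(task) }.
class PluginRegistry {
  constructor() {
    this.plugins = new Map();
  }

  // Add a plugin; duplicate ids are rejected so one plugin cannot silently replace another.
  register(plugin) {
    if (this.plugins.has(plugin.id)) {
      throw new Error(`Plugin already registered: ${plugin.id}`);
    }
    this.plugins.set(plugin.id, plugin);
  }

  // Return the first enabled plugin that claims the task,
  // or null so the core can fall back to plain guidance.
  resolve(task, enabledIds) {
    for (const plugin of this.plugins.values()) {
      if (enabledIds.has(plugin.id) && plugin.canHandle(task)) return plugin;
    }
    return null;
  }
}

// A stub standing in for a browser automation plugin.
const browserPlugin = {
  id: "browser-use",
  canHandle: (task) => task.type === "browser",
  run: async (task) => `would automate: ${task.goal}`,
};
```

Because the core only calls `resolve`, adding a new workspace means registering one more object, and a user who never enables a plugin never reaches its code path.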
The multi-provider orchestration of LLMs is another notable aspect. By supporting several providers, OpenGuider offers a pragmatic way to trade off cost, latency, and quality depending on the use case. Many desktop AI assistants tie users to a single LLM service, so this flexibility is a practical differentiator.
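The cost, latency, and quality tradeoff can be expressed as a simple weighted score. The sketch below is an assumption about how such a choice could be automated, not a description of how OpenGuider selects providers; the `pickProvider` function, the provider names, and every number are placeholders.

```javascript
// Placeholder entries: names and numbers are invented for illustration only.
const candidates = [
  { name: "provider-a", costPer1kTokens: 0, latencyMs: 1500, quality: 0.6 },
  { name: "provider-b", costPer1kTokens: 0.5, latencyMs: 300, quality: 0.8 },
  { name: "provider-c", costPer1kTokens: 3, latencyMs: 900, quality: 0.95 },
];

// Pick the provider with the best weighted score: quality counts for it,
// cost and latency count against it.
function pickProvider(providers, weights) {
  if (providers.length === 0) throw new Error("No providers configured");

  // Rescale one metric to 0..1 across all providers so the weights are comparable.
  const normalizer = (key) => {
    const values = providers.map((p) => p[key]);
    const min = Math.min(...values);
    const max = Math.max(...values);
    return (value) => (max === min ? 0 : (value - min) / (max - min));
  };
  const cost = normalizer("costPer1kTokens");
  const latency = normalizer("latencyMs");
  const quality = normalizer("quality");

  let best = null;
  for (const p of providers) {
    const score =
      weights.quality * quality(p.quality) -
      weights.cost * cost(p.costPer1kTokens) -
      weights.latency * latency(p.latencyMs);
    if (best === null || score > best.score) best = { provider: p, score };
  }
  return best.provider;
}

console.log(pickProvider(candidates, { quality: 1, cost: 0, latency: 0 }).name); // provider-c
console.log(pickProvider(candidates, { quality: 0, cost: 0, latency: 1 }).name); // provider-b
```

In practice a user would more likely pick a provider by hand in settings; the point of the sketch is only that the three axes pull in different directions, so no single provider wins under every weighting.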
Voice-first interaction is built into the app, with optional speech-to-text and text-to-speech features. This hands-free approach fits well with modern workflows where users want to keep their hands on the keyboard or mouse while receiving AI assistance.
The project is written in JavaScript on Electron, so the codebase is approachable for the many developers who already know web technologies. Using keytar for secret storage shows attention to security practices for desktop apps. However, Electron apps bundle their own Chromium and Node.js runtime, so they typically use more disk space and memory than native applications, a tradeoff worth weighing.
One limitation is that the plugin system, while powerful, is still in early stages with only the browser automation plugin live. The ecosystem will need time to grow to cover more specialized tasks, and users should expect some rough edges until then.
Quick start
OpenGuider provides straightforward installation options:
Option A: Download Prebuilt App (Recommended)
- Open the latest release page: https://github.com/mo-tunn/OpenGuider/releases/latest
- Download your platform artifact:
  - Windows: OpenGuider-windows-setup-latest.exe
  - macOS: OpenGuider-macos-installer-latest.dmg
  - Linux: OpenGuider-linux-latest.zip
- Extract and run the app.
- If you want browser automation, open Settings -> Plugins.
- In the Browser plugin card, click Download Runtime once.
- Choose whether browser tasks should run with approval or in autopilot mode.
Option B: Run From Source
- Install dependencies: npm install
- Start the app: npm run start
Validate the setup
Send a simple prompt first, for example:
- “Open settings and guide me to configure notifications step by step.”
Then try a planning-style prompt:
- “Help me complete this task in 5 steps and wait for confirmation after each step.”
Then try plugin-style prompts:
- “Search the web for the official OpenAI API docs and pause before opening any sign-in page.”
- “Use the browser plugin to find a product page, but ask me before submitting or checking out.”
Verdict
OpenGuider is a solid choice if you want a desktop AI assistant focused on screen-aware, voice-driven guidance with a modular architecture that can grow over time. Its local-first design respects privacy and keeps user data under control. The multi-provider support for LLMs is practical for balancing cost and performance.
The browser automation plugin adds real-world utility, although the plugin ecosystem is still nascent, so expect to invest some time if you want to extend or customize beyond the current capabilities.
If you are comfortable with Electron apps and appreciate a voice-first interface coupled with secure local storage, OpenGuider is worth trying. For users looking for tightly integrated native apps or a mature plugin marketplace, it might feel a bit early-stage.
Overall, OpenGuider solves a real problem: how to get AI assistance that understands your screen context and guides you step-by-step, not just via chat. Its modular design and local-first approach make it a noteworthy project in the desktop AI assistant space.
→ GitHub Repo: mo-tunn/OpenGuider ⭐ 151 · JavaScript