Nanobrowser: multi-agent AI browser automation with dynamic self-correcting planning

Nanobrowser approaches browser automation differently from typical prompt-to-action tools. It runs a multi-agent AI system entirely in your browser, splitting reasoning and execution between two specialized agents. What makes it worth a closer look is that its Planner agent monitors the Navigator agent’s actions and adjusts them when things don’t go as planned, automatically and without routing your credentials through an intermediary service.

What nanobrowser does and how it works

At its core, Nanobrowser is a Chrome extension written in TypeScript that implements a multi-agent AI architecture for automating browser tasks using large language models (LLMs). It positions itself as an open-source alternative to OpenAI Operator, with a strong emphasis on privacy and flexibility.

The architecture clearly separates concerns into two main agents:

Planner agent: This agent is responsible for reasoning, task decomposition, and planning. It takes high-level user instructions and breaks them down into actionable steps.
Navigator agent: This agent handles direct interaction with the browser DOM, executing the steps devised by the Planner.
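
To make the division of labor concrete, here is a minimal sketch of the two roles as TypeScript interfaces. The names `PlannerAgent`, `NavigatorAgent`, `Step`, and `StepResult` are hypothetical and chosen for illustration; they are not taken from the Nanobrowser codebase, and the stub planner returns canned steps where a real one would call an LLM.

```typescript
// Illustrative types only; these are not Nanobrowser's actual interfaces.
interface Step {
  action: "navigate" | "click" | "type";
  target: string; // a URL for "navigate", otherwise a description of the element
  value?: string; // text to enter, used by "type"
}

interface StepResult {
  ok: boolean;
  detail: string;
}

// The planner only reasons: instruction in, ordered steps out.
interface PlannerAgent {
  plan(instruction: string): Step[];
}

// The navigator only acts: one step in, one result out.
interface NavigatorAgent {
  execute(step: Step): StepResult;
}

// A canned planner standing in for an LLM call.
const stubPlanner: PlannerAgent = {
  plan: (instruction) => [
    { action: "navigate", target: "https://example.com/search" },
    { action: "type", target: "search box", value: instruction },
    { action: "click", target: "search button" },
  ],
};

// A navigator that records what it would do instead of touching a real DOM.
const actionLog: string[] = [];
const stubNavigator: NavigatorAgent = {
  execute: (step) => {
    actionLog.push(`${step.action} -> ${step.target}`);
    return { ok: true, detail: "simulated" };
  },
};

// Plan once, then hand each step to the navigator in order.
const stepResults = stubPlanner
  .plan("wireless headphones")
  .map((step) => stubNavigator.execute(step));
```

Keeping the planner free of DOM access and the navigator free of task-level reasoning is what allows each role to be backed by a different model.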

Both agents run client-side within the browser extension, so API keys stay in the browser and are sent only to the LLM providers you configure, not to an intermediary automation service. This local-first model is a privacy advantage that cloud-based automation tools often lack.

Nanobrowser supports a wide range of LLM providers, including OpenAI, Anthropic, Gemini, Ollama, Groq, and any OpenAI-compatible endpoint. Models are configurable per agent, which lets users trade off cost, latency, and capability depending on the task.
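
As a rough illustration of per-agent model selection, the sketch below pairs each agent role with its own provider and model. The `AgentModelConfig` shape, the `validateConfig` helper, and the model names are assumptions made for this example; the extension’s real settings schema may look quite different. The Ollama URL is Ollama’s default local address.

```typescript
// Hypothetical config shapes; Nanobrowser's real settings schema may differ.
type ProviderId =
  | "openai"
  | "anthropic"
  | "gemini"
  | "ollama"
  | "groq"
  | "openai-compatible";

type AgentRole = "planner" | "navigator";

interface ModelChoice {
  provider: ProviderId;
  model: string; // model name as the provider expects it (placeholders below)
  baseUrl?: string; // needed for self-hosted or OpenAI-compatible endpoints
}

type AgentModelConfig = Record<AgentRole, ModelChoice>;

// Returns a list of human-readable problems; an empty list means the config looks usable.
function validateConfig(config: AgentModelConfig): string[] {
  const problems: string[] = [];
  for (const role of ["planner", "navigator"] as const) {
    const choice = config[role];
    if (choice.model.trim() === "") {
      problems.push(`${role}: model name is empty`);
    }
    if (choice.provider === "openai-compatible" && !choice.baseUrl) {
      problems.push(`${role}: openai-compatible provider needs a baseUrl`);
    }
  }
  return problems;
}

// A stronger hosted model for planning, a cheaper local one for navigation.
const exampleConfig: AgentModelConfig = {
  planner: { provider: "anthropic", model: "example-large-model" },
  navigator: {
    provider: "ollama",
    model: "example-small-model",
    baseUrl: "http://localhost:11434",
  },
};

const configProblems = validateConfig(exampleConfig);
```

Splitting the choice this way reflects the tradeoff the article describes: planning tends to benefit from a more capable model, while the navigator makes many small calls where latency and cost matter more.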

The project builds with a pnpm-based pipeline, so contributors can build and extend it with standard Node.js tooling. The user interface is a sidebar for configuration and interaction, where you add API keys and select models.

The planner-navigator self-correcting loop: what sets nanobrowser apart

The standout feature of Nanobrowser is its dynamic self-correction mechanism between the Planner and Navigator agents. Unlike simpler AI automation tools that send fixed commands to the browser, Nanobrowser’s Planner continuously monitors the Navigator’s progress.

If the Navigator encounters a problem — for example, an unexpected DOM structure or a failed click — the Planner detects this failure and reasons about alternative approaches. It then dynamically replans the Navigator’s actions to try a different strategy.
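
The plan, act, and replan cycle described above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not Nanobrowser’s actual implementation: `runTask`, `PlannerAgent`, `NavigatorAgent`, and the `maxReplans` cap are hypothetical names, and the stub agents stand in for LLM calls and real DOM interaction.

```typescript
// Hypothetical sketch of a plan / act / replan loop; not Nanobrowser's actual code.
interface ActionStep {
  description: string;
}

interface ActionOutcome {
  ok: boolean;
  error?: string;
}

interface PlannerAgent {
  // Produces steps for the goal, taking the failures seen so far into account.
  plan(goal: string, failures: string[]): Promise<ActionStep[]>;
}

interface NavigatorAgent {
  execute(step: ActionStep): Promise<ActionOutcome>;
}

interface TaskReport {
  done: boolean;
  replans: number;
  failures: string[];
}

async function runTask(
  goal: string,
  plannerAgent: PlannerAgent,
  navigatorAgent: NavigatorAgent,
  maxReplans = 3,
): Promise<TaskReport> {
  const failures: string[] = [];
  for (let replans = 0; replans <= maxReplans; replans++) {
    // Each pass asks the planner for a fresh plan informed by earlier failures.
    const steps = await plannerAgent.plan(goal, failures);
    let failed = false;
    for (const step of steps) {
      const outcome = await navigatorAgent.execute(step);
      if (!outcome.ok) {
        failures.push(`${step.description}: ${outcome.error ?? "unknown error"}`);
        failed = true;
        break; // stop this plan and go back to the planner
      }
    }
    if (!failed) return { done: true, replans, failures };
  }
  return { done: false, replans: maxReplans, failures };
}

// Stub planner: tries a CSS id first, then falls back to the visible label.
const stubPlanner: PlannerAgent = {
  plan: async (_goal, failures) =>
    failures.length === 0
      ? [{ description: "click #buy-button" }]
      : [{ description: "click button labeled 'Buy now'" }],
};

// Stub navigator: pretends the id-based selector no longer exists on the page.
const stubNavigator: NavigatorAgent = {
  execute: async (step) =>
    step.description.includes("#buy-button")
      ? { ok: false, error: "element not found" }
      : { ok: true },
};
```

Calling `runTask("buy the item", stubPlanner, stubNavigator)` exercises one failed attempt followed by a successful replan. The cap on replans is a design choice of this sketch: without one, a model that keeps proposing failing strategies would loop indefinitely.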

The feedback loop itself runs client-side, with no separate automation backend between the two agents; model traffic goes only to the configured LLM provider, or stays on the machine when a local Ollama model is used. This lets Nanobrowser handle edge cases and variability in web pages more robustly than static scripted agents.

Under the hood, this involves a communication protocol between the two agents where the Planner receives execution status updates and error signals from the Navigator. The Planner can then issue revised instructions or ask for clarifications based on the execution context.
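
One way to picture such a protocol is as a pair of discriminated unions: messages flowing from the Navigator to the Planner, and decisions flowing back. The message kinds and the `decide` function below are assumptions made for illustration and are not drawn from Nanobrowser’s source. In a real system the decision would come from an LLM rather than a fixed mapping.

```typescript
// Assumed message shapes for illustration; the real protocol may look different.
type NavigatorMessage =
  | { kind: "step_succeeded"; stepIndex: number }
  | { kind: "step_failed"; stepIndex: number; reason: string }
  | { kind: "needs_clarification"; question: string };

type PlannerDecision =
  | { kind: "continue" }
  | { kind: "replan"; fromStep: number; hint: string }
  | { kind: "ask_user"; question: string };

// Deterministic stand-in for the planner's reasoning about a status update.
function decide(message: NavigatorMessage): PlannerDecision {
  switch (message.kind) {
    case "step_succeeded":
      // Nothing went wrong, so the current plan stays in force.
      return { kind: "continue" };
    case "step_failed":
      // Carry the failure reason forward so the next plan can avoid it.
      return { kind: "replan", fromStep: message.stepIndex, hint: message.reason };
    case "needs_clarification":
      // The navigator cannot proceed without more information.
      return { kind: "ask_user", question: message.question };
  }
}

const decision = decide({
  kind: "step_failed",
  stepIndex: 2,
  reason: "element not found",
});
```

Tagging every message with a `kind` field lets the TypeScript compiler check that each case is handled, which helps when new status types are added later.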

The tradeoff here is complexity and dependency on the underlying LLM quality. The system relies heavily on the language model’s ability to reason about failures and generate alternative navigation plans on the fly. This is a powerful pattern but can vary in reliability depending on the model and prompt engineering.

The codebase is clean for such a complex coordination task, with clear separation of concerns and extensible abstractions for LLM providers and agent roles. The local execution model also means the extension has to stay light enough to run smoothly inside the browser.

Quick start

Getting started with Nanobrowser is straightforward if you want to try it out:


1. **Install from Chrome Web Store** (Stable Version):
   * Visit the Nanobrowser Chrome Web Store page
   * Click the "Add to Chrome" button
   * Confirm the installation when prompted

> **Important Note**: For the latest features, install from "Manually Install Latest Version" below, as the Chrome Web Store version may lag behind because of the review process.

2. **Configure Agent Models**:
   * Click the Nanobrowser icon in your toolbar to open the sidebar
   * Click the `Settings` icon (top right)
   * Add your LLM API keys
   * Choose which model to use for different agents (Navigator, Planner)

## 🔧 Manually Install Latest Version

To get the most recent version with all the latest features:

1. **Download**:
    * Download the latest `nanobrowser.zip` file from the official GitHub release page.

2. **Install**:
    * Unzip `nanobrowser.zip`.
    * Open `chrome://extensions/` in Chrome
    * Enable `Developer mode` (top right)
    * Click `Load unpacked` (top left)
    * Select the unzipped `nanobrowser` folder.

3. **Configure Agent Models**:
    * Click the Nanobrowser icon in your toolbar to open the sidebar
    * Click the `Settings` icon (top right)
    * Add your LLM API keys
    * Choose which model to use for different agents (Navigator, Planner)

4. **Upgrading**:
    * Download the latest `nanobrowser.zip` file from the release page.
    * Unzip and replace your existing Nanobrowser files with the new ones.
    * Go to `chrome://extensions/` in Chrome and click the refresh icon on the Nanobrowser card.

Verdict

Nanobrowser offers a compelling example of multi-agent AI applied to browser automation with a focus on privacy and dynamic self-correction. Running both the Planner and Navigator agents client-side reduces privacy risks and latency but also means the extension’s reliability hinges on the quality of the underlying LLMs and their prompt engineering.

This repo is particularly relevant if you want to experiment with AI-driven browser automation beyond simple scripted tasks and explore architectures that combine reasoning with execution in a feedback loop. It’s also a good starting point if you need support for multiple LLM providers and want to keep your API keys local.

That said, expect some edge cases and complexity in real-world usage; browser automation remains brittle, and AI-driven strategies are still maturing. Nanobrowser’s approach is promising, but not a turnkey production solution yet. For developers interested in multi-agent orchestration patterns or local-first AI privacy models, it’s definitely worth a look under the hood.


→ GitHub Repo: nanobrowser/nanobrowser ⭐ 12,859 · TypeScript

Noureddine RAMDI / Nanobrowser: multi-agent AI browser automation with dynamic self-correcting planning
