Noureddine RAMDI / cocoindex-code: AST-aware semantic code search with efficient embedding integration

Created Tue, 05 May 2026 22:24:55 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

cocoindex-io/cocoindex-code

cocoindex-code tackles a common frustration: how to search code semantically without losing the structural precision of the source. Instead of relying on keyword matching or embedding arbitrary spans of text, it parses your codebase into AST chunks (functions, classes, methods) and computes embeddings over these meaningful units. This hybrid approach yields more relevant search results and feeds context to AI coding agents more efficiently.

how cocoindex-code indexes and searches codebases

At its core, cocoindex-code is a semantic code search tool built in Python that operates on the AST (abstract syntax tree) level. It breaks down a codebase into semantically meaningful chunks — typically functions, classes, and methods — rather than arbitrary lines or files. This chunking ensures that search queries align with actual logical components in the code.
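To make the chunking idea concrete, here is a minimal sketch that splits Python source into function, class, and method chunks. It uses only the standard-library `ast` module and is not cocoindex-code's own parser, which is not shown in this article. The `Chunk` type and `chunk_source` function are hypothetical names chosen for illustration.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    kind: str        # "function", "class", or "method"
    name: str        # e.g. "login" or "Auth.check"
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    text: str        # source text of the definition (decorators excluded)


def chunk_source(source: str) -> List[Chunk]:
    """Split Python source into function, class, and method chunks."""
    chunks: List[Chunk] = []

    def make(node: ast.AST, kind: str, name: str) -> Chunk:
        text = ast.get_source_segment(source, node) or ""
        return Chunk(kind, name, node.lineno, node.end_lineno, text)

    def visit(node: ast.AST, parent_class: Optional[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                chunks.append(make(child, "class", child.name))
                visit(child, child.name)  # descend to collect methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                if parent_class:
                    qualified = f"{parent_class}.{child.name}"
                    chunks.append(make(child, "method", qualified))
                else:
                    chunks.append(make(child, "function", child.name))
                # Nested functions stay inside their parent's chunk.

    visit(ast.parse(source), None)
    return chunks
```

Each chunk carries its line range, so a search hit can point back to an exact location in the file. A multi-language tool would need a parser per language; this sketch only handles Python.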

Once the code is chunked, cocoindex-code computes embeddings for each piece. These embeddings are vector representations that capture semantic similarity, enabling flexible search beyond exact text matches.
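As a rough illustration of how embedding search ranks chunks, the sketch below scores a query vector against stored chunk vectors with cosine similarity. The three-dimensional vectors and chunk names are made up for the example; a real embedding model produces vectors with hundreds of dimensions, and the article does not state which similarity metric or vector store cocoindex-code uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=3):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real chunk embeddings (hypothetical data).
index = {
    "auth.verify_token": [0.9, 0.1, 0.0],
    "db.connect": [0.0, 0.2, 0.9],
    "auth.hash_password": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of "authentication logic"
results = top_k(query, index, k=2)
```

Here the two `auth.*` chunks rank above `db.connect` because their vectors point in a direction closer to the query, regardless of whether the query text shares any keywords with the code.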

The tool supports two embedding backends:

  • Local embeddings via sentence-transformers, which work offline and require no API keys. This “batteries included” mode pulls in PyTorch and transformers and is bundled in the larger Docker image.

  • Cloud embeddings via LiteLLM, which connects to providers such as OpenAI and Gemini, as well as Ollama endpoints. This mode is lighter because it avoids the heavy local ML dependencies, but hosted providers require API keys.

cocoindex-code exposes its search functionality in multiple ways:

  • A command-line interface (CLI) for direct user interaction.

  • An MCP server, which allows coding agents such as Claude Code or Codex to query the index transparently.

  • A Skill interface that these agents can consume directly, making the codebase context seamlessly available in their workflows.

To keep the index fresh, a background daemon runs continuously, automatically refreshing the index as the codebase changes. This means you get up-to-date semantic search without manual reindexing.
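One common way to refresh an index without reprocessing everything is to compare content fingerprints between runs. The sketch below assumes that approach; the article does not describe the daemon's actual change-detection mechanism, and `fingerprint` and `plan_refresh` are hypothetical helpers.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(text: str) -> str:
    """Stable content hash used to detect changed files."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_refresh(
    previous: Dict[str, str], current_files: Dict[str, str]
) -> Tuple[List[str], List[str], Dict[str, str]]:
    """Decide which files need re-indexing.

    previous:      path -> fingerprint recorded on the last run
    current_files: path -> current file contents
    Returns (paths to re-index, paths to drop, new fingerprint state).
    """
    new_state = {path: fingerprint(text) for path, text in current_files.items()}
    # New or modified files: fingerprint is missing or differs.
    to_index = sorted(p for p, fp in new_state.items() if previous.get(p) != fp)
    # Deleted files: present last time, absent now.
    to_delete = sorted(p for p in previous if p not in new_state)
    return to_index, to_delete, new_state
```

Only the files in `to_index` need to be re-chunked and re-embedded, which keeps each refresh cheap when most of the codebase is unchanged.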

The project emphasizes zero-configuration setup — install, run, and start searching or integrating with agents immediately. The Docker images offer reproducible environments for teams, with two variants balancing image size and embedding backend choices.

the hybrid AST and embedding approach: why it stands out

Most code search tools fall into two camps. Purely AST-based tools require exact structural matches or name queries. Purely embedding-based tools can lose the structural boundaries of code, which leads to noisy results.

cocoindex-code’s hybrid approach is worth understanding:

  • AST chunking aligns search results with code logic. Searching “authentication logic” finds the actual function or class implementing it, not just files mentioning the keyword. Because each result is a whole logical unit, precision improves.

  • Embeddings add semantic flexibility. The search isn’t limited to exact matches or brittle syntax queries but can capture synonyms, related concepts, and fuzzy matches.

  • Token saving for AI agents. By feeding agents precise chunks rather than entire files, cocoindex-code claims about 70% token savings. This is crucial when working with LLMs under token limits or cost constraints.

  • Integration with MCP. Retrieval becomes transparent to AI agents, which can simply “ask” for relevant code context without managing indexing or retrieval details.
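The token-saving point comes down to simple arithmetic: the fraction saved is one minus the ratio of tokens sent as chunks to tokens in the full files. The sketch below shows how such a ratio could be estimated. The four-characters-per-token rule is a rough heuristic assumed here, not a real tokenizer, and the roughly 70% figure above is the project's own claim, which this sketch does not verify.

```python
from typing import List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers vary by model and language."""
    return max(1, round(len(text) / chars_per_token))


def token_savings(full_context: str, retrieved_chunks: List[str]) -> float:
    """Fraction of tokens saved by sending chunks instead of the full text."""
    full = estimate_tokens(full_context)
    used = sum(estimate_tokens(chunk) for chunk in retrieved_chunks)
    return 1.0 - used / full
```

The result depends entirely on how large the relevant chunks are relative to their files, so measured savings will differ from one codebase and query to the next.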

The codebase itself is surprisingly clean for a project juggling ML dependencies, daemon processes, and protocol servers. It strikes a balance between functionality and maintainability.

Tradeoffs are clear:

  • The “full” local embedding mode brings heavy dependencies (PyTorch, transformers), which inflate the Docker image to ~5 GB and limit Mac users to CPU inference inside containers.

  • The “slim” mode relies on cloud embeddings, which means external API keys and potential latency or cost considerations.

  • While zero-config is a strong plus, customization options for fine-tuning chunking or embedding models might be limited compared to more complex bespoke setups.

quick start

Install

Using pipx:

pipx install 'cocoindex-code[full]'          # batteries included (local embeddings)
pipx upgrade cocoindex-code                  # upgrade

Using uv:

uv tool install --upgrade 'cocoindex-code[full]'

Two install styles — they mirror the Docker image variants of the same names:

  • cocoindex-code[full] — batteries-included. Pulls in sentence-transformers so local embeddings (no API key required) work out of the box. The ccc init interactive prompt defaults to Snowflake/snowflake-arctic-embed-xs.
  • cocoindex-code (slim) — LiteLLM-only; requires a cloud embedding provider and API key. Use when you don’t want the local-embedding deps (~1 GB of torch + transformers).

From here, set up your coding agent integration, or use the CLI directly if you prefer manual control.

Docker

A Docker image is available for teams who want a reproducible, dependency-free setup — no Python, uv, or system dependencies required on the host.

The recommended approach is a persistent container: start it once, and use docker exec to run CLI commands or connect MCP sessions to it. The daemon inside stays warm across sessions, so the embedding model is loaded only once.

Choosing an image

Two variants are published from each release:

| Tag | Size | Embedding backends | When to pick |
| --- | --- | --- | --- |
| cocoindex/cocoindex-code:latest (slim, default) | ~450 MB | LiteLLM (cloud: OpenAI, Voyage, Gemini, Ollama, …) | Most users. Cloud-backed embeddings, smaller image, fast pulls. |
| cocoindex/cocoindex-code:full | ~5 GB | sentence-transformers (local) + LiteLLM | When you want local embeddings without an API key, or an offline-ready container. Heavier because of torch + transformers. |

verdict

cocoindex-code is a practical tool for teams or individuals looking to enhance code search with semantic capabilities while retaining structural precision. Its hybrid AST chunking plus embedding approach solves a real problem: finding the right logical code chunks rather than noisy keyword hits.

The zero-config setup and daemon-based index refresh reduce operational friction, making it accessible even if you don’t want to build your own indexing pipeline. Support for both local and cloud embeddings offers flexibility depending on your environment and privacy needs.

That said, if you want deep customization of chunking or embedding models, or if you need a lightweight tool without ML dependencies, this might feel heavyweight or restrictive. Mac users should note the CPU-only embedding inference limitation in Docker for the full mode.

For AI coding agents, cocoindex-code’s MCP integration is a neat feature, simplifying how agents access relevant code context without manual indexing steps.

Overall, if your workflow involves large or complex codebases and you want better semantic search with structural awareness, cocoindex-code is worth exploring. It’s especially relevant for teams integrating AI assistants that need precise, token-efficient code context.


→ GitHub Repo: cocoindex-io/cocoindex-code ⭐ 1,548 · Python