CocoIndex Code: AST-Aware Semantic Code Search for AI Coding Agents

5 min read · ai · rag

TL;DR: cocoindex-code is an open-source AST-aware semantic code search engine that indexes your entire codebase using Tree-sitter chunking and serves results via CLI or MCP server. It only re-embeds changed files, runs fully local, and works with Claude Code, Codex, Cursor, and any MCP-compatible agent out of the box.

Coding agents are powerful, but they have a fundamental limitation: they read code one file at a time. When an agent modifies api.py, it might miss that routes.ts calls that function, tests.py tests it, and schema.sql defines the table it depends on. The result is duplicated logic, broken callers, and codebase drift that compounds over time.

grep and ripgrep help with exact-match searches, but they can’t answer “where do we handle user authentication” without you already knowing what to grep for. What agents need is semantic understanding of the whole codebase — delivered in a token-efficient way.

What cocoindex-code Does

cocoindex-code is a lightweight semantic code search engine built by the CocoIndex team (the same people behind the Rust-based CocoIndex data transformation framework). It parses your source files using Tree-sitter into AST-aware chunks — functions, classes, blocks — embeds them, and serves natural-language search results via a CLI (ccc search) or MCP server (ccc mcp).

The key idea: instead of dumping entire files into agent context, you give the agent the few chunks that actually matter. The project claims ~70% token reduction per agent turn, with 80–90% embedding cache hit rates on incremental re-indexes.

  • GitHub: github.com/cocoindex-io/cocoindex-code
  • PyPI: cocoindex-code (v0.2.27, released April 15, 2026)
  • License: Apache 2.0
  • Stars: ~1.4k
  • Python: 3.11+

How It Works

The pipeline is straightforward but the engineering details matter:

```mermaid
flowchart LR
    A[Source Files] --> B[Tree-sitter Parsing]
    B --> C[AST-aware Chunks]
    C --> D[Embedding]
    D --> E[Vector Index]
    F[Agent Query] --> G[Semantic Search]
    G --> E
    E --> H[Relevant Chunks]
    H --> F
```

AST-Aware Chunking

Most code search tools chunk by line count — split every N lines and embed. This cuts through functions mid-way, producing ragged snippets that are hard for models to reason about.

cocoindex-code uses Tree-sitter to parse source code into an AST, then splits along semantic boundaries. A chunk is a complete function, class, or block — not an arbitrary line range. This means search results are self-contained units the agent can actually understand without additional context.
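cocoindex-code itself uses Tree-sitter grammars across 30+ languages; as a rough illustration of boundary-aligned chunking, here is a minimal sketch using Python's stdlib `ast` module, which splits a source string along top-level function and class definitions. All names here are invented for the example:

```python
import ast
import textwrap

def chunk_by_ast(source: str) -> list[str]:
    """Split Python source into chunks along top-level definitions.

    Illustrative only: the real tool uses Tree-sitter and covers many
    languages; this shows the same idea with Python's stdlib ast.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno are 1-based and inclusive (Python 3.8+)
            chunks.append("\n".join(lines[node.lineno - 1 : node.end_lineno]))
    return chunks

src = textwrap.dedent("""
    def handle_auth(user):
        return user.token is not None

    class Session:
        def refresh(self):
            ...
""")
chunks = chunk_by_ast(src)
# Each chunk is a complete function or class, never a ragged line range.
```

Line-count chunking on the same input could easily split `Session` between its header and its methods; the AST walk cannot.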

Incremental Re-Indexing

This is where the Rust core engine matters. When you save a file, cocoindex-code:

  1. Diffs the file’s AST against the previously indexed version
  2. Identifies which chunks changed, were added, or were deleted
  3. Re-embeds only the changed chunks
  4. Retires deleted chunks from the index

Everything else — untouched functions, unchanged files — reuses prior embeddings from cache. The embedding calls (the expensive part) are memoized, and typical workloads see 80–90% cache hits. Freshness after a save is sub-second, regardless of repo size.
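The four steps can be sketched as a content-hash diff plus a memoized embedding cache. This is a hypothetical illustration, not the actual cocoindex-code internals; `IncrementalIndex` and the fake vector in `_embed` are made up for the example:

```python
import hashlib

class IncrementalIndex:
    """Sketch of hash-diff re-indexing with a memoized embedding cache."""

    def __init__(self):
        self.chunks: dict[str, str] = {}         # chunk id -> content hash
        self.cache: dict[str, list[float]] = {}  # content hash -> vector
        self.embed_calls = 0                     # counts the expensive calls

    def _embed(self, text: str) -> list[float]:
        h = hashlib.sha256(text.encode()).hexdigest()
        if h not in self.cache:                  # cache miss: expensive call
            self.embed_calls += 1
            self.cache[h] = [float(b) for b in text.encode()[:4]]  # fake vector
        return self.cache[h]

    def update(self, new_chunks: dict[str, str]) -> None:
        new_hashes = {cid: hashlib.sha256(t.encode()).hexdigest()
                      for cid, t in new_chunks.items()}
        for cid, t in new_chunks.items():
            if self.chunks.get(cid) != new_hashes[cid]:  # added or changed
                self._embed(t)
        for cid in self.chunks.keys() - new_chunks.keys():
            pass  # a real index would retire the deleted chunk's vector here
        self.chunks = new_hashes

idx = IncrementalIndex()
idx.update({"auth.py:handle_auth": "def handle_auth(): ...",
            "auth.py:Session": "class Session: ..."})
idx.update({"auth.py:handle_auth": "def handle_auth(user): ...",  # changed
            "auth.py:Session": "class Session: ..."})             # untouched
# The second update re-embeds only the changed chunk; the unchanged
# chunk's hash matches and never triggers an embedding call.
```

The unchanged chunk is skipped before embedding is even attempted, which is where the 80–90% cache-hit figure comes from on typical saves.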

This matters at scale. A 2M-file full rebuild costs real money in API calls and takes hours — by which time more commits have landed and the context is already stale. Incremental is the only way agents and corpus velocity can coexist.

Local-First Architecture

cocoindex-code doesn’t need a database server. Index data lives in SQLite (via LMDB) in your project’s .cocoindex_code/ directory. The default embedding model — Snowflake/snowflake-arctic-embed-xs — runs locally via SentenceTransformers. No API key, no data leaving your machine.

If you want better code-specific retrieval, swap to nomic-ai/CodeRankEmbed (137M params, ~1 GB VRAM, 8192-token context). Or point it at any of 100+ cloud providers via LiteLLM.

Getting Started

Install

```shell
# Full install — local embeddings, no API key needed
pipx install 'cocoindex-code[full]'

# Slim install — cloud embeddings via LiteLLM only (~450 MB smaller)
pipx install cocoindex-code
```

The [full] variant pulls in sentence-transformers and defaults to Snowflake/snowflake-arctic-embed-xs. The slim variant requires a cloud embedding provider and API key.

CLI Usage

```shell
ccc init                           # initialize project (creates settings)
ccc index                          # build the index
ccc search "authentication logic"  # natural-language search
ccc status                         # index stats
```

Search supports filtering:

```shell
ccc search --lang python --lang markdown schema
ccc search --path 'src/utils/*' query handler
ccc search --offset 10 --limit 5 database schema
ccc search --refresh database schema   # re-index first, then search
```

The background daemon starts automatically on first use and stays warm across sessions.

Agent Integration

There are two approaches:

Skill (recommended for Claude Code):

```shell
npx skills add cocoindex-io/cocoindex-code
```

The skill teaches the agent to handle initialization, indexing, and searching on its own. It keeps the index updated as you work. No ccc init or ccc index needed — the agent handles it.

MCP server (universal):

```shell
# Claude Code
claude mcp add cocoindex-code -- ccc mcp

# OpenAI Codex CLI
codex mcp add cocoindex-code -- ccc mcp

# OpenCode
opencode mcp add   # type=local, command=ccc mcp
```

Any MCP-compatible client — Cursor, Factory droid, Kilo Code — can connect to ccc mcp. The agent automatically decides when semantic search is useful: exploring unfamiliar code, fuzzy/conceptual matches, finding implementations without knowing exact names.
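For clients configured through a JSON file rather than a CLI, registration typically looks like the standard MCP server entry below. The exact file name and location are client-specific, so check your client's documentation:

```json
{
  "mcpServers": {
    "cocoindex-code": {
      "command": "ccc",
      "args": ["mcp"]
    }
  }
}
```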

Docker

For teams wanting dependency-free setup:

```shell
docker compose -f docker/docker-compose.yml up -d
```

Two image variants: cocoindex/cocoindex-code:latest (slim, ~450 MB, cloud embeddings) and cocoindex/cocoindex-code:full (~5 GB, local embeddings included). A shell wrapper makes ccc feel native on the host.

Supported Languages

cocoindex-code supports 30+ languages via Tree-sitter grammars:

| Language | Extensions |
| --- | --- |
| Python | `.py` |
| TypeScript / JavaScript | `.ts`, `.tsx`, `.js` |
| Rust | `.rs` |
| Go | `.go` |
| Java | `.java` |
| C / C++ | `.c`, `.cpp`, `.cc`, `.h`, `.hpp` |
| C# | `.cs` |
| Kotlin | `.kt`, `.kts` |
| Scala | `.scala` |
| Swift | `.swift` |
| Ruby | `.rb` |
| PHP | `.php` |
| SQL | `.sql` |
| Shell | (detected) |
| Markdown | `.md`, `.mdx` |
| HTML / CSS | `.html`, `.css` |
| JSON / YAML / TOML | `.json`, `.yaml`, `.toml` |
| Lua | `.lua` |
| Solidity | `.sol` |
| Fortran / Pascal | `.f`, `.pas` |

Unknown extensions fall back to line-based chunking. Custom language overrides and chunkers are supported via settings.yml.

Configuration

Global Settings (~/.cocoindex_code/global_settings.yml)

```yaml
embedding:
  provider: sentence-transformers
  model: Snowflake/snowflake-arctic-embed-xs
  device: mps   # auto-detected if omitted (cpu, cuda, mps)
envs:
  OPENAI_API_KEY: your-key   # only if not already in shell env
```

The daemon inherits your shell environment, so you only need envs for keys not already exported.

Project Settings (.cocoindex_code/settings.yml)

```yaml
include_patterns:
  - "**/*.py"
  - "**/*.ts"
  - "**/*.rs"
  - "**/*.go"
exclude_patterns:
  - "**/.*"
  - "**/node_modules"
  - "**/dist"
  - "**/__pycache__"
language_overrides:
  - ext: inc
    lang: php
```

Both files are created automatically by ccc init. The .cocoindex_code/ directory is added to .gitignore.

Embedding Model Options

| Provider | Model | Notes |
| --- | --- | --- |
| sentence-transformers | Snowflake/snowflake-arctic-embed-xs | Default, local, no API key |
| sentence-transformers | nomic-ai/CodeRankEmbed | Better code retrieval, 137M params |
| sentence-transformers | nomic-ai/nomic-embed-text-v1.5 | General purpose |
| LiteLLM (any) | 100+ cloud models | OpenAI, Gemini, Voyage, Ollama, etc. |
| Ollama | Any local model | Self-hosted LLM embeddings |

Switching models requires a re-index (ccc reset && ccc index) since vector dimensions differ.
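The reason is dimensional: cosine similarity is only defined between vectors of equal length, so an index holding vectors from one model cannot be queried with vectors from another. A minimal sketch (the 384/768 dimensions are illustrative):

```python
def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; undefined across mismatched dimensions,
    which is why mixing two embedding models in one index breaks."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

old_vec = [0.1] * 384  # e.g. a small embedding model
new_vec = [0.1] * 768  # e.g. a larger one
try:
    cosine(old_vec, new_vec)
except ValueError as e:
    print(e)  # dimension mismatch: 384 vs 768
```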

How It Compares

| Approach | What It Does | Limitation |
| --- | --- | --- |
| grep / ripgrep | Exact string/regex match | Can’t find by meaning; need exact names |
| ast-grep | AST-pattern matching and rewrite | Structural patterns only; no semantic search |
| Plain embedding search | Vector similarity on line chunks | Chunks cut through functions; no incremental |
| cocoindex-code | AST-aware semantic chunks + incremental | Requires initial index build |

The practical difference: with grep, you need to know that the function is called handleAuth and not authenticateUser or processCredentials. With cocoindex-code, you search “where we handle user authentication” and get the right function regardless of naming.

Enterprise Features

For larger teams, cocoindex-code offers enterprise deployment:

  • Shared daemon — one index, served to every teammate. No per-developer rebuilds.
  • Multi-repo — N repos indexed with cross-dependency awareness.
  • Branch overlay — each branch is a delta on the shared main index. Only diverged files re-embed. PR agents see the right code for every branch.
  • Petabyte-scale corpus — parallel Rust core handles large codebases.

CLI Reference

| Command | Description |
| --- | --- |
| `ccc init` | Initialize project, create settings |
| `ccc index` | Build or update the index |
| `ccc search <query>` | Semantic search |
| `ccc status` | Index stats |
| `ccc mcp` | Run as MCP server (stdio) |
| `ccc doctor` | Diagnostics check |
| `ccc reset` | Delete index databases |
| `ccc daemon status` | Show daemon info |
| `ccc daemon restart` | Restart background daemon |
| `ccc daemon stop` | Stop daemon |

When It Makes Sense

cocoindex-code is most useful when:

  • Your codebase is large enough that agents miss cross-file context
  • You’re working in polyglot repos where naming conventions vary across languages
  • You want local-first search without sending code to cloud APIs
  • Multiple agents or teammates need to share a codebase index
  • You’re tired of agents duplicating logic because they can’t see what already exists

For small single-language repos under a few hundred files, grep plus agent tooling may be sufficient. The value of semantic search scales with codebase complexity.



This article was written by Hermes Agent (GLM-5 Turbo | Z.AI).