Codegraph: Building a Local Knowledge Graph for Claude Code — Fewer Tokens, Fewer Tool Calls

Claude Code has a chronic problem in large projects: it keeps scanning file trees, reading file contents, and searching for code references over and over. Every conversation burns tokens like water.

Codegraph's approach is straightforward: pre-build the project structure into a knowledge graph, store it locally, and let Claude query it directly instead of re-reading files every time.

Core Idea

Traditional Claude Code workflow:

Receive a question
Read directory structure
Decide which files to read based on the directory
Search for specific symbol references
Iterate repeatedly

Codegraph transforms it into:

Build the index during project initialization (done once)
Claude queries the graph directly, getting file relationships and symbol references
Skip the repeated scanning and searching
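
The build-once, query-many idea above can be sketched in a few lines. This is a minimal illustration and not Codegraph's actual implementation: the `symbols` table, the `build_index` and `find_symbol` helpers, and the choice of SQLite as the local store are all assumptions made for the example, and it indexes Python source only because the standard library ships a parser for it.

```python
import ast
import sqlite3


def build_index(sources: dict[str, str], db_path: str = ":memory:") -> sqlite3.Connection:
    """Parse every file once and record where each symbol is defined.

    `sources` maps a file path to its source text. A real index would pass a
    file path as `db_path` so the data survives between sessions; the
    in-memory default keeps this sketch self-contained.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS symbols "
        "(name TEXT, kind TEXT, file TEXT, line INTEGER)"
    )
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "function"
            elif isinstance(node, ast.ClassDef):
                kind = "class"
            else:
                continue
            conn.execute(
                "INSERT INTO symbols VALUES (?, ?, ?, ?)",
                (node.name, kind, path, node.lineno),
            )
    conn.commit()
    return conn


def find_symbol(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Answer 'where is this symbol defined?' without re-reading any file."""
    rows = conn.execute(
        "SELECT kind, file, line FROM symbols WHERE name = ? ORDER BY file, line",
        (name,),
    )
    return rows.fetchall()


# Hypothetical two-file project used only for illustration.
SOURCES = {
    "app/models.py": "class User:\n    def save(self):\n        pass\n",
    "app/views.py": "def render(user):\n    return user\n",
}

conn = build_index(SOURCES)           # done once
print(find_symbol(conn, "User"))      # each later question is a single lookup
```

The point of the sketch is the shape of the workflow: parsing happens once at build time, and every later question is answered by a lookup rather than by another round of directory listing and file reads.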

One recent commit is interesting: "refactor: Remove semantic search and vector embedding functionality." The author deleted semantic search and vector embedding, moving toward a purely structured knowledge graph. This choice is worth pondering.

Semantic search sounds sophisticated, but it is of limited practical use for code. Code relationships are deterministic: function A calls function B, class C inherits from class D. Establishing them does not require "semantic similarity"; an exact graph query is enough. Cutting vector search reduces complexity, maintenance cost, and token consumption.
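
To make the "deterministic relationships" point concrete, here is a small sketch of a graph with typed edges, using the A/B/C/D names from the paragraph above. The `CodeGraph` class and its method names are hypothetical and are not taken from Codegraph; they only show that "who calls B?" or "what does C inherit from?" is an exact lookup with no similarity score involved.

```python
from collections import defaultdict


class CodeGraph:
    """Hypothetical typed-edge graph: (source, kind) -> set of targets."""

    def __init__(self) -> None:
        self._out = defaultdict(set)  # (src, kind) -> {dst, ...}
        self._in = defaultdict(set)   # (dst, kind) -> {src, ...}

    def add_edge(self, src: str, kind: str, dst: str) -> None:
        self._out[(src, kind)].add(dst)
        self._in[(dst, kind)].add(src)

    def targets(self, src: str, kind: str) -> list[str]:
        """Direct neighbours reached from `src` over edges of type `kind`."""
        return sorted(self._out.get((src, kind), ()))

    def sources(self, dst: str, kind: str) -> list[str]:
        """Reverse lookup: which nodes point at `dst` with edge type `kind`."""
        return sorted(self._in.get((dst, kind), ()))

    def transitive_targets(self, src: str, kind: str) -> list[str]:
        """Everything reachable from `src` by following `kind` edges."""
        seen: set[str] = set()
        stack = [src]
        while stack:
            current = stack.pop()
            for nxt in self._out.get((current, kind), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return sorted(seen)


graph = CodeGraph()
graph.add_edge("A", "calls", "B")
graph.add_edge("B", "calls", "E")      # extra edge added to show a call chain
graph.add_edge("C", "inherits", "D")

print(graph.sources("B", "calls"))             # callers of B
print(graph.targets("C", "inherits"))          # base classes of C
print(graph.transitive_targets("A", "calls"))  # everything A can reach by calls
```

Each query either finds the edge or it does not, which is why an exact graph lookup fits this kind of question better than a ranked list of "similar" snippets.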

Measured Results

The project documentation claims:

Fewer tokens: No need to repeatedly transfer file contents to the model
Fewer tool calls: Graph queries get results in one shot, no multi-round searching
100% local: All index data stays local, no external services involved

The repository has 256 commits, the latest made 18 hours before this was written. A recent change adds workspace crate resolution to the Rust resolver, which suggests the project is expanding its language coverage. It already supports more than one language (CLAUDE.md mentions Svelte support).

Use Cases

Especially suitable:

Large codebases (thousands of files and up)
Scenarios requiring frequent cross-file analysis in Claude Code
Teams sensitive to API call costs

Less necessary:

Small projects (dozens to hundreds of files — Claude Code's native tool calls are fast enough)
Projects with extremely frequent structural changes (index needs rebuilding)
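
The rebuild cost in the last item depends on how much of the index has to be redone after a change. One common way to limit it, shown here as a general technique and not as something the Codegraph documentation describes, is to store a content hash per file at index time and re-index only files whose hash differs. The `fingerprint` and `plan_reindex` helpers below are hypothetical names for this sketch.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Content hash of a file's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(indexed: dict[str, str], current: dict[str, str]) -> tuple[list[str], list[str]]:
    """Decide which files need work.

    `indexed` maps path -> hash recorded when the index was built.
    `current` maps path -> source text as it is now.
    Returns (files to parse again, files to drop from the index).
    """
    changed = sorted(
        path for path, text in current.items()
        if indexed.get(path) != fingerprint(text)  # new file or edited file
    )
    removed = sorted(path for path in indexed if path not in current)
    return changed, removed


# Hypothetical project state at index time and after some edits.
old_sources = {"a.py": "x = 1\n", "b.py": "y = 2\n"}
indexed = {path: fingerprint(text) for path, text in old_sources.items()}

new_sources = {"a.py": "x = 1\n", "b.py": "y = 3\n", "c.py": "z = 0\n"}
changed, removed = plan_reindex(indexed, new_sources)
print(changed, removed)
```

Even with this kind of shortcut, a project whose structure changes constantly spends a growing share of its time keeping the index current, which is the trade-off the list above points at.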

Versus Alternatives

Cline has its own context management mechanism, but it is generic and not specifically optimized for code structure. That is where Codegraph differs: it does one thing, turning code structure into a graph so that agent queries are faster.

Is this approach correct? I think the general direction is sound. Code understanding is about understanding relationships, not "semantics." Knowledge graphs are naturally suited to expressing relationships, which makes them a better fit for code than for general text.

The repository has 1,300+ stars, 11 open issues, and 29 PRs. The project is under active development but has not yet reached a stable release. If you want to use it, try it on a small project first.

Primary sources:

colbymchenry/codegraph GitHub repository
