RAG has been running for three years, and everyone accepted a premise: you slice documents, embed them, store in a vector database, then do similarity matching for retrieval.
PageIndex says: no need.
Its tagline: "Document Index for Vectorless, Reasoning-based RAG." 30.6k stars, gained 4,300 in a week.
This growth rate tells one thing: frustration with existing RAG solutions has accumulated to a tipping point.
What's wrong with vector databases
Vector databases work fine. But they're awkward in a specific scenario—when your document structure is complex and you need to understand semantic relationships, not just text similarity.
Example: a 200-page technical doc with chapters on "authentication flow," "API permissions," and "OAuth integration." User asks "how to configure third-party login."
Traditional vector RAG: embed the question, match against all document fragment embeddings, take top-k fragments to the LLM.
Problem: similarity matching only finds textually close fragments, not logically related ones. "Third-party login" might be covered in all three chapters, but only "OAuth integration" has the closest embedding to the question.
PageIndex's approach: instead of embedding-based similarity matching, let the LLM directly "reason" which document fragments are relevant.
How it works
From the code structure:
pageindex/: core indexing logiccookbook/: usage examplesexamples/: specific scenario implementations
It uses agent-based retrieval strategy—one agent understands the query, another locates relevant content in the document index, bypassing vector similarity entirely.
The code uses LiteLLM for model routing, supporting multiple LLM backends. Recent commits optimize retrieve_model auto-prefix handling.
Performance
This is the key question, and the docs aren't clear enough about it yet.
Vectorless RAG costs: every retrieval calls the LLM for reasoning, not a near-neighbor search in vector space.
- Higher latency: vector search is millisecond-scale, LLM reasoning is second-scale
- Higher cost: every retrieval consumes LLM tokens
- Poorer concurrency: LLM reasoning is compute-intensive, vector search is memory-intensive
If PageIndex can't provide convincing data on these, "vectorless" is just marketing.
But from another angle: if your RAG scenario isn't latency or cost-sensitive (offline document analysis, research report generation), vectorless might actually be more accurate—because LLM comprehension does exceed embedding similarity.
What 283 commits means
30.6k stars but only 283 commits. This ratio is unusual.
Either the codebase is small but the concept is strong, or significant code isn't in the public repo. Either way, the project is far from mature.
78 open issues, 68 open PRs. High community participation, but core team capacity may not keep up.
Is it worth trying
Worth considering for:
- Complex document enterprise knowledge bases needing cross-chapter understanding
- Offline analysis tasks with high accuracy requirements but low latency needs
- Small teams that don't want to maintain vector database infrastructure
Not recommended for:
- High-concurrency, low-latency real-time retrieval
- Massive document volumes (million+)
- Cost-sensitive production environments
Judgment
Vectorless RAG is an interesting direction. It essentially asks: since LLMs keep getting stronger, why use lossy intermediate representations like embeddings for retrieval? Why not just let the LLM understand the documents directly?
The answer depends on where the LLM capability growth curve and cost reduction curve intersect. For high-value, low-concurrency scenarios, vectorless already has practical value. For large-scale, high-frequency retrieval, vector databases remain more pragmatic in the short term.
Whether PageIndex can run with this direction, the next few months of performance optimization and community feedback will tell.
Main sources: