When organisations say they manage knowledge, they usually mean they store documents. VirgilLM manages what the documents actually say.
I designed and built VirgilLM to solve a problem that most organisations do not even recognise they have: their own documentation contradicts itself.
Policies conflict with procedures. Summaries drift from the sections they summarise. Terminology shifts meaning between departments. The information is all there, scattered across hundreds of files, and no one can tell which version of the truth is current. People query these documents every day and get confident, well-sourced answers that happen to be wrong, because the source material itself is inconsistent.
Search engines retrieve documents. They do not verify them. VirgilLM does.
What it does
VirgilLM is an enterprise knowledge management platform that processes organisational documents through three specialised analysis pipelines to build a grounded, contradiction-free knowledge base.
The Knowledge pipeline decomposes documents into atomic claims, each a single verifiable assertion with metadata: topic, conditions, source text, language. These claims are embedded as high-dimensional vectors and stored in PostgreSQL with pgvector. When new claims arrive, the system runs an approximate nearest-neighbour search (HNSW indexes, top 40 candidates) to find semantically similar existing claims, then passes each candidate pair to an LLM for verification. The LLM classifies each pair as contradiction, tension, redundancy, or no issue. Only claims with no unresolved contradictions are promoted to active status and used in downstream queries.
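The narrowing step can be sketched in a few lines. This is illustrative only: production runs the search as an HNSW index query inside pgvector, while the sketch below brute-forces cosine similarity over an in-memory slice, and all names are hypothetical.

```rust
// Illustrative sketch of the candidate-narrowing step: find the k existing
// claims most similar to a new claim's embedding, so only those pairs go to
// the LLM for verification. Brute-force cosine similarity stands in for the
// real HNSW index query.

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Return indices of the `k` existing claims most similar to `query`.
fn top_k_candidates(query: &[f32], existing: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = existing
        .iter()
        .enumerate()
        .map(|(i, e)| (i, cosine(query, e)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let existing = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.9, 0.1, 0.0],
        vec![0.0, 1.0, 0.0],
    ];
    let hits = top_k_candidates(&[1.0, 0.05, 0.0], &existing, 2);
    println!("{:?}", hits); // [0, 1] — the two nearest claims
}
```

In the real pipeline `k` is 40 and the vectors are 1536-dimensional; the shape of the operation is the same.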
The Consistency pipeline works within documents. It extracts discourse units from paragraphs with rich linguistic metadata: attribution, polarity, scope, modality, anchor terms. It then runs twelve independent candidate generation strategies: six deterministic (same anchor, summary vs. body, rule vs. exception, cross-reference, proximity, conclusion vs. analysis) and six combining LLM-assisted entity normalisation with embedding similarity. Each candidate pair is verified by an LLM, then subjected to a sceptical review pass by an independent model (Anthropic with extended thinking) that strips confidence scores to avoid confirmation bias and actively tries to reject findings. A final consolidation pass catches semantic duplicates that structural deduplication misses.
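To make the deterministic strategies concrete, here is a hedged sketch of the simplest one, the same-anchor pass: any two discourse units that share an anchor term become a candidate pair for LLM verification. The struct fields and function names are illustrative, not the production schema.

```rust
use std::collections::HashMap;

// Hypothetical sketch of one deterministic candidate strategy: pair up
// discourse units that share at least one anchor term.

#[derive(Debug)]
struct DiscourseUnit {
    id: u32,
    anchors: Vec<String>,
}

fn same_anchor_candidates(units: &[DiscourseUnit]) -> Vec<(u32, u32)> {
    // Group unit ids by anchor term.
    let mut by_anchor: HashMap<&str, Vec<u32>> = HashMap::new();
    for u in units {
        for a in &u.anchors {
            by_anchor.entry(a).or_default().push(u.id);
        }
    }
    // Every pair within a group is a candidate.
    let mut pairs = Vec::new();
    for ids in by_anchor.values() {
        for i in 0..ids.len() {
            for j in (i + 1)..ids.len() {
                pairs.push((ids[i], ids[j]));
            }
        }
    }
    pairs.sort();
    pairs.dedup(); // two units may share several anchors
    pairs
}

fn main() {
    let units = vec![
        DiscourseUnit { id: 1, anchors: vec!["refund".into()] },
        DiscourseUnit { id: 2, anchors: vec!["refund".into(), "policy".into()] },
    ];
    println!("{:?}", same_anchor_candidates(&units)); // [(1, 2)]
}
```

The other five deterministic strategies follow the same pattern with different grouping keys; the six embedding-based strategies replace exact key matches with similarity thresholds.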
The Knowledge Graph pipeline extracts entities and relationships from documents, resolves duplicates via embedding similarity, and produces an interactive graph visualisation.
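One common way to turn pairwise duplicate matches into canonical entities is a union-find merge; the sketch below assumes that approach, which the pipeline description does not specify, and elides the embedding-similarity matching itself.

```rust
// Hypothetical sketch of the duplicate-resolution step: once embedding
// similarity has flagged entity pairs as likely duplicates, a union-find
// collapses them into canonical clusters.

struct UnionFind { parent: Vec<usize> }

impl UnionFind {
    fn new(n: usize) -> Self { Self { parent: (0..n).collect() } }
    fn find(&mut self, x: usize) -> usize {
        if self.parent[x] != x {
            let root = self.find(self.parent[x]);
            self.parent[x] = root; // path compression
        }
        self.parent[x]
    }
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb { self.parent[rb] = ra; }
    }
}

/// Map each of `n` entities to its canonical representative.
fn canonicalise(n: usize, duplicate_pairs: &[(usize, usize)]) -> Vec<usize> {
    let mut uf = UnionFind::new(n);
    for &(a, b) in duplicate_pairs {
        uf.union(a, b);
    }
    (0..n).map(|i| uf.find(i)).collect()
}

fn main() {
    // Entities 0–1 and 1–2 were flagged as near-duplicates; 3 stands alone.
    println!("{:?}", canonicalise(4, &[(0, 1), (1, 2)])); // [0, 0, 0, 3]
}
```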
All three pipelines feed a RAG-powered chatbot that can be embedded on any website, answering questions only from verified, non-contradicted claims with full source attribution.
How it is built
The backend is Rust, built on Axum. The system runs on a dual-database architecture: a local PostgreSQL instance with pgvector handles all analysis data (documents, claims, embeddings, contradictions, discourse units, knowledge graph), while Supabase handles operational concerns (authentication, multi-tenancy, bot configuration, chat persistence, real-time progress updates). Forty-six Deno edge functions manage all administrative operations. The frontend is Astro with React islands.
Document ingestion runs through Azure Document Intelligence for OCR, table extraction, and section detection. Embeddings use Azure OpenAI text-embedding-3-large at 1536 dimensions. The system supports multiple LLM providers through a pluggable trait abstraction, currently Azure OpenAI for extraction and verification, Anthropic for the sceptical review pass.
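The provider abstraction might look like the following minimal sketch. The real trait is async and carries far more context; the names, the stub provider, and the verdict enum's exact shape here are illustrative.

```rust
// Hedged sketch of a pluggable LLM provider trait: pipeline code depends on
// the trait, never on a concrete backend, so Azure OpenAI and Anthropic can
// be swapped per stage.

#[derive(Debug, PartialEq)]
enum Verdict { Contradiction, Tension, Redundancy, NoIssue }

trait LlmProvider {
    fn name(&self) -> &'static str;
    /// Classify a claim pair. Real implementations call a hosted model;
    /// this sketch only defines the seam.
    fn verify_pair(&self, a: &str, b: &str) -> Verdict;
}

/// A stub provider standing in for a real backend, useful in tests.
struct StubProvider;

impl LlmProvider for StubProvider {
    fn name(&self) -> &'static str { "stub" }
    fn verify_pair(&self, a: &str, b: &str) -> Verdict {
        if a == b { Verdict::Redundancy } else { Verdict::NoIssue }
    }
}

/// Verification code sees only the trait object.
fn classify(provider: &dyn LlmProvider, a: &str, b: &str) -> Verdict {
    provider.verify_pair(a, b)
}

fn main() {
    let p = StubProvider;
    let v = classify(&p, "Refunds within 30 days.", "Refunds within 30 days.");
    println!("{} -> {:?}", p.name(), v); // stub -> Redundancy
}
```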
The architecture enforces clear boundaries. Analysis scales independently of operations. Rate limiting runs at three levels: IP-based, per-bot origin-based, and a global LLM rate limiter. Sessions expire on TTL. All retrieval is tenant-scoped. Results mirror between the two databases at application level, not through cross-database foreign keys.
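The layering of the three rate limiters can be sketched as follows. Fixed-window counters stand in for whatever algorithm the real limiter uses, and all names are hypothetical; the point is only that a request must pass every layer.

```rust
use std::collections::HashMap;

// Illustrative sketch of layered rate limiting: a request is admitted only
// if the per-IP bucket, the per-bot-origin bucket, and the global LLM
// bucket all admit it.

struct WindowLimiter { limit: u32, counts: HashMap<String, u32> }

impl WindowLimiter {
    fn new(limit: u32) -> Self { Self { limit, counts: HashMap::new() } }
    fn allow(&mut self, key: &str) -> bool {
        let c = self.counts.entry(key.to_string()).or_insert(0);
        if *c < self.limit { *c += 1; true } else { false }
    }
}

fn admit(
    ip: &mut WindowLimiter,
    origin: &mut WindowLimiter,
    global: &mut WindowLimiter,
    ip_key: &str,
    origin_key: &str,
) -> bool {
    // Short-circuits left to right; every layer must agree.
    ip.allow(ip_key) && origin.allow(origin_key) && global.allow("llm")
}

fn main() {
    let mut ip = WindowLimiter::new(10);
    let mut origin = WindowLimiter::new(10);
    let mut global = WindowLimiter::new(2); // tight global cap for the demo
    for i in 0..3 {
        let ok = admit(&mut ip, &mut origin, &mut global, "203.0.113.7", "bot-a");
        println!("request {i}: {ok}"); // true, true, false
    }
}
```

A production limiter would be time-windowed and would refund earlier layers when a later layer rejects; the sketch keeps only the layering.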
The design philosophy
VirgilLM is built on a principle that the current AI landscape largely ignores: use each technique for what it is actually good at.
Vector embeddings and k-NN search are fast and cheap. They excel at finding candidates: documents or claims that are semantically close enough to warrant a closer look. They are terrible at reasoning about whether two statements actually contradict each other. An LLM can do that, but running pairwise LLM comparisons across an entire corpus is computationally ruinous.
So the system uses k-NN to narrow the field, then LLMs to reason about what remains. Twelve independent candidate generation strategies ensure recall. A sceptical second-opinion LLM controls precision. The pipeline does not trust any single technique to carry the full weight of the problem.
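The arithmetic behind "computationally ruinous" is worth making explicit. For a hypothetical corpus of 100,000 claims (the corpus size is illustrative; the top-40 figure is the pipeline's own):

```rust
// Back-of-the-envelope arithmetic: exhaustive pairwise LLM comparison is
// quadratic in the corpus; k-NN pre-filtering makes it linear.

fn main() {
    let n: u64 = 100_000; // illustrative corpus size
    let k: u64 = 40;      // top-k candidates per claim, as in the pipeline

    let exhaustive = n * (n - 1) / 2; // every pair, once
    let narrowed = n * k;             // each claim vs. its k nearest neighbours

    println!("exhaustive pairs: {exhaustive}"); // 4999950000
    println!("narrowed pairs:   {narrowed}");   // 4000000
    println!("reduction: {}x", exhaustive / narrowed); // 1249x
}
```

Three orders of magnitude fewer LLM calls, in exchange for one cheap index query per claim.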
This is not a prompt and a prayer. It is engineered verification with layered fallbacks, where traditional AI techniques and large language models each do the work they were designed for.
What this taught me
- The most effective AI systems are not the ones that use the most advanced model. They are the ones that use the right model at the right stage. k-NN for candidate generation, LLMs for semantic verification, a second independent LLM for sceptical review. Each layer exists because the one before it is not sufficient on its own.
- The contradiction problem in organisational knowledge is invisible until you build the tool to detect it. Every corpus I have processed has produced findings. Every single one. The scale of the problem is hidden precisely because no one looks.
- Rust is the right choice for a system that processes documents at scale, manages concurrent LLM calls, and must not lose data between pipeline stages. The language forces you to handle every edge case the type system can express. That discipline is not overhead. It is the architecture.
Agentic AI built on craft, not vibes. Every technique earns its place, or it does not ship.