Technology

How a document becomes a verified, citable answer.

CVS is a hybrid RAG engine built for enterprises that cannot tolerate a confident wrong answer. Every stage — ingestion, routing, retrieval, and abstention — is engineered to produce evidence you can audit, not prose you have to trust.

Ingestion

A five-stage ingestion pipeline turns one document into searchable evidence.

CVS connects directly to where your knowledge already lives (SharePoint, Google Drive, Confluence, S3, and on-premises file servers), then parses every format through triple OCR and vision: PDFs, scans, DOCX, PPTX, XLSX, and images. Tables, figures, and page anchors survive parsing intact, so the original evidence can be returned later rather than paraphrased away.

Smart chunking produces semantically coherent fragments rather than blind fixed-width splits. Each chunk is enriched with entities, metadata, document diffs, and temporal facts, then written to a multi-layer index simultaneously: a pgvector store for semantic recall, a BM25F full-text index for exact terms, a Neo4j temporal knowledge graph for relationships, plus metadata and temporal indexes. One pass, five retrieval surfaces.

  • Connectors for SharePoint, Google Drive, Confluence, S3, and local file shares — no copy-paste migrations
  • Triple OCR plus vision enrichment across PDF, scanned PDF, DOCX, PPTX, XLSX, and images
  • Semantic chunking that preserves tables, figures, and page anchors as first-class evidence
  • Multi-layer indexing into pgvector, BM25F, Neo4j temporal knowledge graph, metadata, and temporal stores
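
The idea of keeping tables, figures, and page anchors intact during chunking can be sketched as follows. This is a minimal illustration, not the CVS implementation: it packs paragraphs by size rather than by semantic similarity, and the `Block` and `Chunk` types, the `chunk_blocks` function, and the `max_chars` limit are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str   # "paragraph", "table", or "figure" (assumed parser output)
    text: str
    page: int   # page anchor carried over from parsing


@dataclass
class Chunk:
    text: str
    pages: list
    kind: str = "text"


def chunk_blocks(blocks, max_chars=800):
    """Pack paragraphs into chunks; never split or merge a table or figure."""
    chunks, buf, pages = [], [], []

    def flush():
        # Emit the buffered paragraphs as one chunk, keeping their page anchors.
        if buf:
            chunks.append(Chunk("\n\n".join(buf), sorted(set(pages))))
            buf.clear()
            pages.clear()

    for block in blocks:
        if block.kind in ("table", "figure"):
            flush()
            # Tables and figures become their own chunk, unmodified.
            chunks.append(Chunk(block.text, [block.page], block.kind))
            continue
        if buf and sum(len(t) for t in buf) + len(block.text) > max_chars:
            flush()
        buf.append(block.text)
        pages.append(block.page)
    flush()
    return chunks


blocks = [
    Block("paragraph", "a" * 500, 1),
    Block("paragraph", "b" * 500, 1),
    Block("table", "| part | torque |", 2),
    Block("paragraph", "Closing note.", 2),
]
for chunk in chunk_blocks(blocks):
    print(chunk.kind, chunk.pages, len(chunk.text))
```

A production chunker would likely choose boundaries from embeddings or document structure instead of a character budget; the point here is only that each chunk keeps its page anchors and that a table is never cut in half.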
Routing

An intent router sends each query down the cheapest path that can answer it.

Not every question deserves a full reasoning run. A central intent router classifies each query and dispatches it into one of four lanes: an instant, zero-token cache hit; a standard fast hybrid search; a deep multi-document synthesis; or an ultra reasoning path that decomposes the question into a directed acyclic graph of sub-queries.

This token-saving cascade means simple questions never wake up an expensive LLM, while genuinely hard, multi-document questions get the full decomposition treatment. The result is predictable latency, predictable cost, and no per-query token surprises — the cascade alone cuts LLM spend by 85–95% versus naive RAG.

  • Instant lane: zero-token cache for repeated and trivially answerable queries
  • Standard lane: fast hybrid search for the majority of everyday questions
  • Deep lane: multi-document synthesis when one source is not enough
  • Ultra lane: decomposition DAG that breaks complex questions into auditable sub-steps
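
A rough sketch of the four-lane dispatch, under stated assumptions: the page does not describe how the router classifies a query, so `toy_complexity` is a hypothetical stand-in for a real intent classifier, and the 0.4 and 0.8 thresholds are invented for illustration.

```python
from enum import Enum


class Lane(Enum):
    INSTANT = "instant"    # zero-token cache hit
    STANDARD = "standard"  # fast hybrid search
    DEEP = "deep"          # multi-document synthesis
    ULTRA = "ultra"        # decomposition into a DAG of sub-queries


def toy_complexity(query):
    """Hypothetical complexity score in [0, 1]; a real router would use a classifier."""
    words = query.lower().split()
    multi_doc_cues = {"compare", "versus", "across", "between", "both"}
    cue_hits = sum(1 for w in words if w.strip("?,.") in multi_doc_cues)
    # Longer questions and multi-document cues raise the score, capped at 1.0.
    return min(1.0, 0.02 * len(words) + 0.3 * cue_hits)


def route(query, cache, estimate_complexity=toy_complexity):
    """Return the cheapest lane that is expected to answer the query."""
    key = " ".join(query.lower().split())  # normalize case and whitespace
    if key in cache:
        return Lane.INSTANT
    score = estimate_complexity(query)
    if score < 0.4:       # assumed threshold
        return Lane.STANDARD
    if score < 0.8:       # assumed threshold
        return Lane.DEEP
    return Lane.ULTRA


cache = {"what is the vpn hostname?": "cached answer"}
print(route("What is the VPN hostname?", cache))
print(route("Where is the travel policy?", cache))
print(route("Compare the 2022 and 2023 warranty terms", cache))
```

The cache check comes first so that a repeated question never reaches the classifier or an LLM, which is the behavior the cascade relies on for its cost savings.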
Retrieval

Five parallel retrievers, fused by RRF and reranked by a cross-encoder.

CVS runs five retrievers at once — vector search, knowledge-graph traversal, BM25F full text, temporal retrieval, and metadata filtering. Each sees the corpus differently, so they catch different evidence: semantics, relationships, exact terms, time validity, and structured attributes. No single retriever has to be perfect.

Their ranked outputs merge through Reciprocal Rank Fusion (k=60), then a cross-encoder reranks the fused candidates to assemble a tight evidence set for the answer builder. This is why CVS reaches 94.7% answer accuracy versus the 67–73% typical of single-retriever systems like basic RAG or Copilot.

  • Vector (pgvector) + Neo4j knowledge graph + BM25F + temporal + metadata, all in parallel
  • Reciprocal Rank Fusion (k=60) merges five independent rankings into one consensus
  • Cross-encoder reranking sharpens the final evidence set before answer generation
  • 94.7% answer accuracy versus 67–73% for single-retriever systems
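
Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranking that contains it, with rank starting at 1. The sketch below applies that standard formula with k=60; the retriever names and document IDs are made up, and the cross-encoder reranking step that follows fusion is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one consensus ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative outputs from five retrievers (hypothetical document IDs).
rankings = {
    "vector": ["d1", "d2", "d3"],
    "graph": ["d2", "d4"],
    "bm25f": ["d2", "d1", "d5"],
    "temporal": ["d3", "d2"],
    "metadata": ["d1", "d2"],
}
fused = reciprocal_rank_fusion(rankings.values())
print(fused)  # ['d2', 'd1', 'd3', 'd4', 'd5']
```

Because RRF uses only rank positions, the five retrievers do not need comparable score scales: `d2` wins here because every retriever ranks it first or second, even though it tops only two of the five lists.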
Abstention

Adversarial abstention: the system knows when it does not know.

After retrieval, CVS asks one question before answering: is the evidence sufficient? If yes, it answers with inline citations and writes the interaction to a tamper-evident audit log. If no, it abstains plainly instead of fabricating a plausible-sounding response, the failure that kills most enterprise RAG pilots.

An abstention is not a dead end. The unanswered question routes to the designated subject-matter expert, their verified answer is captured, and the knowledge base is patched so the next person gets an instant response. In production this drives the hallucination rate below 2%, versus roughly 19% for ordinary RAG.

  • Confidence gate evaluates evidence sufficiency before any answer is generated
  • Sufficient evidence → cited answer plus a full audit-log entry
  • Insufficient evidence → clear abstention, then expert escalation
  • Captured expert answers patch the base — under 2% hallucination versus ~19% for ordinary RAG
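
The gate can be pictured as a check that runs before any text is generated. The sketch below is an assumption-laden illustration: the page does not say how CVS measures sufficiency, so the `Evidence` type, the `min_score` and `min_passages` thresholds, and the `generate` callback are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str   # e.g. document name plus page anchor
    text: str
    score: float  # reranker relevance, assumed to lie in [0, 1]


def answer_or_abstain(question, evidence, generate, min_score=0.7, min_passages=2):
    """Answer with citations only when enough strong evidence exists."""
    strong = [e for e in evidence if e.score >= min_score]
    if len(strong) < min_passages:
        # Insufficient evidence: abstain and flag the question for an expert.
        return {"status": "abstained", "question": question, "escalate_to_expert": True}
    return {
        "status": "answered",
        "answer": generate(question, strong),
        "citations": [e.source for e in strong],
    }


def fake_generate(question, passages):
    # Stand-in for the answer builder; a real system would call an LLM here.
    return " ".join(p.text for p in passages)


evidence = [
    Evidence("manual.pdf#p12", "Torque the bolt to 40 Nm.", 0.91),
    Evidence("bulletin-7.pdf#p2", "The 40 Nm value applies to rev C.", 0.83),
    Evidence("faq.docx#p1", "See the manual.", 0.35),
]
print(answer_or_abstain("What is the bolt torque?", evidence, fake_generate))
print(answer_or_abstain("What is the paint code?", evidence[2:], fake_generate))
```

Note that `generate` is never called on the abstention path, so a weak evidence set cannot produce a fluent but unsupported answer.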

Run CVS against your hardest question.

Bring your most obscure spec or your most frequently escalated query. We will show you the evidence path end to end — and exactly what happens when the base does not know.