PLATFORM ARCHITECTURE

Five layers. One system.
Zero gaps.

KnowledgeBricks is not a wrapper around a generic LLM. It is a purpose-built knowledge platform: five engineered layers that transform practitioner-authored content into cited, accurate, domain-specific answers, delivered in under 2 seconds on average.

5 Platform Layers
SOC 2 Type II Certified
99.9% Uptime SLA
<2s Average Query Time
THE STACK

Every layer is deliberate.
Every choice is documented.

The platform is built on five layers that work together as a single system. No layer is optional. Each one enforces trust guarantees that the next layer depends on.

01
CONTENT VAULT

Practitioner-authored articles, benchmarks, and structured methodology

The foundation of KnowledgeBricks is not a model; it is the content. Every article is written by a domain expert who has done the operational work: former 3PL directors, supply chain practitioners, construction estimators with decades of field experience. The vault contains over 140 articles, each tagged with topic, sub-domain, tier access level, and citation metadata. No AI-generated content. No scraped web copy.

02
INGESTION PIPELINE

OCR, chunking, frontmatter tagging, and tier gating

Raw content enters a structured pipeline: OCR for scanned documents, deterministic chunking by concept boundary (not word count), and frontmatter tagging that assigns access tier, topic taxonomy, and source attribution. Every chunk carries its metadata into the embedding layer. The tier gate is enforced here: locked content is marked before embedding, never stripped after. This is why paywall integrity holds at query time.
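
A minimal sketch of how tagging before embedding might look, assuming Markdown articles with a simple `key: value` frontmatter block and `## ` headings standing in for concept boundaries. The field names (`tier`, `topic`, `source`), the `Chunk` type, and the sample article are hypothetical, not the platform's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    tier: str    # hypothetical access tier, e.g. "free" or "pro"
    topic: str
    source: str


def parse_frontmatter(raw: str) -> tuple[dict[str, str], str]:
    """Split a '---' delimited key: value header from the article body."""
    _, header, body = raw.split("---", 2)
    meta = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()


def chunk_by_heading(raw: str) -> list[Chunk]:
    """One chunk per '## ' section; every chunk inherits the article metadata."""
    meta, body = parse_frontmatter(raw)
    sections, current = [], []
    for line in body.splitlines():
        if line.startswith("## ") and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [
        Chunk(text=s, tier=meta["tier"], topic=meta["topic"], source=meta["source"])
        for s in sections
        if s
    ]


# Placeholder article; the slug and section text are illustrative only.
ARTICLE = """---
tier: pro
topic: warehousing
source: dock-scheduling-benchmarks
---
## Dock turn times
Typical turn time guidance goes here.

## Detention fees
Fee structure guidance goes here.
"""

chunks = chunk_by_heading(ARTICLE)
```

Because the tier travels with each chunk from this point on, a later stage can filter on it without ever re-reading the source article.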

03
EMBEDDING & RETRIEVAL

OpenAI text-embedding-3-large, vector search via Supabase pgvector

Each chunk is embedded using OpenAI's text-embedding-3-large model, chosen for semantic precision on domain-specific text over cost-optimized alternatives. Vectors are stored and queried via Supabase pgvector with cosine similarity search. Retrieval is constrained by tier at query time: a free-tier user's query never reaches locked embeddings, regardless of semantic relevance. Top-k retrieval returns ranked chunks with full source attribution intact.
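
The filter-then-rank order described above can be illustrated in plain Python. This is a sketch of the logic only: in a pgvector deployment the same filter and ordering would presumably be expressed in SQL. The tier names, their ordering, and the row shape are assumptions.

```python
import math

# Hypothetical tier ordering: a user may read their own tier and anything below it.
TIER_RANK = {"free": 0, "pro": 1, "enterprise": 2}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, rows, user_tier, k=8):
    """Filter by tier first, then rank the survivors by cosine similarity."""
    allowed = [r for r in rows if TIER_RANK[r["tier"]] <= TIER_RANK[user_tier]]
    ranked = sorted(
        allowed,
        key=lambda r: cosine_similarity(query_vec, r["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional vectors for illustration only.
ROWS = [
    {"id": "a", "tier": "free", "embedding": [1.0, 0.0, 0.0]},
    {"id": "b", "tier": "pro", "embedding": [0.9, 0.1, 0.0]},
    {"id": "c", "tier": "free", "embedding": [0.0, 1.0, 0.0]},
]

free_hits = retrieve([0.9, 0.1, 0.0], ROWS, user_tier="free", k=2)
pro_hits = retrieve([0.9, 0.1, 0.0], ROWS, user_tier="pro", k=2)
```

Row `b` is the closest match to the query, yet it never appears in `free_hits`: the tier filter runs before similarity is computed, so relevance cannot pull locked content into the result.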

04
LLM ANSWER LAYER

Claude Sonnet with curated system prompts, citation enforcement, paywall integrity

Retrieved chunks and the user query are passed to Anthropic's Claude Sonnet via a curated system prompt that enforces three rules: (1) cite every factual claim with its source chunk, (2) do not infer beyond the retrieved content, (3) flag knowledge gaps explicitly rather than speculating. The prompt is versioned and tested against a domain-specific evaluation harness. The LLM never receives content from locked tiers, and the system prompt cannot override the ingestion-level tier gate.
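
As a rough illustration of citation enforcement, the sketch below numbers the retrieved chunks into a prompt and then checks a draft answer for sentences that carry no valid source marker. The `[n]` marker format, the rule wording, and both function names are assumptions; the actual prompt text is not given here, and no model call is made.

```python
import re

# Paraphrase of the three rules described above, not the actual prompt wording.
RULES = (
    "Cite every factual claim with its source marker, e.g. [1].",
    "Do not infer beyond the numbered sources.",
    "If the sources do not cover the question, say so explicitly.",
)


def build_prompt(chunks: list[dict]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, start=1)
    )
    rules = "\n".join(f"- {r}" for r in RULES)
    return f"Rules:\n{rules}\n\nSources:\n{sources}"


def uncited_sentences(answer: str, n_sources: int) -> list[str]:
    """Return sentences that lack a marker or cite a source number out of range."""
    flagged = []
    # Assumes markers are placed before the sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        cited = [int(m) for m in re.findall(r"\[(\d+)\]", sentence)]
        if not cited or any(not 1 <= n <= n_sources for n in cited):
            flagged.append(sentence)
    return flagged
```

A check like `uncited_sentences` is the kind of assertion an evaluation harness could run against each prompt version, since it fails loudly when an answer drifts away from its sources.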

05
DELIVERY LAYER

Astro SSR, Clerk auth, Stripe billing, real-time streaming SSE

The frontend is built on Astro with server-side rendering. Authentication is handled by Clerk, supporting email/password, Google OAuth, and enterprise SSO. Billing is Stripe-native with plan gating enforced server-side. Answers are streamed to the client via Server-Sent Events so the first token appears within 400ms. All delivery is HTTPS. There is no client-side tier enforcement: access decisions are made server-side on every request.
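
For readers unfamiliar with Server-Sent Events, the sketch below shows the wire format: each message is a `data:` line (optionally preceded by an `event:` line) terminated by a blank line. The `done` event name and the JSON payload shape are assumptions for illustration, not the platform's actual protocol.

```python
import json
from typing import Iterable, Iterator


def sse_frames(tokens: Iterable[str], sources: list[str]) -> Iterator[str]:
    """Yield one SSE frame per token, then a final 'done' event carrying sources."""
    for token in tokens:
        # JSON-encode the payload so a newline inside a token cannot split the frame.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    final = json.dumps({"sources": sources})
    yield f"event: done\ndata: {final}\n\n"


# Placeholder tokens and source slug, for illustration only.
frames = list(sse_frames(["Dock ", "turn\ntime"], ["dock-scheduling-benchmarks"]))
```

Because each frame is yielded as soon as its token exists, the client can render the first token without waiting for the rest of the answer.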

THE FOUNDATION

Built on practitioners,
not parameters.

The performance of any RAG system is bounded by the quality of what it retrieves. Most AI products paper over low-quality source content by increasing model size. KnowledgeBricks inverts the assumption: the quality floor is the vault, not the model.

Every article in the vault answers a real question a practitioner has asked in the field: not a hypothetical, not a textbook case. When your team queries the platform, the answer quality correlates directly with the experience of the person who wrote the source article.

  • 140+ practitioner articles across logistics, supply chain, and estimating
  • Every article reviewed against real operational benchmarks before publishing
  • Vault maintained continuously: outdated benchmarks are retired, not silently left in
  • Custom portals ingest your internal SOPs through the same pipeline
Query Flow, Live System
1
User query received
Clerk auth verified · tier determined server-side
2
Query embedded
text-embedding-3-large · 3,072 dimensions
3
Vector search (pgvector)
Top-8 chunks · tier-filtered · cosine similarity
4
LLM synthesis (Claude Sonnet)
Citation enforcement · paywall integrity · hallucination safeguards
5
Answer streamed to client
SSE · first token <400ms · sources attached
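
The five steps above can be sketched as a single function with each stage injected as a callable. This is an illustrative outline, not the production code: the parameter names, the callable signatures, and the fallback message for an empty result are all assumptions.

```python
from typing import Callable, Iterable, Iterator


def answer_query(
    query: str,
    user_tier: str,
    embed: Callable[[str], list[float]],
    search: Callable[[list[float], str, int], list[dict]],
    synthesize: Callable[[str, list[dict]], Iterable[str]],
    k: int = 8,
) -> Iterator[str]:
    """Run steps 2-5 for a request whose tier was already resolved server-side (step 1)."""
    query_vec = embed(query)                   # step 2: embed the query
    chunks = search(query_vec, user_tier, k)   # step 3: tier filter lives inside search
    if not chunks:
        # Assumed behavior: surface the gap instead of asking the model to guess.
        yield "No source in your tier covers this question."
        return
    yield from synthesize(query, chunks)       # steps 4-5: tokens stream as produced
```

The ordering is the point of the sketch: `synthesize` only ever sees what `search` returned for the caller's tier, so nothing downstream of retrieval can widen access.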
TECHNICAL DIFFERENTIATORS

Five reasons the architecture
earns the trust it asks for.

Practitioner-authored vault

Every source article is written by a domain expert, not generated, not scraped. The content quality ceiling is set by human expertise, not training data scale.

Citation-enforced answers

The LLM system prompt enforces citation on every factual claim. Answers without retrievable source evidence are not generated; gaps are surfaced explicitly.

Paywall integrity

Locked content is never sent to the LLM. The tier gate is enforced at the ingestion layer, making it structurally impossible for a prompt injection to access paid content without a valid subscription.

Role-based access tiers

Clerk-managed authentication with server-side tier enforcement on every request. Role assignments are propagated through the retrieval layer, not just the UI.

Streaming real-time answers

Server-Sent Events deliver the first token within 400ms. Users see the answer forming in real time: no loading spinners, no waiting for full generation to complete before display.

Ready to see the platform
in action?

A 30-minute demo covers live query performance, the ingestion pipeline for a custom portal, and how paywall integrity holds under adversarial prompting.

Deployment in under 30 days. No systems replaced. SOC 2 Type II certified.