One of AIThreads' core features is the knowledge base — upload documents and your AI agent can semantically search them to find relevant context. Here's how we built it.

The Problem

Traditional keyword search fails when users phrase things differently than your documentation. "How do I get a refund?" should match your "Returns and Exchanges Policy" even if the word "refund" doesn't appear.

Semantic search solves this by comparing meaning rather than exact text.
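
Under the hood, "comparing meaning" is vector math: an embedding model maps each piece of text to a point in high-dimensional space, and nearby points tend to mean similar things. Closeness is measured with cosine similarity, the same measure our query pipeline below computes via pgvector. A minimal sketch:

// Cosine similarity between two embedding vectors.
// Scores near 1 mean the texts are semantically close,
// even if they share no words.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}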

Our Architecture

When you upload a document:

  1. We extract text from PDFs, Word docs, and other formats
  2. We split the content into overlapping chunks (512 tokens each, with a 50-token overlap so context isn't lost at chunk boundaries; see the sketch below)
  3. We generate an embedding for each chunk using OpenAI's text-embedding-3-small
  4. We store the vectors in pgvector alongside the original text
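
Step 2 is the part that gets the most questions. Here's a minimal sketch of an overlapping chunker, assuming the js-tiktoken package (cl100k_base is the encoding text-embedding-3-small uses); constant names like CHUNK_SIZE are ours:

// Split text into 512-token chunks with a 50-token overlap.
import { getEncoding } from 'js-tiktoken';

const CHUNK_SIZE = 512;
const OVERLAP = 50;

function chunkText(text: string): string[] {
  const enc = getEncoding('cl100k_base');
  const tokens = enc.encode(text);
  const chunks: string[] = [];
  // Advance by CHUNK_SIZE - OVERLAP so consecutive chunks
  // share 50 tokens, preserving context across boundaries.
  for (let start = 0; start < tokens.length; start += CHUNK_SIZE - OVERLAP) {
    chunks.push(enc.decode(tokens.slice(start, start + CHUNK_SIZE)));
    if (start + CHUNK_SIZE >= tokens.length) break;
  }
  return chunks;
}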

Why pgvector?

We evaluated dedicated vector databases but chose pgvector for several reasons:

  • Simpler infrastructure — one database for everything
  • ACID transactions — embeddings update atomically with documents
  • Good enough performance for our scale (sub-50ms queries)
  • Native PostgreSQL means we can join with other tables
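
For the curious, the chunk table is unremarkable. This is an illustrative sketch rather than our exact schema; the column names match the query below, and 1536 is the dimension text-embedding-3-small produces by default:

// Illustrative schema; the vector extension must be enabled first.
await db.query(`CREATE EXTENSION IF NOT EXISTS vector`);

await db.query(`
  CREATE TABLE document_chunks (
    id        bigserial PRIMARY KEY,
    inbox_id  bigint NOT NULL,
    content   text NOT NULL,
    embedding vector(1536) NOT NULL
  )
`);

Because the embeddings live in the same database as everything else, a plain JOIN against the rest of the schema just works; there's no syncing between systems.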

The Query Pipeline

// Incoming query
const query = "refund policy";

// Generate the query embedding; the API returns a response
// object, and the vector itself is at response.data[0].embedding
const response = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: query
});
const embedding = response.data[0].embedding;

// Vector similarity search. <=> is pgvector's cosine distance
// operator, so 1 - distance gives cosine similarity. The
// parameter is passed as a vector literal like '[0.1,0.2,...]'.
const results = await db.query(`
  SELECT content, 1 - (embedding <=> $1) AS similarity
  FROM document_chunks
  WHERE inbox_id = $2
  ORDER BY embedding <=> $1
  LIMIT 5
`, [JSON.stringify(embedding), inboxId]);

Performance Optimizations

We use HNSW indexes for approximate nearest neighbor search, trading a small amount of recall for roughly 10x faster queries. For most use cases, the difference is imperceptible.
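
Building the index is a single statement. A sketch, using pgvector's cosine-distance operator class so it matches the <=> queries above (m and ef_construction are pgvector's defaults, written out here for clarity):

// HNSW index for approximate nearest neighbor search.
// vector_cosine_ops matches the <=> operator used in queries.
await db.query(`
  CREATE INDEX ON document_chunks
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64)
`);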

We also cache embeddings for common queries, reducing latency for repeated searches.
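
The cache itself can be simple. Here's a sketch assuming an in-memory Map; a production version might use Redis with a TTL. The embedQuery helper is hypothetical:

// Hypothetical helper: cache query embeddings so repeated
// searches skip the round trip to the embeddings API.
const embeddingCache = new Map<string, number[]>();

async function embedQuery(query: string): Promise<number[]> {
  const key = query.trim().toLowerCase();
  const cached = embeddingCache.get(key);
  if (cached) return cached;

  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query
  });
  const vector = response.data[0].embedding;
  embeddingCache.set(key, vector);
  return vector;
}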