DAVID — PIPELINE INTELLIGENCE

DAVID

Every interaction makes you smarter than your Goliath competitors.

DAVID is a 5-layer AI pipeline stack: Sesh (gateway) → 626 (orchestration) → Stitch (intelligence) → Tinker (fine-tuning) → Observe (observability). Named after Disney's Lilo & Stitch. Every layer is a character. Together, they form a self-improving intelligence engine that compounds with every interaction.

5 pipeline layers
14 models in catalog
6 unified AI providers
50+ prompt templates

What it does

Intelligent model routing

ModelRouter scores 14 models by task type (chat/code/eval), complexity, and cost ceiling. Returns best pick + 3 fallbacks with reasoning. Claude for reasoning, GPT for generation, Qwen for bulk — automatically.
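A rough sketch of how a router like this might work. The catalog entries, scoring weights, and the `route` signature below are assumptions for illustration, not DAVID's actual API:

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    tasks: set            # task types the model handles well: chat/code/eval
    quality: int          # 1-10 capability score
    cost_per_mtok: float  # blended $ per million tokens

# Tiny stand-in for the 14-model catalog.
CATALOG = [
    Model("claude-reasoning", {"chat", "eval"}, 9, 9.0),
    Model("gpt-generation", {"chat", "code"}, 8, 5.0),
    Model("qwen-bulk", {"chat", "code"}, 6, 0.5),
]

def route(task: str, complexity: int, cost_ceiling: float):
    """Return (best, fallbacks) among models under the cost ceiling."""
    def score(m: Model) -> float:
        fit = 2.0 if task in m.tasks else 0.0
        headroom = m.quality - complexity         # penalize under-powered picks
        return fit + headroom - 0.1 * m.cost_per_mtok
    eligible = [m for m in CATALOG if m.cost_per_mtok <= cost_ceiling]
    ranked = sorted(eligible, key=score, reverse=True)
    return ranked[0], ranked[1:4]

best, fallbacks = route("code", complexity=5, cost_ceiling=10.0)
```

Lower the cost ceiling and the same call degrades gracefully to the bulk model; the real router also returns reasoning explaining each pick.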

Prompt registry with versioning

PromptRegistry manages 50+ Handlebars templates. Register, resolve with variables, fork, A/B test. Version control every prompt change. Roll back bad prompts in one call.
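A minimal sketch of the register / resolve / roll-back cycle. The class and method names are illustrative, and the regex substitution here is a stand-in for full Handlebars rendering:

```python
import re

class PromptRegistry:
    """Minimal versioned template store with {{var}} substitution."""

    def __init__(self):
        self._store = {}                  # name -> list of template versions

    def register(self, name, template):
        self._store.setdefault(name, []).append(template)
        return len(self._store[name])     # 1-based version number

    def resolve(self, name, variables, version=None):
        versions = self._store[name]
        template = versions[-1] if version is None else versions[version - 1]
        return re.sub(r"\{\{(\w+)\}\}",
                      lambda m: str(variables[m.group(1)]), template)

    def rollback(self, name):
        self._store[name].pop()           # drop the latest (bad) version
        return len(self._store[name])

reg = PromptRegistry()
reg.register("summarize", "Summarize {{doc}} in one sentence.")
reg.register("summarize", "Summarize {{doc}} briefly.")        # v2
reg.rollback("summarize")                                      # back to v1
prompt = reg.resolve("summarize", {"doc": "the Q3 report"})
```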

Quality-gated generation

QualityGate runs an LLM judge panel scoring accuracy, relevance, completeness, clarity, and safety. SelfRefineLoop auto-retries on failure — feeding refinement prompts back until the quality threshold is met.
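The control flow can be sketched as a generate → judge → refine loop. `generate` and `judge` below are stubs standing in for real LLM calls; the loop itself is the point:

```python
def self_refine(prompt, generate, judge, threshold=0.8, max_retries=3):
    """Retry generation with judge feedback until the score clears threshold."""
    feedback = ""
    for attempt in range(max_retries + 1):
        draft = generate(prompt + feedback)
        score, critique = judge(draft)     # panel score in [0, 1] plus notes
        if score >= threshold:
            return draft, score
        feedback = f"\n\nRefine using this feedback: {critique}"
    return draft, score                    # best effort after retries

# Toy stubs: the "model" improves once it sees feedback.
drafts = iter(["vague answer", "precise answer"])
gen = lambda p: next(drafts)
jud = lambda d: (0.9, "") if d == "precise answer" else (0.4, "be specific")
answer, score = self_refine("Explain X.", gen, jud)
```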

Token-aware context management

ContextManager fits conversations into any model's token budget. Extractive summarization preserves key topics. Truncates tool results proportionally. Never loses critical context.
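A simplified sketch of budget fitting: walk backwards from the newest message and keep what fits. Token counting here is a crude whitespace approximation, and the real ContextManager summarizes old turns rather than dropping them outright:

```python
def count_tokens(text: str) -> int:
    return max(1, len(text.split()))      # crude stand-in for a tokenizer

def fit_context(messages, budget: int):
    """Keep as many recent messages as fit, newest first."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break                          # real system: summarize, don't drop
        kept.append(msg)
        used += cost
    return list(reversed(kept)), used

history = [
    {"role": "user", "content": "long old question about setup details"},
    {"role": "assistant", "content": "long old answer"},
    {"role": "user", "content": "what is the current status"},
]
fitted, used = fit_context(history, budget=8)
```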

Durable workflow orchestration (626)

Temporal-powered workflows that survive server crashes. ChatWorkflow and IntelligenceWorkflow run multi-step agent loops with retries, timeouts, and distributed task queues.
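The real workflows run on Temporal; as a plain-Python sketch of the same step-by-step, retry-with-backoff idea (steps and names below are illustrative — in Temporal, state lives server-side, which is what lets a workflow survive a process crash):

```python
import time

def run_step(step, max_attempts=4, base_delay=0.01):
    """Retry a step with exponential backoff before giving up."""
    for attempt in range(max_attempts):
        try:
            return step()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)   # 10ms, 20ms, 40ms...

def chat_workflow(steps):
    """Run the lifecycle step by step: route → fit → resolve → generate."""
    results = {}
    for name, step in steps:
        results[name] = run_step(step)
    return results

# Toy steps; "generate" fails twice before succeeding.
attempts = {"n": 0}
def flaky_generate():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("provider timeout")
    return "response"

out = chat_workflow([("route", lambda: "gpt"), ("generate", flaky_generate)])
```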

Closed-loop fine-tuning (Tinker)

Every interaction → Langfuse trace → training data → Dagster pipeline → TML fine-tuned model. Cost trajectory: $50/mo → $5 → $3 via progressive self-hosted models.
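The trace-to-training-data step might look like the following. Field names are assumptions for illustration, not the actual Langfuse or Dagster schema:

```python
import json

def traces_to_training_data(traces, min_feedback=4):
    """Keep only well-rated interactions as prompt/completion pairs."""
    rows = []
    for t in traces:
        if t.get("feedback", 0) >= min_feedback:    # 1-5 user rating
            rows.append({"prompt": t["input"], "completion": t["output"]})
    return "\n".join(json.dumps(r) for r in rows)   # JSONL for the trainer

traces = [
    {"input": "Q1", "output": "A1", "feedback": 5},
    {"input": "Q2", "output": "bad", "feedback": 2},
]
jsonl = traces_to_training_data(traces)
```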

Centralized API gateway (Sesh)

SHA-256 hashed API keys. Sliding window rate limiting. Per-key cost tracking with monthly budgets. Multi-provider proxy with automatic failover. SDKs for JS, Python, Go, Rust.
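Two of those mechanisms, sketched in miniature. Storage is in-memory here; a real gateway would back both with a database or Redis:

```python
import hashlib
import time
from collections import deque

def hash_key(api_key: str) -> str:
    """Store only the SHA-256 digest; raw keys never touch disk."""
    return hashlib.sha256(api_key.encode()).hexdigest()

class SlidingWindowLimiter:
    def __init__(self, limit, window_s):
        self.limit, self.window_s = limit, window_s
        self.hits = {}                     # key digest -> deque of timestamps

    def allow(self, key_digest, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(key_digest, deque())
        while q and now - q[0] > self.window_s:    # evict hits outside window
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

digest = hash_key("sk-test")
limiter = SlidingWindowLimiter(limit=2, window_s=60.0)
decisions = [limiter.allow(digest, now=t) for t in (0.0, 1.0, 2.0, 61.5)]
```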

Full observability

Every LLM call traced via Langfuse: prompt version, model, provider, input/output tokens, cost, latency, tool calls, errors. Debug any response back to its source in seconds.
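One way to picture the trace record, modeled on the fields listed above — an assumed shape, not the actual Langfuse schema, with purely illustrative per-token rates:

```python
import time
import uuid

def traced_call(model, provider, prompt_version, prompt, llm_fn):
    """Wrap an LLM call and record everything needed to debug it later."""
    start = time.perf_counter()
    trace = {
        "trace_id": str(uuid.uuid4()),
        "model": model,
        "provider": provider,
        "prompt_version": prompt_version,
        "error": None,
    }
    try:
        output, in_tok, out_tok = llm_fn(prompt)
        # Example rates: $3/M input, $15/M output tokens (illustrative only).
        trace.update(input_tokens=in_tok, output_tokens=out_tok,
                     cost_usd=round(in_tok * 3e-6 + out_tok * 15e-6, 6),
                     output=output)
    except Exception as e:
        trace["error"] = repr(e)
        raise
    finally:
        trace["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
    return output, trace

fake_llm = lambda p: ("hello", 100, 20)    # stub returning (text, in, out)
out, trace = traced_call("claude", "anthropic", "v3", "hi", fake_llm)
```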

How it works

01

Request enters through Sesh

API gateway authenticates, rate-limits, and routes the request. Cost is pre-estimated and budget-checked before any tokens are spent.
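The pre-flight budget check might look like this: estimate cost from prompt length and the model's rate before sending anything. Rates and the chars-per-token heuristic are assumptions for illustration:

```python
RATES = {"claude": 3.0, "gpt": 2.5, "qwen": 0.2}   # $ per million tokens, illustrative

def estimate_cost(prompt, model, max_output_tokens=1024):
    input_tokens = max(1, len(prompt) // 4)         # ~4 chars per token heuristic
    total_tokens = input_tokens + max_output_tokens
    return total_tokens * RATES[model] / 1_000_000

def budget_check(prompt, model, spent_this_month, monthly_budget):
    """Reject the request before any tokens are spent if it would bust the budget."""
    projected = spent_this_month + estimate_cost(prompt, model)
    if projected > monthly_budget:
        raise PermissionError("monthly budget exceeded; request rejected")
    return projected

ok = budget_check("hello " * 100, "qwen", spent_this_month=0.0, monthly_budget=1.0)
```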

02

Stitch selects the optimal model

ModelRouter analyzes the task (chat? code? SQL?), scores 14 models by complexity and cost, and picks the best candidate with 3 fallbacks.

03

626 orchestrates the workflow

Temporal workflows manage the full lifecycle: context fitting → prompt resolution → generation → tool execution → quality evaluation. Durable and retryable.

04

Quality gate validates the response

QualityGate scores the output. If it fails: SelfRefineLoop retries with feedback. Only responses that pass quality thresholds reach the user.

05

Tinker closes the learning loop

Langfuse traces + user feedback → training data. Dagster orchestrates batch fine-tuning via TML. The next response is better because of this one.

Why it's different

Five-layer architecture (Sesh → 626 → Stitch → Tinker → Observe) — not a monolith, each layer independently upgradable

Self-improving: every interaction feeds the fine-tuning pipeline, compounding quality over time

Provider-agnostic: swap between Anthropic, OpenAI, Moonshot, Ollama, or custom fine-tuned models without code changes

Cost trajectory from $50/month to $3/month via progressive self-hosting and fine-tuning

Temporal-based workflows guarantee completion even through crashes and network failures

Named after David vs. Goliath — small teams beat enterprise competitors through compounding intelligence

Ready to try DAVID?

Start Building →

The stack, layer by layer

Named after Lilo & Stitch. Each layer is a character.

Sesh

API Gateway

Centralized API key management, request routing, rate limiting, and cost tracking. Every AI call flows through Sesh — authenticated, metered, and cached.

SHA-256 hashed API keys
Sliding window rate limiting
Per-key monthly cost limits
Multi-provider proxy (Anthropic, OpenAI, Ollama, Moonshot)
Prometheus metrics + webhook delivery
SDKs for JavaScript, Python, Go, Rust

626

Workflow Orchestration

Temporal-powered durable workflows that survive crashes, retries, and network failures. Named after Experiment 626 — raw power, properly contained.

Chat workflow: route → fit context → resolve prompt → generate → evaluate
Intelligence workflow: ReAct agent loop with SQL execution
Configurable timeouts: 30s (fast) / 60s (balanced) / 120s (frontier)
Distributed task queues with heartbeat monitoring
Automatic retry with exponential backoff

Stitch

AI Pipeline Intelligence

The brain. Prompt versioning, model routing, context fitting, quality gating, and observability — unified into one orchestration layer.

PromptRegistry: 50+ versioned templates with Handlebars rendering
ModelRouter: 14-model catalog scored by task + complexity + cost
ContextManager: token-aware fitting with extractive summarization
QualityGate: LLM judge panel scoring accuracy, relevance, completeness
SelfRefineLoop: generate → evaluate → refine → retry
ProviderPool: 6 unified providers with health checking

Tinker

Fine-Tuning Engine

Closed-loop training pipeline. Every interaction → training data → fine-tuned model → better responses. The system that makes DAVID get smarter every day.

Training data from Langfuse traces + user feedback
Dagster orchestration for batch processing
Weights & Biases experiment tracking
TML (Thinking Machines Lab) API for model training
Progressive cost reduction: $50/mo → $5 → $3 via self-hosted fine-tuned models