Legionis: Architecture Plan v3 (Definitive)

Document Version: 3.1
Date: 2026-02-18
Owner: Chief Architect (Architecture Team)
Status: Active (Definitive Architecture Document)
Product: Legionis

Incorporates: OS v3.0.0, Extension Teams v3, Vercel AI SDK decision, Vercel Fluid Compute analysis, System Prompt Architecture deep dive, Team Personalities infrastructure, PLT scope expansion decision, Platform Architecture Deck (4-layer model, defensibility, flywheel)


0. Conceptual Platform Architecture (4-Layer Model)

The platform is organized into four conceptual layers. Each layer is independently valuable; together they form a compounding system. This model governs how we talk about the architecture externally (positioning, sales, investor conversations) and how we reason about defensibility internally.

0.1 The Four Layers

┌─────────────────────────────────────────────────────────────────┐
│  LAYER 4: AGENT LAYER                                           │
│  81 specialists | 10 departments | Modular provisioning         │
│  Cross-department intelligence                                   │
│  Defensibility: MEDIUM (deep SKILL.md, knowledge packs)          │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 3: COMMUNICATION LAYER  ★ KEY DIFFERENTIATOR             │
│  Intelligent routing | Agent-to-agent cascade                    │
│  Cooperation profiles | Invisible presentation                   │
│  Conversation context continuity                                 │
│  "The nervous system of the AI workforce"                        │
│  Defensibility: HIGH (no competitor has this)                    │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 2: CONTEXT LAYER                                         │
│  Decisions & strategic bets | Feedback loop                      │
│  Cross-reference graph | Auto-context injection                  │
│  Stored in USER's cloud | Compounding value                     │
│  Month 1: Useful → Month 3: Valuable → Month 6+: Irreplaceable │
│  Defensibility: VERY HIGH (organic switching cost)               │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 1: COMPUTE LAYER                                         │
│  BYOT (zero markup) | Quality Toggle (Haiku/Sonnet/Opus)        │
│  Managed Tokens (15% prepaid Token Banks per DR-2026-004)        │
│  Defensibility: LOW (commoditized by design)                     │
└───────────┬──────────────────┬──────────────────┬───────────────┘
            │                  │                  │
            ▼                  ▼                  ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │ LLM Providers │  │Cloud Storage │  │Cloud Services│
    │ Anthropic     │  │Google Drive  │  │Stripe, Clerk │
    │ OpenAI        │  │OneDrive      │  │Neon, R2      │
    │ (user's keys) │  │Dropbox       │  │Typesense     │
    └──────────────┘  └──────────────┘  └──────────────┘

0.2 Layer Descriptions

| Layer | Role | Maps To (Implementation) |
| --- | --- | --- |
| Agent | 81 specialists across 10 departments with deep SKILL.md files, knowledge packs, and team personalities. Modular provisioning lets users assemble exactly the team they need. Cross-department intelligence means agents from different teams can be invoked together. | Sections 2 (Agent Runtime), 3 (Prompt Compilation), 8 (Team Personalities), 9 (Knowledge Packs), 14 (Agent Roster) |
| Communication | The nervous system. Intelligent routing sends requests to the right agent. Agent-to-agent cascade enables four interaction patterns (consultation, delegation, review, debate). Cooperation profiles define how agents interact. Invisible presentation means the user sees a unified team, not plumbing. Conversation context continuity ensures agents share the thread. | Sections 4 (Gateway Orchestration), 2.4 (Sub-Agent Spawning), 7 (API Routes) |
| Context | Organizational memory that compounds. Decisions, strategic bets, assumptions, feedback, and learnings are stored in the user's cloud and cross-referenced. Auto-context injection means agents automatically recall relevant history before producing deliverables. | Sections 5 (Context Layer), 10 (Cloud Storage) |
| Compute | The transparent foundation. BYOT means users bring their own API keys with zero markup. Quality Toggle lets users choose the cost/quality tradeoff (Haiku for speed, Sonnet for balance, Opus for depth). Managed Tokens via 15% prepaid Token Banks (DR-2026-004) provide a convenience option. | Sections 2.2 (BYOT Routing), 1.2 (Technology Stack) |

0.3 Three External Connections

The platform connects to three categories of external services:

| Connection | Services | Layer | User Owns? |
| --- | --- | --- | --- |
| LLM Providers | Anthropic, OpenAI (more via Vercel AI SDK) | Compute | Yes (BYOT keys) |
| Cloud Storage | Google Drive (MVP), OneDrive, Dropbox (Growth) | Context | Yes (their cloud account) |
| Cloud Services | Stripe, Clerk, Neon, R2, Typesense, Sentry, PostHog | All layers | No (platform infrastructure) |

0.4 Defensibility Assessment

| Layer | Defensibility | Rationale |
| --- | --- | --- |
| Compute | Low | Commoditized by design. BYOT and cloud storage are transparency features, not moats. We chose trust over lock-in. |
| Agent | Medium | 81 agents with deep SKILL.md files (300-440 lines each), 34 knowledge packs, and team personalities. Significant effort to replicate, but ultimately copyable given time. |
| Communication | High | Intelligent routing, agent-to-agent cascade, cooperation profiles, and invisible presentation. No competitor has built this. "Other platforms give you 10 specialists in 10 separate rooms. Legionis puts them in the same room." |
| Context | Very High | Organizational memory compounds organically through usage. After 3+ months of decisions, bets, feedback, and learnings, switching cost is earned, not engineered. The cross-reference graph creates intelligence that is unique to each customer's org. |

Core thesis: "Intelligence is a commodity. Coordination is the moat."

0.5 Compounding Flywheel

The four layers create a reinforcing cycle:

User starts with one team (Agent Layer)
  → Agents collaborate via Communication Layer
    → Deliverables save to user's cloud (Context Layer)
      → Context accumulates across interactions
        → Trust deepens (they own everything)
          → Pay only what they use via BYOT (Compute Layer)
            → User adds more teams → cycle deepens

Each turn of the flywheel increases the value of all layers. The longer a user stays, the more irreplaceable the Context Layer becomes, and the more valuable the Communication Layer's ability to leverage that context across agents.


1. System Architecture Overview

1.1 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                              CLIENT (Browser)                               │
│  Next.js 14+ App Router │ Tailwind + Radix │ Zustand + TanStack Query      │
│  Tiptap Editor │ EventSource (SSE) │ Meeting Mode Renderer                  │
└────────────────────────────────┬────────────────────────────────────────────┘
                                 │ HTTPS / SSE
                                 ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                        VERCEL (Next.js API Routes)                          │
│                                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ /api/chat     │  │ /api/plt     │  │ /api/gateway │  │ /api/skill   │   │
│  │ maxDur: 60s   │  │ maxDur: 120s │  │ maxDur: 120s │  │ maxDur: 30s  │   │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘   │
│         │                  │                  │                  │           │
│         ▼                  ▼                  ▼                  ▼           │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                     AGENT RUNTIME ENGINE                            │    │
│  │                                                                     │    │
│  │  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │    │
│  │  │ Prompt Compiler   │  │ Tool Registry     │  │ Delegation       │  │    │
│  │  │ (3-layer cached)  │  │ (Cloud-safe)      │  │ Engine           │  │    │
│  │  └──────────────────┘  └──────────────────┘  └──────────────────┘  │    │
│  │                                                                     │    │
│  │  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │    │
│  │  │ Vercel AI SDK     │  │ BYOT Key Router   │  │ Auto-Context     │  │    │
│  │  │ generateText()    │  │ Per-request keys   │  │ Injector         │  │    │
│  │  │ streamText()      │  │ 24+ providers      │  │ (DB-backed)      │  │    │
│  │  └──────────────────┘  └──────────────────┘  └──────────────────┘  │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐       │
│  │ Clerk Auth   │  │ Stripe      │  │ Sentry      │  │ PostHog     │       │
│  │ JWT + Orgs   │  │ Billing     │  │ Errors      │  │ Analytics   │       │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘       │
└────────────────────────────────────┬────────────────────────────────────────┘
                                     │
                 ┌───────────────────┼───────────────────┐
                 │                   │                   │
                 ▼                   ▼                   ▼
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│   Neon PostgreSQL │  │  Cloudflare R2   │  │   Typesense      │
│   (Context layer) │  │  (Prompts/files) │  │   (Full-text)    │
│   RLS per tenant  │  │  Zero egress     │  │   (Search)       │
└──────────────────┘  └──────────────────┘  └──────────────────┘
                 │
                 ▼
┌──────────────────────────────────────────────────────────────────┐
│                   CLOUD STORAGE (User's Data)                    │
│  Google Drive API v3  │  OneDrive Graph API  │  Dropbox API v2  │
│  (MVP)                │  (Growth)            │  (Growth)        │
└──────────────────────────────────────────────────────────────────┘

1.2 Technology Stack Summary

| Layer | Technology | Version | Rationale |
| --- | --- | --- | --- |
| Frontend | Next.js (App Router) | 14+ | SSR, streaming, Vercel-native |
| Styling | Tailwind CSS + Radix UI | 4.x | Utility-first, accessible |
| State | Zustand + TanStack Query | Latest | Minimal client, smart server |
| Editor | Tiptap (ProseMirror) | Latest | Markdown, collaborative-ready |
| Agent SDK | Vercel AI SDK (ai) | 6.x | Pure API, multi-provider, 7.87M/wk |
| Providers | @ai-sdk/anthropic + @ai-sdk/openai | Latest | Claude primary, OpenAI secondary |
| Database | Neon PostgreSQL | 16 | Serverless, branching, scale-to-zero |
| ORM | Drizzle | Latest | Type-safe, minimal abstraction |
| Storage | Cloudflare R2 | N/A | S3-compatible, zero egress |
| Search | Typesense Cloud | Latest | Typo-tolerant, faceted |
| Auth | Clerk | Latest | Pre-built UI, social login, orgs |
| Payments | Stripe Billing | Latest | Usage-based, checkout, portal |
| Hosting | Vercel (Fluid Compute) | Latest | Next.js native, 300s default timeout |
| Monitoring | Sentry + Better Stack | Latest | Errors + logs + uptime |
| Analytics | PostHog | Free tier | Product analytics, feature flags |

1.3 Infrastructure Cost

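| Service | Plan | Monthly Cost |
| --- | --- | --- |
| Vercel Pro | 1 seat | $20 |
| Neon Launch | PostgreSQL | $19 |
| Cloudflare R2 | ~10GB | $0.15 |
| Typesense Starter | 0.5GB RAM | $40 |
| Clerk Pro | 10K MAU free | $25 |
| Sentry Team | Error tracking | $29 |
| PostHog | Free tier | $0 |
| Better Stack | Starter | $29 |
| Total | | ~$162/mo |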

AI costs: $0 (users bring their own API keys)
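As a quick check on the stated total, the line items can be summed directly. This is a minimal sketch: the `lineItems` name and structure are illustrative only, and the figures are copied from the cost table in this section.

```typescript
// Monthly infrastructure line items in USD, copied from the cost table.
const lineItems: Record<string, number> = {
  'Vercel Pro': 20,
  'Neon Launch': 19,
  'Cloudflare R2': 0.15,
  'Typesense Starter': 40,
  'Clerk Pro': 25,
  'Sentry Team': 29,
  'PostHog': 0,
  'Better Stack': 29,
};

// Sum in whole cents to avoid floating-point drift.
const totalCents = Object.values(lineItems)
  .reduce((sum, usd) => sum + Math.round(usd * 100), 0);

const totalUsd = totalCents / 100;
console.log(`Total: $${totalUsd.toFixed(2)}/mo`); // Total: $162.15/mo
```

The exact sum is $162.15, which the table rounds to ~$162/mo.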


2. Agent Runtime Architecture

2.1 Core Runtime: Vercel AI SDK

The agent runtime is built on Vercel AI SDK v6.x. Each agent interaction is a generateText() or streamText() call with custom tools and system prompts.

// lib/agent-runtime.ts
import { generateText, streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';

export async function invokeAgent(params: AgentInvocation) {
  const { agentKey, userMessage, workspaceId, apiKey, provider } = params;

  // 1. Compile system prompt (3-layer cached)
  const systemPrompt = await compilePrompt(agentKey, workspaceId, userMessage);

  // 2. Select model based on user's provider + key
  const model = selectModel(provider, apiKey);

  // 3. Load cloud-safe tools for this agent
  const tools = loadTools(agentKey, workspaceId);

  // 4. Execute agent
  const result = await generateText({
    model,
    system: systemPrompt,
    tools,
    maxSteps: 10,
    prompt: userMessage,
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } },
    },
    onStepFinish({ text, toolCalls, toolResults, usage }) {
      // Track per-step metrics
      trackStepMetrics(params.traceContext, { toolCalls, usage });
    },
  });

  // 5. Post-processing
  await postProcess(result, params);

  return formatResult(result);
}

2.2 Model Selection and BYOT Routing

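// lib/model-router.ts
import { createAnthropic } from '@ai-sdk/anthropic';
import { createOpenAI } from '@ai-sdk/openai';

// Build a per-request provider instance from the user's own key (BYOT),
// then select the model on that instance.
function selectModel(provider: string, apiKey: string) {
  switch (provider) {
    case 'anthropic':
      return createAnthropic({ apiKey })('claude-sonnet-4-5');
    case 'openai':
      return createOpenAI({ apiKey })('gpt-4o');
    default:
      throw new Error(`Unsupported provider: ${provider}`);
  }
}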

Users configure their API keys per provider. At runtime the key is decrypted from the user_api_keys table (envelope encryption with KMS) and passed per request.

2.3 Cloud-Safe Tool Definitions

Six custom tools replace the CLI's local filesystem operations:

// lib/tools/index.ts
import { tool } from 'ai';
import { z } from 'zod';

export function createAgentTools(workspace: Workspace) {
  return {
    readFile: tool({
      description: 'Read a file from the workspace cloud storage',
      parameters: z.object({
        filePath: z.string().describe('Relative path within workspace'),
        fileId: z.string().optional().describe('Cloud storage file ID'),
      }),
      execute: async ({ filePath, fileId }) => {
        const id = fileId || await resolvePathToId(filePath, workspace);
        return await cloudStorage.readFile(id, workspace);
      },
    }),

    writeFile: tool({
      description: 'Write a file to the workspace cloud storage',
      parameters: z.object({
        filePath: z.string(),
        content: z.string(),
      }),
      execute: async ({ filePath, content }) => {
        const fileId = await cloudStorage.upsertFile(filePath, content, workspace);
        // Keep the search index in sync with every write
        await searchIndex.upsertDocument(fileId, content, workspace.id);
        return { fileId, path: filePath };
      },
    }),

    editFile: tool({
      description: 'Make string replacements in a workspace file',
      parameters: z.object({
        filePath: z.string(),
        oldString: z.string(),
        newString: z.string(),
      }),
      execute: async ({ filePath, oldString, newString }) => {
        // readFile expects a storage ID (see the readFile tool), so resolve the path first
        const id = await resolvePathToId(filePath, workspace);
        const content = await cloudStorage.readFile(id, workspace);
        const updated = content.replace(oldString, newString);
        return await cloudStorage.updateFile(filePath, updated, workspace);
      },
    }),

    globFiles: tool({
      description: 'Find files matching a pattern in the workspace',
      parameters: z.object({
        pattern: z.string(),
      }),
      execute: async ({ pattern }) => {
        return await cloudStorage.listFiles(pattern, workspace);
      },
    }),

    grepContent: tool({
      description: 'Search file contents in the workspace',
      parameters: z.object({
        query: z.string(),
        filePattern: z.string().optional(),
      }),
      execute: async ({ query, filePattern }) => {
        return await typesense.search(query, workspace.id, filePattern);
      },
    }),

    spawnSubAgent: tool({
      description: 'Spawn a sub-agent for specialized input',
      parameters: z.object({
        agentId: z.string(),
        task: z.string(),
        delegationPattern: z.enum([
          'consultation', 'delegation', 'review', 'debate'
        ]).optional(),
      }),
      execute: async ({ agentId, task, delegationPattern }, context) => {
        return await spawnSubAgent({
          agentId,
          task,
          pattern: delegationPattern || 'consultation',
          parentContext: context,
          workspace,
        });
      },
    }),
  };
}
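One detail of the editFile tool is worth illustrating. `String.prototype.replace` with a string pattern replaces only the first occurrence, and it interprets sequences such as `$&` in the replacement text. A stricter variant could require exactly one match and insert the new text literally. The sketch below is an assumption about how that might look; `replaceExactlyOnce` and `EditResult` are hypothetical names, not part of the plan.

```typescript
// Hypothetical helper (not part of the plan): literal, unambiguous replacement.
type EditResult =
  | { ok: true; content: string }
  | { ok: false; reason: 'not_found' | 'ambiguous' };

function replaceExactlyOnce(
  content: string,
  oldString: string,
  newString: string,
): EditResult {
  if (oldString.length === 0) return { ok: false, reason: 'not_found' };
  const first = content.indexOf(oldString);
  if (first === -1) return { ok: false, reason: 'not_found' };

  // A second match means the edit is ambiguous; the agent should add context.
  const second = content.indexOf(oldString, first + 1);
  if (second !== -1) return { ok: false, reason: 'ambiguous' };

  // Slice-and-join avoids the special `$` patterns that
  // String.prototype.replace interprets in replacement strings.
  const updated =
    content.slice(0, first) + newString + content.slice(first + oldString.length);
  return { ok: true, content: updated };
}
```

Returning a structured failure instead of silently writing an unchanged file gives the agent something concrete to react to on its next step.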

2.4 Sub-Agent Spawning (Delegation Protocol)

The OS v3 delegation protocol maps directly to the SaaS runtime:

// lib/delegation.ts
async function spawnSubAgent(params: SubAgentParams) {
  const { agentId, task, pattern, parentContext, workspace } = params;

  // Enforce depth limit (max 2 levels)
  if (parentContext.depth >= 2) {
    return { error: 'Max sub-agent depth reached. Provide analysis inline.' };
  }

  // Load sub-agent persona
  const persona = await loadAgentPersona(agentId);

  // Build system prompt with delegation context
  const systemPrompt = await compilePrompt(agentId, workspace.id, task, {
    delegationPattern: pattern,
    parentAgent: parentContext.agentKey,
  });

  // Execute sub-agent
  const result = await generateText({
    model: selectModel(parentContext.provider, parentContext.apiKey),
    system: systemPrompt,
    tools: createAgentTools(workspace),
    maxSteps: 5, // Sub-agents get fewer steps
    prompt: buildDelegationPrompt(pattern, task),
  });

  // Log sub-agent invocation with parent trace
  await logInvocation({
    type: 'agent',
    agentOrSkill: agentId,
    parentSpanId: parentContext.spanId,
    requestId: parentContext.requestId,
    ...result.usage,
  });

  return { response: result.text, agentId };
}

// Consultation passes the task through unchanged; the other patterns get a tag prefix.
function buildDelegationPrompt(pattern: string, task: string): string {
  const prefixes: Record<string, string> = {
    consultation: task,
    delegation: `[DELEGATION] ${task}`,
    review: `[REVIEW] ${task}`,
    debate: `[DEBATE] ${task}`,
  };
  return prefixes[pattern] || task;
}
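The depth check reads `parentContext.depth`, so each spawn has to hand the child a context whose depth is one higher and whose parent span points back at the caller. The following is a minimal sketch of that derivation. The `TraceContext` shape and the `childContext` helper are assumptions inferred from the fields used in this section (depth, spanId, requestId), not a confirmed type.

```typescript
// Assumed shape, inferred from the fields spawnSubAgent reads.
interface TraceContext {
  requestId: string;
  spanId: string;
  parentSpanId?: string;
  depth: number;
}

const MAX_DEPTH = 2; // matches the "max 2 levels" rule

// Derive the context a sub-agent would run under.
// Returns null when the parent is already at the depth limit.
function childContext(parent: TraceContext, newSpanId: string): TraceContext | null {
  if (parent.depth >= MAX_DEPTH) return null;
  return {
    requestId: parent.requestId, // one request ID shared across the whole trace
    spanId: newSpanId,
    parentSpanId: parent.spanId, // links the child to its caller
    depth: parent.depth + 1,
  };
}

const root: TraceContext = { requestId: 'req-1', spanId: 's0', depth: 0 };
const level1 = childContext(root, 's1');
const level2 = level1 && childContext(level1, 's2');
const level3 = level2 && childContext(level2, 's3'); // null: limit reached
```

With a root agent at depth 0, two nested spawns succeed and a third is refused, which is the behavior the depth guard enforces.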


3. System Prompt Compilation Pipeline

3.1 Three-Layer Cached Architecture

The system prompt is structured for maximum cache efficiency with Anthropic's prompt caching:

┌──────────────────────────────────────────────────────────────┐
│  Layer 1: CORE PROTOCOL (~1,500 tokens)                      │
│  - Always cached (cache_control: ephemeral)                  │
│  - Compiled from 10 core rules into single document          │
│  - Identical across all agents                               │
│  - Changes only on OS version updates                        │
├──────────────────────────────────────────────────────────────┤
│  Layer 2: AGENT PERSONA (~500 tokens)                        │
│  - Cached per agent type per session                         │
│  - Extracted from SKILL.md identity sections                 │
│  - Includes team personality injection point                 │
│  - Changes only on agent definition updates                  │
├──────────────────────────────────────────────────────────────┤
│  Layer 3: TASK CONTEXT (~200-500 tokens)                     │
│  - Domain rules (conditional on skill being invoked)         │
│  - Auto-injected context (decisions, feedback, bets)         │
│  - User's org context (company name, product areas)          │
│  - Team personality principles (if configured)               │
│  - Changes per request                                       │
└──────────────────────────────────────────────────────────────┘

Total per call: ~2,200-2,500 tokens (vs. ~19,650 naive)
Cost with caching: $0.71/mo per user (Sonnet) vs. $35/mo naive

3.2 Build Pipeline: compile-prompts.ts

The build pipeline reads canonical SKILL.md files and compiles them for SaaS deployment. This follows the "one source, two build targets" principle: the same SKILL.md files serve both CLI (full content) and SaaS (extracted/compressed).

// scripts/compile-prompts.ts
// Run at build time or on OS version update

import { readFile, writeFile } from 'fs/promises';
import { glob } from 'fast-glob';

interface CompiledPrompt {
  agentKey: string;
  layer1CoreProtocol: string;           // Shared across all agents
  layer2Persona: string;                // Per-agent identity
  domainRules: Record<string, string>;  // Loaded conditionally
  metadata: AgentMetadata;
}

// Step 1: Compile Core Protocol from 10 Tier 1 rules
async function compileCoreProtocol(): Promise<string> {
  const rules = [
    'agent-spawn-protocol.md',     // Response format, identity
    'no-estimates.md',             // No fabricated numbers
    'v2v-flow.md',                 // 6-phase summary
    'context-management.md',       // Save/recall/capture
    'intelligent-routing.md',      // Domain routing
    'delegation-protocol.md',      // 4 delegation patterns
    'principles-enforcement.md',   // 8 operating principles
    'meeting-mode.md',             // Multi-agent presentation
    'parallel-execution.md',       // When to parallelize
    'skill-awareness.md',          // Omitted (covered by persona)
  ];

  // Read and extract essential sections from each rule
  // Compile into a single ~1,500 token document
  const sections = await Promise.all(
    rules.map(r => extractEssentials(`rules/${r}`))
  );

  return `## Agent Operating Protocol\n\n${sections.join('\n\n')}`;
}

// Step 2: Extract agent persona from SKILL.md
async function extractPersona(skillPath: string): Promise<string> {
  const content = await readFile(skillPath, 'utf-8');
  const parsed = parseSkillMd(content);

  // Extract ONLY identity-essential sections (~80-100 lines → ~500 tokens)
  return [
    `# ${parsed.emoji} ${parsed.displayName}`,
    '',
    '## Identity',
    parsed.coreAccountability,
    '',
    '## How I Think',
    parsed.howIThink,        // 3-5 bullets, unique per agent
    '',
    '## RACI',
    parsed.raci,             // A/R/C items
    '',
    '## Key Deliverables',
    parsed.deliverables,     // Table of 4-5 items
    '',
    '## Collaboration',
    parsed.collaboration,    // 3-4 key relationships
    '',
    '## Primary Skills',
    parsed.skills,           // 5-7 skills with when-to-use
    '',
    '## V2V Phase',
    parsed.primaryPhases,
  ].join('\n');
}

// Step 3: Compile domain rules (Tier 2)
// The second argument to condense() is the token target for that rule file.
async function compileDomainRules(): Promise<Record<string, string>> {
  return {
    decisions: await condense('rules/decision-system.md', 300),
    strategy: await condense('rules/strategy-documents.md', 300),
    roadmaps: await condense('rules/roadmaps.md', 300),
    gtm: await condense('rules/gtm-documents.md', 300),
    requirements: await condense('rules/requirements.md', 300),
    context: await condense('rules/auto-context.md', 200)
      + '\n' + await condense('rules/context-graph.md', 200),
  };
}

// Step 4: Main compilation
async function main() {
  const coreProtocol = await compileCoreProtocol();

  // All OS agents
  const osAgentPaths = await glob('skills/*/SKILL.md');
  const osPersonas = await Promise.all(
    osAgentPaths.map(async path => ({
      key: extractAgentKey(path),
      persona: await extractPersona(path),
      metadata: await extractMetadata(path),
    }))
  );

  // All Extension Team agents
  const extAgentPaths = await glob('Extension Teams/*/SKILL.md');
  const extPersonas = await Promise.all(
    extAgentPaths.map(async path => ({
      key: extractAgentKey(path),
      persona: await extractPersona(path),
      metadata: await extractMetadata(path),
    }))
  );

  const domainRules = await compileDomainRules();

  // Output compiled prompts for SaaS deployment
  const output: CompiledPrompts = {
    coreProtocol,
    agents: [...osPersonas, ...extPersonas],
    domainRules,
    version: getOsVersion(),
    compiledAt: new Date().toISOString(),
  };

  // Write to R2-deployable format
  await writeFile('compiled/prompts.json', JSON.stringify(output, null, 2));

  // Also write SQL migration for prompt_templates table
  await generatePromptMigration(output);

  console.log(`Compiled ${output.agents.length} agents`);
  console.log(`Core protocol: ${countTokens(coreProtocol)} tokens`);
  console.log(`Domain rules: ${Object.keys(domainRules).length} domains`);
}

3.3 Runtime Prompt Assembly

// lib/prompt-compiler.ts
import { getCompiledPrompts } from './compiled-prompts';

const promptCache = new Map();

export async function compilePrompt(
  agentKey: string,
  workspaceId: string,
  userMessage: string,
  options?: {
    delegationPattern?: string;
    parentAgent?: string;
    teamPersonality?: TeamPersonality;
  }
): Promise<SystemMessage[]> {
  const compiled = getCompiledPrompts();
  const messages: SystemMessage[] = [];

  // Layer 1: Core Protocol (always cached, identical across agents)
  messages.push({
    role: 'system',
    content: compiled.coreProtocol,
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } },
    },
  });

  // Layer 2: Agent Persona (cached per agent type)
  const agent = compiled.agents.find(a => a.key === agentKey);
  if (!agent) throw new Error(`Unknown agent: ${agentKey}`);

  let personaContent = agent.persona;

  // Inject team personality if configured (see Section 8)
  if (options?.teamPersonality) {
    personaContent += `\n\n## Team Operating Principles\n${options.teamPersonality.principles}`;
  }

  messages.push({
    role: 'system',
    content: personaContent,
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } },
    },
  });

  // Layer 3: Task Context (per-request, partially cached)
  const taskContext = await buildTaskContext(
    agentKey, workspaceId, userMessage, options
  );

  if (taskContext) {
    messages.push({
      role: 'system',
      content: taskContext,
      providerOptions: {
        anthropic: { cacheControl: { type: 'ephemeral' } },
      },
    });
  }

  return messages;
}

async function buildTaskContext(
  agentKey: string,
  workspaceId: string,
  userMessage: string,
  options?: any
): Promise<string | null> {
  const parts: string[] = [];

  // Domain rules (based on detected skill)
  const skill = detectSkillFromMessage(userMessage);
  if (skill) {
    const domain = skillToDomain(skill);
    const compiled = getCompiledPrompts();
    if (compiled.domainRules[domain]) {
      parts.push(compiled.domainRules[domain]);
    }
  }

  // Auto-context injection (from database)
  const topics = extractTopics(userMessage);
  if (topics.length > 0) {
    const context = await queryAutoContext(workspaceId, topics);
    if (context) parts.push(context);
  }

  // Delegation context
  if (options?.delegationPattern) {
    parts.push(`Delegation: [${options.delegationPattern.toUpperCase()}] from ${options.parentAgent}`);
  }

  return parts.length > 0 ? parts.join('\n\n---\n\n') : null;
}
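The plan does not define `detectSkillFromMessage` or `skillToDomain`. One plausible reading, sketched below, is that a skill is invoked as a leading slash command (as with /context-save in the CLI) and looked up in a static table. The regex and the mapping entries are assumptions for illustration; only the domain keys come from compileDomainRules.

```typescript
// Hypothetical mapping; only the 'context' domain key comes from the plan.
const SKILL_TO_DOMAIN: Record<string, string> = {
  'context-save': 'context',
  'context-recall': 'context',
};

// Assumption: a skill is invoked as a leading slash command,
// for example "/context-recall pricing".
function detectSkillFromMessage(message: string): string | null {
  const match = /^\/([a-z][a-z0-9-]*)\b/.exec(message.trim());
  return match?.[1] ?? null;
}

// Unknown skills map to null, so no domain rules are loaded for them.
function skillToDomain(skill: string): string | null {
  return SKILL_TO_DOMAIN[skill] ?? null;
}
```

Under this reading, a message without a slash command loads no domain rules at all, which keeps Layer 3 at the low end of its token range.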

3.4 Token Budget Enforcement

| Component | Target (tokens) | Hard Limit (tokens) | Notes |
| --- | --- | --- | --- |
| Core Protocol (L1) | 1,500 | 2,000 | Must exceed 1,024 for Sonnet caching |
| Agent Persona (L2) | 500 | 700 | Identity-essential content only |
| Domain Rules (L3) | 300 | 500 | Loaded conditionally per skill |
| Auto-Context (L3) | 200 | 500 | Max 5 context items |
| Team Personality (L3) | 100 | 200 | Principles injection |
| Total System Prompt | 2,500 | 3,700 | Per API call |
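The budget could be enforced at build time with a check along these lines. This is a sketch under stated assumptions: `checkBudget` and `estimateTokens` are hypothetical names, and the 4-characters-per-token estimate is a rough heuristic standing in for a real tokenizer.

```typescript
// Targets and hard limits in tokens, copied from the budget table.
const BUDGET = {
  coreProtocol: { target: 1500, hardLimit: 2000 },
  agentPersona: { target: 500, hardLimit: 700 },
  domainRules: { target: 300, hardLimit: 500 },
  autoContext: { target: 200, hardLimit: 500 },
  teamPersonality: { target: 100, hardLimit: 200 },
} as const;

type Component = keyof typeof BUDGET;

// Rough heuristic (assumption): about 4 characters per token for English text.
// A real build would use the provider's tokenizer instead.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function checkBudget(
  component: Component,
  text: string,
): 'ok' | 'over_target' | 'over_limit' {
  const tokens = estimateTokens(text);
  const { target, hardLimit } = BUDGET[component];
  if (tokens > hardLimit) return 'over_limit';
  if (tokens > target) return 'over_target';
  return 'ok';
}
```

A build script might treat `over_target` as a warning and `over_limit` as a failure, so that a single oversized persona cannot silently inflate every call.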

3.5 Cost Impact (Anthropic Prompt Caching)

| Scenario | Sonnet Monthly | vs. Naive |
| --- | --- | --- |
| Full uncompressed, no caching | $35.37 | Baseline |
| Compressed, no caching | $4.50 | -87% |
| Compressed + cached | $0.71 | -98% |
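The percentage column follows directly from the monthly figures. A short worked check, using only the numbers in this section (the `reductionPct` helper name is illustrative):

```typescript
const naive = 35.37;     // full uncompressed, no caching
const compressed = 4.5;  // compressed, no caching
const cached = 0.71;     // compressed + cached

// Reduction relative to the naive baseline, rounded to a whole percent.
function reductionPct(cost: number, baseline: number): number {
  return Math.round((1 - cost / baseline) * 100);
}

console.log(reductionPct(compressed, naive)); // 87
console.log(reductionPct(cached, naive));     // 98
```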


4. Gateway Orchestration (PLT Meeting Mode)

4.1 Architecture

Gateways coordinate multi-agent sessions. The PLT (Product Leadership Team) is the most complex, spawning 3-4 agents in parallel and synthesizing their perspectives.

User Request: "@plt Should we delay launch for SSO?"
                    │
                    ▼
┌──────────────────────────────────────────────────────────────┐
│                    PLT GATEWAY HANDLER                        │
│                    maxDuration: 120s                          │
│                                                              │
│  1. Assess complexity → FULL PLT                             │
│  2. Select agents: vp-product, dir-pm, dir-pmm, prod-ops    │
│  3. Auto-context: query "launch", "SSO" from context DB     │
│                                                              │
│  4. PARALLEL EXECUTION (Promise.all)                         │
│     ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐    │
│     │ VP Prod   │ │ Dir PM   │ │ Dir PMM  │ │ ProdOps  │    │
│     │ maxSteps:5│ │ maxSteps:5│ │ maxSteps:5│ │ maxSteps:5│  │
│     │ ~15-25s   │ │ ~15-25s  │ │ ~15-25s  │ │ ~15-25s  │    │
│     └──────────┘ └──────────┘ └──────────┘ └──────────┘    │
│              Wall clock: max(agent times) ≈ 20-30s           │
│                                                              │
│  5. FORMAT: Meeting Mode (each agent speaks in first person) │
│  6. SYNTHESIZE: VP Product summarizes agreement/tension      │
│  7. LOG: agent_invocations table + ROI tracking              │
└──────────────────────────────────────────────────────────────┘

4.2 Implementation

// app/api/plt/route.ts
export const maxDuration = 120; // 2 minutes, generous for PLT

export async function POST(request: Request) {
  const { topic, workspaceId } = await request.json();
  const { apiKey, provider } = await getUserApiKey(request);
  const model = selectModel(provider, apiKey);
  // Assumed helper: loads the Workspace record that createAgentTools expects
  const workspace = await getWorkspace(workspaceId);

  // Assess complexity and select agents
  const agentIds = assessPLTComplexity(topic);
  // e.g., ['vp-product', 'director-product-management',
  //        'director-product-marketing', 'product-operations']

  // Auto-context injection
  const autoContext = await queryAutoContext(workspaceId, extractTopics(topic));

  // Phase 1: Parallel agent execution
  const agentResults = await Promise.all(
    agentIds.map(async agentId => {
      // compilePrompt is async, so it must be awaited before use
      const systemPrompt = await compilePrompt(agentId, workspaceId, topic);
      return generateText({
        model,
        system: systemPrompt,
        tools: createAgentTools(workspace),
        maxSteps: 5,
        prompt: `${autoContext ? `## Auto-Context\n${autoContext}\n\n` : ''}${topic}`,
      });
    })
  );

  // Phase 2: Format Meeting Mode
  const meetingMode = formatMeetingMode(agentIds, agentResults);

  // Phase 3: Synthesis (optional; can also stream)
  const synthesizer = agentIds[0]; // VP Product typically synthesizes
  const synthesis = await generateText({
    model,
    system: `You are ${getAgentIdentity(synthesizer).displayName}. Synthesize the PLT discussion.`,
    prompt: `Synthesize these perspectives:\n\n${meetingMode}`,
    maxSteps: 1,
  });

  // Post-processing
  await logGatewaySession({
    type: 'gateway',
    gateway: 'plt',
    agentsSpawned: agentIds,
    requestId: createRequestId(),
    workspaceId,
  });

  return Response.json({
    meetingMode,
    synthesis: synthesis.text,
    roi: calculatePLTRoi(agentResults),
  });
}
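`Promise.all` rejects as soon as any one agent call fails, which discards the perspectives that did complete. One alternative is `Promise.allSettled`, continuing with whichever agents succeeded. The sketch below shows only the sorting of results; `partitionSettled` and `AgentOutcome` are hypothetical names, and whether a partial PLT session is acceptable is a product decision the plan does not address.

```typescript
interface AgentOutcome {
  succeeded: { agentId: string; text: string }[];
  failed: { agentId: string; reason: string }[];
}

// Split Promise.allSettled results into successes and failures,
// keeping each result paired with the agent that produced it.
function partitionSettled(
  agentIds: string[],
  results: PromiseSettledResult<{ text: string }>[],
): AgentOutcome {
  const outcome: AgentOutcome = { succeeded: [], failed: [] };
  results.forEach((result, i) => {
    const agentId = agentIds[i] ?? 'unknown';
    if (result.status === 'fulfilled') {
      outcome.succeeded.push({ agentId, text: result.value.text });
    } else {
      outcome.failed.push({ agentId, reason: String(result.reason) });
    }
  });
  return outcome;
}
```

In the route handler this would be fed by `await Promise.allSettled(agentIds.map(...))`, with the failed list surfaced in the response so the user can see which perspective is missing.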

4.3 Vercel Constraints (Resolved)

| Concern | Resolution |
| --- | --- |
| Timeout | Fluid Compute: 300s default, 800s max. PLT targets <60s p95. 5x headroom. |
| Memory | 4GB / 2 vCPU on Pro. Adequate for 4 parallel agents. |
| Cost | ~$0.00035/session (Active CPU billing pauses during I/O waits). |
| Payload | 4.5MB limit is client→Vercel only. LLM calls are server-side. |
| Rate limits | Per-user BYOT keys. Max 1 PLT session at a time per user. |
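The "max 1 PLT session at a time per user" rule needs some form of per-user slot. The sketch below shows the acquire/release logic with an in-memory set. That storage choice is an assumption for illustration only: serverless instances do not share memory, so a real deployment would need a shared store. The function names are hypothetical.

```typescript
// In-memory sketch only. Serverless instances do not share memory, so a
// production version would keep this state in a shared store instead.
const activeSessions = new Set<string>();

// Returns true if the caller now holds the user's single PLT slot.
function tryAcquirePltSlot(userId: string): boolean {
  if (activeSessions.has(userId)) return false;
  activeSessions.add(userId);
  return true;
}

// Call from a finally block so a failed session does not hold the slot.
function releasePltSlot(userId: string): void {
  activeSessions.delete(userId);
}
```

A handler would reject the request (for example with HTTP 429) when `tryAcquirePltSlot` returns false, and release the slot once the session completes or fails.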

4.4 maxDuration Configuration

// Per-route timeout configuration
// app/api/chat/route.ts
export const maxDuration = 60;      // Single agent: 1 minute

// app/api/plt/route.ts
export const maxDuration = 120;     // PLT sessions: 2 minutes

// app/api/gateway/[gateway]/route.ts
export const maxDuration = 120;     // All gateways: 2 minutes

// app/api/skill/[skill]/route.ts
export const maxDuration = 30;      // Skills: 30 seconds

4.5 maxSteps Configuration

| Context | maxSteps | Rationale |
| --- | --- | --- |
| Single agent (standalone) | 10 | Full tool loop capability |
| PLT-spawned agents | 5 | Bounds total PLT time |
| Skill execution | 1 | Skills are single-step |
| Sub-agent (delegation) | 5 | Focused tasks |
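These limits could live in one lookup so that routes do not hard-code them, and the same lookup gives a worst-case count of model calls for a PLT session. This is a sketch; the type and function names are illustrative, and the worst case ignores any sub-agents that PLT agents might spawn.

```typescript
type InvocationContext = 'standalone' | 'plt' | 'skill' | 'subAgent';

const MAX_STEPS: Record<InvocationContext, number> = {
  standalone: 10, // full tool loop
  plt: 5,         // bounds total PLT time
  skill: 1,       // skills are single-step
  subAgent: 5,    // focused tasks
};

function maxStepsFor(context: InvocationContext): number {
  return MAX_STEPS[context];
}

// Upper bound on model calls in one PLT session: every agent uses all of its
// steps, plus one single-step synthesis call.
function worstCasePltCalls(agentCount: number): number {
  return agentCount * MAX_STEPS.plt + 1;
}

console.log(worstCasePltCalls(4)); // 21
```

Because the agents run in parallel, the 21 calls for a full four-agent session overlap in time; the bound matters more for the user's token spend than for wall-clock duration.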


5. Context Layer Implementation

5.1 CLI-to-Cloud Mapping

The context layer migrates from flat files to PostgreSQL while preserving the same semantics:

| CLI Operation | CLI Implementation | Cloud Implementation |
| --- | --- | --- |
| /context-save | Parse + update index.md + write file + update index.json | INSERT into decisions/bets/learnings + INSERT cross_references |
| /context-recall | Read index.json + filter topics | SELECT with GIN index on topics[] + Typesense |
| /portfolio-status | Read active-bets.md | SELECT portfolio_state JOIN strategic_bets |
| /feedback-capture | Write to context/feedback/ | INSERT into feedback + auto-theme matching |
| Auto-registration | Write to documents/index.md | INSERT into documents table |
| Cross-references | Update crossReferences in index.json | INSERT into cross_references table |

5.2 Auto-Context Injection (Database-Backed)

Before agents produce deliverables, relevant context is automatically injected:

// lib/auto-context.ts
export async function queryAutoContext(
  workspaceId: string,
  topics: string[]
): Promise<string | null> {
  if (topics.length === 0) return null;

  // Query decisions, bets, and feedback in parallel
  const [decisions, bets, feedback] = await Promise.all([
    db.select()
      .from(schema.decisions)
      .where(and(
        eq(schema.decisions.workspaceId, workspaceId),
        arrayOverlaps(schema.decisions.topics, topics),
        isNull(schema.decisions.archivedAt),
      ))
      .orderBy(desc(schema.decisions.createdAt))
      .limit(5),

    db.select()
      .from(schema.strategicBets)
      .where(and(
        eq(schema.strategicBets.workspaceId, workspaceId),
        arrayOverlaps(schema.strategicBets.topics, topics),
        eq(schema.strategicBets.status, 'active'),
      ))
      .limit(3),

    db.select()
      .from(schema.feedback)
      .where(and(
        eq(schema.feedback.workspaceId, workspaceId),
        arrayOverlaps(schema.feedback.topics, topics),
      ))
      .orderBy(desc(schema.feedback.createdAt))
      .limit(5),
  ]);

  // Nothing relevant found: skip injection entirely
  if (!decisions.length && !bets.length && !feedback.length) {
    return null;
  }

  return formatAutoContext({ decisions, bets, feedback });
}
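To make the filter semantics concrete, here is an in-memory equivalent of the decisions query: same workspace, at least one shared topic, not archived, newest first, capped at five. It is a sketch for reasoning and unit tests, not a replacement for the database query; `DecisionRow` lists only the columns the query touches, and `matchDecisions` is a hypothetical name.

```typescript
// Only the columns the decisions query filters or sorts on.
interface DecisionRow {
  id: string;
  workspaceId: string;
  topics: string[];
  createdAt: Date;
  archivedAt: Date | null;
}

function matchDecisions(
  rows: DecisionRow[],
  workspaceId: string,
  topics: string[],
  limit = 5,
): DecisionRow[] {
  const wanted = new Set(topics);
  return rows
    .filter(r => r.workspaceId === workspaceId)     // tenant scoping
    .filter(r => r.archivedAt === null)             // isNull(archivedAt)
    .filter(r => r.topics.some(t => wanted.has(t))) // array overlap
    .sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime())
    .slice(0, limit);
}
```

Note that overlap means any shared topic is enough, so a broad topic list in the user's message widens the result set rather than narrowing it.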

5.3 Cross-Reference Graph (Relational)

The CLI's JSON-based cross-references become a proper relational graph:

-- Find all connected items for a decision (bidirectional, 1 hop)
SELECT
  CASE
    WHEN cr.source_type = 'decision' AND cr.source_id = $1 THEN cr.target_type
    ELSE cr.source_type
  END as related_type,
  CASE
    WHEN cr.source_type = 'decision' AND cr.source_id = $1 THEN cr.target_id
    ELSE cr.source_id
  END as related_id,
  cr.relationship
FROM cross_references cr
WHERE cr.workspace_id = $2
  AND (
    (cr.source_type = 'decision' AND cr.source_id = $1)
    OR (cr.target_type = 'decision' AND cr.target_id = $1)
  );
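The same "other end of the edge" logic can be expressed in application code, for example when post-processing rows already loaded into memory. The sketch below mirrors the SQL `CASE` expressions; the `CrossReference` shape and the `relatedItems` name are illustrative assumptions, not part of the documented schema.

```typescript
// Mirrors the cross_references columns used in the SQL above (camelCased).
interface CrossReference {
  sourceType: string;
  sourceId: string;
  targetType: string;
  targetId: string;
  relationship: string;
}

interface RelatedItem {
  relatedType: string;
  relatedId: string;
  relationship: string;
}

// Return the opposite endpoint of every edge that touches the given item (1 hop).
function relatedItems(edges: CrossReference[], type: string, id: string): RelatedItem[] {
  const related: RelatedItem[] = [];
  for (const e of edges) {
    if (e.sourceType === type && e.sourceId === id) {
      // Item is the source: the related item is the target
      related.push({ relatedType: e.targetType, relatedId: e.targetId, relationship: e.relationship });
    } else if (e.targetType === type && e.targetId === id) {
      // Item is the target: the related item is the source
      related.push({ relatedType: e.sourceType, relatedId: e.sourceId, relationship: e.relationship });
    }
  }
  return related;
}
```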

5.4 Interaction Logging

Agent invocations are logged to the agent_invocations table with distributed tracing:

// lib/interaction-logger.ts
export async function logInvocation(params: InvocationLog) {
  await db.insert(schema.agentInvocations).values({
    workspaceId: params.workspaceId,
    userId: params.userId,
    conversationId: params.conversationId,
    invocationType: params.type,
    agentOrSkill: params.agentOrSkill,
    requestSummary: params.requestSummary,
    status: params.status,
    requestId: params.requestId,
    spanId: params.spanId,
    parentSpanId: params.parentSpanId,
    modelUsed: params.modelUsed,
    tokensIn: params.tokensIn,
    tokensOut: params.tokensOut,
    durationMs: params.durationMs,
    toolsUsed: params.toolsUsed,
    agentsSpawned: params.agentsSpawned,
    complexity: params.complexity,
    roiMinutesSaved: params.roiMinutesSaved,
    filesCreated: params.filesCreated,
    contextEntriesCreated: params.contextEntriesCreated,
  });
}


6. Data Model Updates (v3 Additions)

6.1 Team Personalities Table (NEW)

-- Team personality definitions
CREATE TABLE team_personalities (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  slug            VARCHAR(50) NOT NULL UNIQUE,    -- 'v2v-operators', 'user-obsessed', etc.
  team            VARCHAR(50) NOT NULL,           -- 'product', 'design', 'architecture', 'marketing'
  name            VARCHAR(255) NOT NULL,          -- "Vision to Value Operators"
  personality_tag VARCHAR(100) NOT NULL,          -- 2-3 word descriptor
  philosophy      TEXT NOT NULL,                  -- 1-2 paragraph worldview
  principles      JSONB NOT NULL DEFAULT '[]',    -- Array of {id, name, statement, enforcement}
  version         VARCHAR(20) NOT NULL DEFAULT '1.0.0',
  is_default      BOOLEAN NOT NULL DEFAULT false, -- Default personality for the team
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Default personalities for each team
INSERT INTO team_personalities (slug, team, name, personality_tag, philosophy, principles, is_default) VALUES
  ('v2v-operators', 'product', 'V2V Operating System', 'Vision to Value Operators',
   'Product organizations exist to convert strategic vision into measurable customer value...',
   '[{"id":"P1","name":"End-to-End Ownership","statement":"..."},...]', true),
  ('user-obsessed', 'design', 'Design Operating Principles', 'User-Obsessed Craftspeople',
   'Design exists to make the complex simple and the simple delightful...',
   '[{"id":"D1","name":"User-First Always","statement":"..."},...]', true),
  ('pragmatic-thinkers', 'architecture', 'Architecture Operating Principles', 'Pragmatic System Thinkers',
   'Architecture exists to enable business capability through technology...',
   '[{"id":"A1","name":"Simplicity Over Cleverness","statement":"..."},...]', true),
  ('data-storytellers', 'marketing', 'Marketing Operating Principles', 'Data-Driven Storytellers',
   'Marketing exists to connect product value with customer need...',
   '[{"id":"M1","name":"Customer Truth","statement":"..."},...]', true);

6.2 Workspace Personality Configuration (NEW)

-- Workspace-level personality overrides
CREATE TABLE workspace_personalities (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id    UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
  team            VARCHAR(50) NOT NULL,         -- 'product', 'design', etc.
  personality_id  UUID NOT NULL REFERENCES team_personalities(id),
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),

  UNIQUE(workspace_id, team)
);

-- RLS
ALTER TABLE workspace_personalities ENABLE ROW LEVEL SECURITY;
CREATE POLICY workspace_isolation ON workspace_personalities
  USING (workspace_id = current_setting('app.current_workspace_id')::uuid);
ALTER TABLE workspace_personalities FORCE ROW LEVEL SECURITY;
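The policy reads `app.current_workspace_id` through `current_setting`, so every request has to set that value before it queries workspace-scoped tables. A sketch of building that statement is shown below. The helper name `buildWorkspaceContextQuery` is hypothetical; `set_config(name, value, is_local)` is a standard PostgreSQL function, and passing `true` for `is_local` scopes the value to the current transaction, which means it has to run in the same transaction as the queries it protects.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Hypothetical helper: validates the workspace ID and returns a parameterized
// statement that sets the RLS context for the current transaction.
function buildWorkspaceContextQuery(workspaceId: string): ParameterizedQuery {
  if (!UUID_RE.test(workspaceId)) {
    // Reject anything that is not a UUID before it reaches the database
    throw new Error("workspaceId must be a UUID");
  }
  return {
    text: "SELECT set_config('app.current_workspace_id', $1, true)",
    values: [workspaceId],
  };
}
```

Validating the header value up front matters because the policy casts the setting to `uuid`; a malformed value would otherwise surface as a cast error on every subsequent query.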

6.3 Knowledge Packs Table (Updated)

-- Knowledge packs with team attribution
CREATE TABLE knowledge_packs (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  slug            VARCHAR(50) NOT NULL UNIQUE,
  name            VARCHAR(255) NOT NULL,
  description     TEXT,
  team            VARCHAR(50),                  -- 'product', 'design', 'architecture', 'marketing', NULL=cross-team
  content         TEXT NOT NULL,                -- Full markdown content
  primary_agents  TEXT[] NOT NULL DEFAULT '{}',
  version         VARCHAR(20) NOT NULL DEFAULT '1.0.0',
  token_count     INTEGER,                      -- Pre-computed for budget enforcement
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Seed with 22 knowledge packs (9 OS + 13 Extension Teams)
-- OS packs: prioritization, pricing-frameworks, discovery-methods,
--   metrics-frameworks, competitive-frameworks, gtm-playbooks,
--   stakeholder-management, user-research, financial-modeling
-- Design packs: design-systems, user-research-methods, accessibility, interaction-patterns
-- Architecture packs: api-design, data-architecture, security-patterns, cloud-native
-- Marketing packs: content-strategy, seo-frameworks, analytics-methodology,
--   brand-management, campaign-optimization

6.4 Agent Registry Table (NEW)

-- Unified registry of all 39 agents + 5 gateways
CREATE TABLE agent_registry (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  agent_key       VARCHAR(100) NOT NULL UNIQUE,   -- 'product-manager', 'ui-designer', etc.
  emoji           VARCHAR(10) NOT NULL,            -- Agent emoji
  display_name    VARCHAR(255) NOT NULL,           -- "Product Manager"
  short_name      VARCHAR(50) NOT NULL,            -- "PM"
  team            VARCHAR(50) NOT NULL,            -- 'product', 'design', 'architecture', 'marketing'
  agent_type      VARCHAR(20) NOT NULL             -- 'agent', 'gateway'
                    CHECK (agent_type IN ('agent', 'gateway')),
  persona_template_id UUID REFERENCES prompt_templates(id),
  knowledge_packs TEXT[] NOT NULL DEFAULT '{}',     -- Slugs of applicable knowledge packs
  primary_skills  TEXT[] NOT NULL DEFAULT '{}',     -- Skills this agent primarily uses
  domain_routing  TEXT[] NOT NULL DEFAULT '{}',     -- Keywords for auto-routing
  is_active       BOOLEAN NOT NULL DEFAULT true,
  tier_required   VARCHAR(20) NOT NULL DEFAULT 'trial'  -- 'trial', 'individual', 'team', 'enterprise'
                    CHECK (tier_required IN ('trial', 'individual', 'team', 'enterprise')),
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Indexes for routing
CREATE INDEX idx_agent_team ON agent_registry(team);
CREATE INDEX idx_agent_domain ON agent_registry USING GIN(domain_routing);
CREATE INDEX idx_agent_type ON agent_registry(agent_type);
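The `domain_routing` column holds keywords for auto-routing, but the matching rule is not spelled out here. One simple possibility, sketched below under that assumption, is to score each active agent by how many of its keywords appear in the user message and pick the highest score. The `routeByKeywords` name and the single-word tokenization are illustrative choices, not the documented routing algorithm.

```typescript
interface RegistryEntry {
  agentKey: string;
  domainRouting: string[];
  isActive: boolean;
}

// Pick the active agent whose routing keywords overlap most with the message.
// Returns null when no keyword matches, so the caller can fall back to a gateway.
function routeByKeywords(message: string, registry: RegistryEntry[]): string | null {
  const tokens = new Set(message.toLowerCase().match(/[a-z0-9-]+/g) ?? []);
  let best: { agentKey: string; score: number } | null = null;
  for (const entry of registry) {
    if (!entry.isActive) continue;
    const score = entry.domainRouting.filter((kw) => tokens.has(kw.toLowerCase())).length;
    // Strictly greater: ties keep the earlier registry entry
    if (score > 0 && (best === null || score > best.score)) {
      best = { agentKey: entry.agentKey, score };
    }
  }
  return best ? best.agentKey : null;
}
```

In the database, the equivalent candidate lookup would use the array overlap operator against the GIN index defined above.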

6.5 Interaction History Extension (Updated for v3)

The agent_invocations table from the original data model already handles interaction logging. The v3 addition is the delegation pattern tracking:

-- Add delegation tracking to agent_invocations
ALTER TABLE agent_invocations
  ADD COLUMN delegation_pattern VARCHAR(20)
    CHECK (delegation_pattern IN ('consultation', 'delegation', 'review', 'debate')),
  ADD COLUMN delegation_from VARCHAR(100),  -- Parent agent key
  ADD COLUMN delegation_deliverable TEXT;   -- What was delegated


7. API Route Architecture

7.1 Route Map

app/
├── api/
│   ├── chat/
│   │   └── route.ts              maxDuration: 60   Single agent conversation
│   ├── plt/
│   │   └── route.ts              maxDuration: 120  PLT Meeting Mode
│   ├── gateway/
│   │   └── [gateway]/
│   │       └── route.ts          maxDuration: 120  Any gateway (@product, @design, etc.)
│   ├── skill/
│   │   └── [skill]/
│   │       └── route.ts          maxDuration: 30   Skill invocation
│   ├── context/
│   │   ├── save/route.ts                           /context-save
│   │   ├── recall/route.ts                         /context-recall
│   │   ├── portfolio/route.ts                      /portfolio-status
│   │   ├── feedback/
│   │   │   ├── capture/route.ts                    /feedback-capture
│   │   │   └── recall/route.ts                     /feedback-recall
│   │   └── graph/route.ts                          Cross-reference queries
│   ├── workspace/
│   │   ├── route.ts                                CRUD workspaces
│   │   ├── connect/
│   │   │   └── [provider]/route.ts                 OAuth flows (Google Drive, etc.)
│   │   └── files/
│   │       └── route.ts                            File browser
│   ├── agents/
│   │   └── route.ts                                List available agents
│   ├── keys/
│   │   └── route.ts                                BYOT key management
│   ├── usage/
│   │   └── route.ts                                Usage dashboard
│   └── webhooks/
│       ├── clerk/route.ts                          User sync
│       └── stripe/route.ts                         Billing events

7.2 Middleware Stack

// middleware.ts
import { clerkMiddleware } from '@clerk/nextjs/server';
import { NextResponse } from 'next/server';

export default clerkMiddleware(async (auth, req) => {
  // 1. Rate limiting (in-memory + Redis fallback)
  const rateLimitResult = await checkRateLimit(req);
  if (!rateLimitResult.allowed) {
    return NextResponse.json(
      { error: { code: 'rate_limited', message: 'Rate limit exceeded' } },
      // Header values must be strings
      { status: 429, headers: { 'Retry-After': String(rateLimitResult.retryAfter) } }
    );
  }

  // 2. Workspace context injection
  const workspaceId = req.headers.get('X-Workspace-ID');
  if (workspaceId && req.nextUrl.pathname.startsWith('/api/')) {
    // Set PostgreSQL RLS context
    await setWorkspaceContext(db, workspaceId);
  }

  // 3. Distributed tracing
  const requestId = req.headers.get('X-Request-ID') || crypto.randomUUID();
  const response = NextResponse.next();
  response.headers.set('X-Request-ID', requestId);

  return response;
});
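The middleware relies on `checkRateLimit`, which is not shown. A minimal sketch of the in-memory half is given below, assuming a fixed-window algorithm; the window size, the limit, and the `createFixedWindowLimiter` name are all illustrative. In-memory counters are per instance, which is presumably why the design pairs them with Redis.

```typescript
interface RateLimitResult {
  allowed: boolean;
  retryAfter: number; // seconds until the current window resets
}

// Hypothetical fixed-window limiter: at most `limit` requests per key per window.
function createFixedWindowLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>();

  return function check(key: string, now: number = Date.now()): RateLimitResult {
    const current = windows.get(key);
    if (!current || now - current.start >= windowMs) {
      // First request for this key, or the previous window has elapsed
      windows.set(key, { start: now, count: 1 });
      return { allowed: true, retryAfter: 0 };
    }
    if (current.count < limit) {
      current.count += 1;
      return { allowed: true, retryAfter: 0 };
    }
    const retryAfter = Math.ceil((current.start + windowMs - now) / 1000);
    return { allowed: false, retryAfter };
  };
}
```

The `now` parameter exists so the limiter can be exercised with a deterministic clock; production callers would omit it.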

7.3 SSE Event Contract

type SSEEvent =
  | { type: 'token'; data: { text: string } }
  | { type: 'tool_call_start'; data: { tool: string; id: string } }
  | { type: 'tool_call_result'; data: { id: string; result: unknown } }
  | { type: 'agent_start'; data: { agent: string; emoji: string; display_name: string } }
  | { type: 'agent_complete'; data: { agent: string; roi_minutes: number } }
  | { type: 'file_created'; data: { file_id: string; path: string; action: string } }
  | { type: 'context_saved'; data: { type: string; id: string } }
  | { type: 'error'; data: { code: string; message: string; recoverable: boolean } }
  | { type: 'done'; data: { tokens_in: number; tokens_out: number; duration_ms: number } };
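The contract above defines event payloads but not how they are framed on the wire. The sketch below assumes the `type` goes into the SSE `event:` field and the `data` object is JSON-encoded into a single `data:` line; that mapping is an assumption, and the union is repeated in reduced form so the example is self-contained.

```typescript
// Subset of the SSEEvent union above, repeated so the sketch stands alone.
type SSEEventSubset =
  | { type: "token"; data: { text: string } }
  | { type: "error"; data: { code: string; message: string; recoverable: boolean } }
  | { type: "done"; data: { tokens_in: number; tokens_out: number; duration_ms: number } };

// Serialize one event as an SSE frame. JSON.stringify escapes newlines,
// so the payload always fits on a single data: line.
function encodeSSE(event: SSEEventSubset): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event.data)}\n\n`;
}

// Parse a single frame back into its type and payload (client-side counterpart).
function decodeSSE(frame: string): { type: string; data: unknown } {
  let type = "message"; // SSE default event name
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) type = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).trimStart());
  }
  return { type, data: JSON.parse(dataLines.join("\n")) };
}
```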


8. Team Personalities Infrastructure

8.1 Design Decisions

| Aspect | Decision | Rationale |
|---|---|---|
| Phase 1 UX | None (infrastructure only) | Reduce MVP scope, no settings UI needed |
| Data model | Separate team_personalities table | Clean separation, swappable at query time |
| Prompt injection | Appended to Layer 2 (Agent Persona) | Minimal token overhead (~100 tokens) |
| Defaults | One default personality per team | Works out of the box |
| Swapping | API layer supports it; UI deferred | Infrastructure ready for Phase 2 UX |

8.2 How Personalities Flow Through the System

Agent Spawn Request
       │
       ▼
┌──────────────────────────────────────────────────────┐
│  1. Look up agent in agent_registry                  │
│     → Get team: 'product'                            │
│                                                      │
│  2. Look up workspace personality override            │
│     workspace_personalities WHERE team = 'product'   │
│     → If found: use override personality_id          │
│     → If not: use default from team_personalities    │
│                                                      │
│  3. Load personality principles                       │
│     team_personalities WHERE id = personality_id     │
│     → Get principles JSON array                      │
│                                                      │
│  4. Inject into Layer 2 of system prompt             │
│     Append to agent persona:                         │
│     "## Team Operating Principles                    │
│      Personality: Vision to Value Operators           │
│      P1: End-to-End Ownership — ...                  │
│      P2: Decision Quality — ..."                     │
└──────────────────────────────────────────────────────┘
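The four steps in the diagram can be sketched as two pure functions: one that resolves which personality applies, and one that renders it for Layer 2. The in-memory arrays stand in for the `team_personalities` and `workspace_personalities` tables, and the function names and exact line format are illustrative assumptions.

```typescript
interface Principle { id: string; name: string; statement: string }

interface Personality {
  id: string;
  team: string;
  personalityTag: string;
  principles: Principle[];
  isDefault: boolean;
}

interface WorkspaceOverride { workspaceId: string; team: string; personalityId: string }

// Steps 2-3: prefer the workspace override, fall back to the team default.
function resolvePersonality(
  team: string,
  workspaceId: string,
  personalities: Personality[],
  overrides: WorkspaceOverride[],
): Personality | undefined {
  const override = overrides.find((o) => o.workspaceId === workspaceId && o.team === team);
  if (override) {
    const chosen = personalities.find((p) => p.id === override.personalityId);
    if (chosen) return chosen;
  }
  return personalities.find((p) => p.team === team && p.isDefault);
}

// Step 4: render the block that is appended to the agent persona.
function formatPrinciples(p: Personality): string {
  return [
    "## Team Operating Principles",
    `Personality: ${p.personalityTag}`,
    ...p.principles.map((pr) => `${pr.id}: ${pr.name} - ${pr.statement}`),
  ].join("\n");
}
```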

8.3 API Endpoints (Infrastructure, No UX)

// Phase 1: Read-only API for personalities
// app/api/personalities/route.ts
export async function GET(req: Request) {
  // List all available personalities
  const personalities = await db.select()
    .from(schema.teamPersonalities)
    .orderBy(schema.teamPersonalities.team);
  return Response.json({ data: personalities });
}

// app/api/workspace/[id]/personality/route.ts
// The workspace ID comes from the [id] route segment.
// On Next.js 15+, params is a Promise and must be awaited before use.
type RouteContext = { params: { id: string } };

export async function GET(req: Request, { params }: RouteContext) {
  // Get current workspace personality config
  const config = await db.select()
    .from(schema.workspacePersonalities)
    .where(eq(schema.workspacePersonalities.workspaceId, params.id));
  return Response.json({ data: config });
}

export async function PUT(req: Request, { params }: RouteContext) {
  // Set workspace personality (for future UI)
  const { team, personalityId } = await req.json();
  await db.insert(schema.workspacePersonalities)
    .values({ workspaceId: params.id, team, personalityId })
    .onConflictDoUpdate({
      target: [schema.workspacePersonalities.workspaceId, schema.workspacePersonalities.team],
      set: { personalityId }, // overwrite the existing choice for this team
    });
  return Response.json({ data: { success: true } });
}

8.4 Prompt Injection Format

When a team personality is active, it adds ~100 tokens to the agent persona (Layer 2):

```markdown

## Team Operating Principles

Personality: Vision to Value Operators

  • End-to-End Ownership — One person accountable from vision to value
  • Decision Quality — Structured decisions under pressure
  • Customer Obsession — Start with customer, trace back to feature
  • Strategic Clarity — Clear bets with explicit assumptions
  • Outcome Focus — Measure outcomes, not outputs
  • Collaborative Excellence — Right people, right input, right time
  • Continuous Learning — Every outcome teaches something
  • Scalable Systems — Processes that grow with the org
```


    9. Knowledge Pack Loading Strategy

    9.1 Loading Rules

    Knowledge packs are loaded on-demand based on the agent and task:

// lib/knowledge-loader.ts
import { eq, inArray } from 'drizzle-orm';

export async function loadKnowledgePacks(
  agentKey: string,
  detectedSkill?: string
): Promise<string[]> {
  // Get agent's primary knowledge packs
  const agent = await db.select()
    .from(schema.agentRegistry)
    .where(eq(schema.agentRegistry.agentKey, agentKey))
    .limit(1);

  const packSlugs = agent[0]?.knowledgePacks || [];
  // Nothing to load; this also avoids issuing a query with an empty IN list
  if (packSlugs.length === 0) return [];

  // Load packs from DB or R2 cache
  const packs = await db.select()
    .from(schema.knowledgePacks)
    .where(inArray(schema.knowledgePacks.slug, packSlugs));

  // Budget enforcement: max 2 packs per invocation, prioritize by relevance
  const sorted = rankByRelevance(packs, detectedSkill);
  return sorted.slice(0, 2).map(p => p.content);
}
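`rankByRelevance` is referenced above but not defined. The sketch below shows one possible heuristic, assuming relevance can be approximated by word overlap between the pack slug and the skill name (for example, `pricing-frameworks` and a skill containing `pricing`). This heuristic is an assumption for illustration; the actual ranking rule is not described in this document.

```typescript
// Rank packs by how many hyphen-separated words their slug shares with the skill.
// Ties keep the original order, since Array.prototype.sort is stable.
function rankByRelevance<T extends { slug: string }>(packs: T[], detectedSkill?: string): T[] {
  if (!detectedSkill) return [...packs];
  const skillWords = new Set(detectedSkill.toLowerCase().split("-"));
  const score = (slug: string): number =>
    slug.toLowerCase().split("-").filter((w) => skillWords.has(w)).length;
  return [...packs].sort((a, b) => score(b.slug) - score(a.slug));
}
```

A production version might instead sum `token_count` against a token budget rather than capping at a fixed number of packs.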

    9.2 Pack Inventory

| Pack | Team | Primary Agents | Est. Tokens |
|---|---|---|---|
| prioritization | product | @pm, @pm-dir | ~1,200 |
| pricing-frameworks | product | @bizops, @vp-product | ~1,100 |
| discovery-methods | product | @pm, @ux-lead | ~1,000 |
| metrics-frameworks | product | @bizops, @value-realization | ~1,100 |
| competitive-frameworks | product | @ci, @pmm-dir | ~1,000 |
| gtm-playbooks | product | @pmm, @pmm-dir | ~1,200 |
| stakeholder-management | product | @pm-dir, @prod-ops | ~900 |
| user-research | product | @ux-lead, @pm | ~1,100 |
| financial-modeling | product | @bizops, @bizdev | ~1,000 |
| design-systems | design | @ui-designer, @visual-designer | ~1,000 |
| user-research-methods | design | @user-researcher | ~1,100 |
| accessibility | design | @ui-designer | ~800 |
| interaction-patterns | design | @interaction-designer | ~900 |
| api-design | architecture | @api-architect | ~1,000 |
| data-architecture | architecture | @data-architect | ~1,100 |
| security-patterns | architecture | @security-architect | ~900 |
| cloud-native | architecture | @cloud-architect | ~1,000 |
| content-strategy | marketing | @content-strategist | ~1,000 |
| seo-frameworks | marketing | @seo-specialist | ~900 |
| analytics-methodology | marketing | @analytics-specialist | ~1,000 |
| brand-management | marketing | @brand-strategist | ~900 |
| campaign-optimization | marketing | @paid-media, @email-marketing | ~1,000 |

    Knowledge packs are NOT included in the cached system prompt (they'd break the cache key). Instead, they're loaded into the conversation context when the agent's task requires framework application.


    10. Cloud Storage Abstraction

    10.1 Provider Abstraction Layer

// lib/cloud-storage/interface.ts
// Return types are inferred from usage below. FileMetadata is an assumed
// type name for the provider's file descriptor, defined elsewhere.
export interface CloudStorageProvider {
  readFile(fileId: string): Promise<string>;               // file content
  writeFile(path: string, content: string, parentFolderId: string): Promise<string>; // new file ID
  updateFile(fileId: string, content: string): Promise<void>;
  deleteFile(fileId: string): Promise<void>;
  listFiles(folderId: string, query?: string): Promise<FileMetadata[]>;
  getMetadata(fileId: string): Promise<FileMetadata>;
  resolvePathToId(path: string, rootFolderId: string): Promise<string | null>;
  createFolder(name: string, parentId: string): Promise<string>; // new folder ID
}

// lib/cloud-storage/google-drive.ts
export class GoogleDriveProvider implements CloudStorageProvider {
  constructor(private accessToken: string) {}

  async readFile(fileId: string): Promise<string> {
    const response = await fetch(
      `https://www.googleapis.com/drive/v3/files/${fileId}?alt=media`,
      { headers: { Authorization: `Bearer ${this.accessToken}` } }
    );
    if (!response.ok) throw new Error(`Drive read failed: ${response.status}`);
    return response.text();
  }

  async writeFile(path: string, content: string, parentFolderId: string): Promise<string> {
    const metadata = {
      name: path.split('/').pop(),
      parents: [parentFolderId],
      mimeType: 'text/markdown',
    };
    // Multipart upload
    const form = new FormData();
    form.append('metadata', new Blob([JSON.stringify(metadata)], { type: 'application/json' }));
    form.append('file', new Blob([content], { type: 'text/markdown' }));

    const response = await fetch(
      'https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart',
      {
        method: 'POST',
        headers: { Authorization: `Bearer ${this.accessToken}` },
        body: form,
      }
    );
    if (!response.ok) throw new Error(`Drive upload failed: ${response.status}`);
    const result = await response.json();
    return result.id;
  }

  // ... other methods
}

    10.2 Token Management

    OAuth tokens are encrypted with envelope encryption and auto-refreshed:

// lib/cloud-storage/token-manager.ts
export async function getProviderToken(
  workspaceId: string,
  provider: string
): Promise<string> {
  const integration = await db.select()
    .from(schema.connectedIntegrations)
    .where(and(
      eq(schema.connectedIntegrations.workspaceId, workspaceId),
      eq(schema.connectedIntegrations.provider, provider),
    ))
    .limit(1);

  if (!integration[0]) throw new Error(`No ${provider} connection`);

  const token = await db.select()
    .from(schema.integrationTokens)
    .where(eq(schema.integrationTokens.integrationId, integration[0].id))
    .limit(1);

  if (!token[0]) throw new Error(`No stored token for ${provider}`);

  // Decrypt access token
  let accessToken = await decrypt(token[0].encryptedAccessToken, token[0].encryptedDek);

  // Check expiry and refresh if needed
  if (token[0].expiresAt && new Date(token[0].expiresAt) < new Date()) {
    const refreshToken = await decrypt(token[0].encryptedRefreshToken, token[0].encryptedDek);
    accessToken = await refreshOAuthToken(provider, refreshToken);
    await updateEncryptedToken(token[0].id, accessToken);
  }

  return accessToken;
}
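The expiry check above refreshes only once a token has already expired, so a token that expires partway through a long agent run could still fail mid-request. One common mitigation is to refresh slightly early. The sketch below assumes a hypothetical `needsRefresh` helper and an arbitrary 60-second skew; neither is part of the documented token manager.

```typescript
// Decide whether a token should be refreshed, treating tokens that expire
// within `skewMs` as already expired. A null expiry means "no known expiry",
// matching the truthiness check in the listing above.
function needsRefresh(
  expiresAt: Date | string | null,
  now: Date = new Date(),
  skewMs: number = 60000,
): boolean {
  if (expiresAt === null) return false;
  const expiry = new Date(expiresAt).getTime();
  if (Number.isNaN(expiry)) return true; // unparseable expiry: refresh defensively
  return expiry - now.getTime() <= skewMs;
}
```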

    10.3 Workspace File Structure

    Files in the user's cloud storage follow the OS context structure:

    User's Google Drive/
    └── Legionis Workspace/           ← User-selected folder
        ├── context/
        │   ├── decisions/                 ← DR-YYYY-NNN.md files
        │   ├── bets/                      ← SB-YYYY-NNN.md files
        │   ├── feedback/                  ← FB-YYYY-NNN.md files
        │   ├── learnings/                 ← L-NNN entries
        │   ├── portfolio/                 ← Active bets tracking
        │   └── documents/                 ← Auto-registered deliverables
        ├── deliverables/                  ← PRDs, roadmaps, analyses
        └── .workspace.json                ← Workspace metadata
    


    11. Caching Strategy

    11.1 Multi-Level Cache Architecture

    ┌─────────────────────────────────────────────────────────────────┐
    │                    CACHE HIERARCHY                               │
    │                                                                  │
    │  Level 1: Anthropic API Prompt Cache (5 min TTL)                │
    │  ─ System prompt layers cached by Anthropic                      │
    │  ─ 90% cost reduction on cache hits                              │
    │  ─ Requires exact prefix match                                   │
    │                                                                  │
    │  Level 2: In-Memory Compiled Prompts (per Vercel instance)      │
    │  ─ Compiled core protocol and agent personas                     │
    │  ─ Refreshed on version update or cold start                     │
    │  ─ Near-zero latency                                             │
    │                                                                  │
    │  Level 3: R2 Object Cache (persistent)                           │
    │  ─ Compiled prompt JSON                                          │
    │  ─ Knowledge pack content                                        │
    │  ─ Updated via build pipeline (compile-prompts.ts)               │
    │                                                                  │
    │  Level 4: PostgreSQL (source of truth)                           │
    │  ─ prompt_templates table                                        │
    │  ─ knowledge_packs table                                         │
    │  ─ team_personalities table                                      │
    └─────────────────────────────────────────────────────────────────┘
    

    11.2 Cache Invalidation Strategy

| Trigger | Action |
|---|---|
| OS version update | Re-run compile-prompts.ts, update R2, invalidate L2 |
| Agent persona edit (A/B test) | Update prompt_templates, invalidate L2 for that agent |
| Team personality change | Update workspace_personalities, invalidate L2 for team agents |
| Knowledge pack update | Update knowledge_packs, no prompt cache impact (not in system prompt) |
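The L2 invalidation actions listed above (per agent, per team, and full flush on version update) can be sketched as a small in-memory cache. The `createPromptCache` name and the `CompiledPrompt` shape are assumptions for illustration; the real cache layout is not specified in this document.

```typescript
interface CompiledPrompt {
  agentKey: string;
  team: string;
  version: string;
  text: string;
}

// Hypothetical per-instance (Level 2) cache of compiled prompts, keyed by agent.
function createPromptCache() {
  const entries = new Map<string, CompiledPrompt>();
  return {
    get: (agentKey: string): CompiledPrompt | undefined => entries.get(agentKey),
    set: (prompt: CompiledPrompt): void => {
      entries.set(prompt.agentKey, prompt);
    },
    // Agent persona edit: drop one agent's compiled prompt
    invalidateAgent: (agentKey: string): boolean => entries.delete(agentKey),
    // Team personality change: drop every agent on that team
    invalidateTeam(team: string): number {
      let removed = 0;
      entries.forEach((value, key) => {
        if (value.team === team) {
          entries.delete(key);
          removed += 1;
        }
      });
      return removed;
    },
    // OS version update: drop everything
    invalidateAll: (): void => entries.clear(),
    size: (): number => entries.size,
  };
}
```

Because each Vercel instance holds its own copy, invalidation in practice also needs a signal that reaches every instance, such as a version stamp checked on read.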


    12. MCP Integration Approach for SaaS

    12.1 CLI vs Cloud Integration Model

| Aspect | CLI (MCP) | Cloud (OAuth) |
|---|---|---|
| Protocol | MCP (stdio/SSE) | OAuth 2.0 + REST API |
| Authentication | API keys in env vars | Encrypted tokens in DB |
| Availability | User configures locally | User connects via OAuth flow |
| Tool registration | .mcp.json config file | connected_integrations table |
| Runtime detection | Check available tool list | Query integrations for workspace |

    12.2 Supported Integrations (Cloud)

| Integration | API | OAuth Scopes | Agent Use Cases |
|---|---|---|---|
| Google Drive | Drive API v3 | drive.file | File read/write (primary storage) |
| Jira | Jira REST API v3 | read:jira-work, write:jira-work | Create issues from user stories |
| Slack | Slack Web API | chat:write, channels:read | Post updates, share decisions |
| GitHub | GitHub REST/GraphQL | repo, issues | Link commits to features |
| Linear | Linear API | issues:read, issues:write | Project management sync |

    12.3 Graceful Degradation

    Agents detect available integrations at runtime. If a tool is not connected, agents produce text output with actionable "Next Steps (Manual)" sections, exactly as the MCP integration framework specifies.
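A sketch of that decision is shown below: given the set of integrations connected for a workspace, an action either runs through the tool or degrades to a manual checklist. The `planAction` function, the `PlannedAction` shape, and the integration identifiers are illustrative assumptions rather than the documented API.

```typescript
type Integration = "google-drive" | "jira" | "slack" | "github" | "linear";

interface PlannedAction {
  integration: Integration;
  description: string;
  manualSteps: string[]; // what the user would do by hand
}

type ActionPlan =
  | { mode: "tool"; integration: Integration; description: string }
  | { mode: "manual"; markdown: string };

// Use the integration when connected; otherwise emit a "Next Steps (Manual)" section.
function planAction(action: PlannedAction, connected: ReadonlySet<Integration>): ActionPlan {
  if (connected.has(action.integration)) {
    return { mode: "tool", integration: action.integration, description: action.description };
  }
  const steps = action.manualSteps.map((s, i) => `${i + 1}. ${s}`).join("\n");
  return { mode: "manual", markdown: `## Next Steps (Manual)\n${steps}` };
}
```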


    13. Security Architecture

    13.1 Security Layers

    ┌─────────────────────────────────────────────────────────┐
    │  Layer 1: Authentication (Clerk)                         │
    │  ─ JWT validation on every request                       │
    │  ─ Session management with refresh                       │
    │  ─ Social login + email/password                         │
    ├─────────────────────────────────────────────────────────┤
    │  Layer 2: Authorization (RLS + Middleware)                │
    │  ─ PostgreSQL RLS on all workspace-scoped tables         │
    │  ─ Middleware sets workspace context per request          │
    │  ─ Tier-based feature gating                             │
    ├─────────────────────────────────────────────────────────┤
    │  Layer 3: Data Encryption                                │
    │  ─ API keys: AES-256-GCM + envelope encryption (KMS)    │
    │  ─ OAuth tokens: Same envelope encryption                │
    │  ─ Data at rest: Neon TLS + encryption                   │
    │  ─ Data in transit: TLS 1.3 everywhere                   │
    ├─────────────────────────────────────────────────────────┤
    │  Layer 4: Prompt Security                                │
    │  ─ Injection detection (pattern matching)                │
    │  ─ User content sandboxing (XML boundary markers)        │
    │  ─ Output validation (system prompt leak detection)      │
    │  ─ Rate limiting on suspected injection probing          │
    ├─────────────────────────────────────────────────────────┤
    │  Layer 5: Audit & Monitoring                             │
    │  ─ All agent invocations logged with trace context       │
    │  ─ All file operations logged                            │
    │  ─ Security events in Sentry                             │
    │  ─ Rate limit violations tracked                         │
    └─────────────────────────────────────────────────────────┘
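For the "user content sandboxing (XML boundary markers)" item in Layer 4, one possible approach is to wrap user-supplied text in a dedicated tag and escape angle brackets so the content cannot close the boundary or open a new one. The tag name `user_content` and both function names below are assumptions; the actual marker format is not specified here.

```typescript
// Escape the characters that could terminate or forge a boundary tag.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Wrap untrusted content in boundary markers before it enters the prompt.
function sandboxUserContent(content: string, tag: string = "user_content"): string {
  return `<${tag}>\n${escapeXml(content)}\n</${tag}>`;
}
```

Escaping alone does not stop instructions written in plain prose, which is why the diagram lists pattern-based detection and output validation as separate controls.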
    

    13.2 No Bash in Cloud

    The CLI's Bash tool is removed in cloud mode. Agents cannot execute arbitrary shell commands. All file operations go through the cloud storage abstraction layer, which enforces:


    14. Agent Roster (81 Agents + 11 Gateways)

    14.1 Product Org OS Agents (13)

| Agent Key | Emoji | Display Name | Team |
|---|---|---|---|
| product-manager | 📝 | Product Manager | product |
| cpo | 👑 | Chief Product Officer | product |
| vp-product | 📈 | VP Product | product |
| director-product-management | 📋 | Director of Product Management | product |
| director-product-marketing | 📣 | Director of Product Marketing | product |
| product-marketing-manager | 🎯 | Product Marketing Manager | product |
| product-mentor | 🎓 | Product Mentor | product |
| bizops | 🧮 | BizOps | product |
| bizdev | 🤝 | Business Development | product |
| competitive-intelligence | 🔭 | Competitive Intelligence | product |
| product-operations | ⚙️ | Product Operations | product |
| ux-lead | 🎨 | UX Lead | product |
| value-realization | 💰 | Value Realization | product |

    14.2 Extension Team Agents (26)

    Design Team (6):

| Agent Key | Emoji | Display Name |
|---|---|---|
| design-dir | 🎨 | Director of Design |
| ui-designer | 🖼️ | UI Designer |
| visual-designer | 🎨 | Visual Designer |
| interaction-designer | 🔄 | Interaction Designer |
| user-researcher | 🔍 | User Researcher |
| motion-designer | 🎬 | Motion Designer |

    Architecture Team (6):

| Agent Key | Emoji | Display Name |
|---|---|---|
| architecture-dir | 🏗️ | Chief Architect |
| api-architect | 🔌 | API Architect |
| data-architect | 🗄️ | Data Architect |
| security-architect | 🔒 | Security Architect |
| cloud-architect | ☁️ | Cloud Architect |
| ai-architect | 🧠 | AI/ML Architect |

    Marketing Team (14):

| Agent Key | Emoji | Display Name |
|---|---|---|
| marketing-dir | 📢 | Director of Marketing |
| content-strategist | ✍️ | Content Strategist |
| copywriter | 📄 | Copywriter |
| seo-specialist | 🔍 | SEO Specialist |
| cro-specialist | 📊 | CRO Specialist |
| paid-media | 💰 | Paid Media Specialist |
| email-marketing | 📧 | Email Marketing Specialist |
| social-media | 📱 | Social Media Manager |
| growth-hacker | 🚀 | Growth Hacker |
| market-researcher | 📈 | Market Researcher |
| video-producer | 🎥 | Video Producer |
| pr-specialist | 📰 | PR Specialist |
| brand-strategist | 🏷️ | Brand Strategist |
| analytics-specialist | 📊 | Analytics Specialist |

    14.3 Gateways (5)

| Gateway Key | Emoji | Display Name | Behavior |
|---|---|---|---|
| product | 🏛️ | Product Gateway | Routes to relevant owners, orchestrates execution |
| product-leadership-team | 👥 | PLT | Meeting Mode with multiple leadership perspectives |
| design | 🎨 | Design Gateway | Routes to design specialists |
| architecture | 🏗️ | Architecture Gateway | Routes to architecture specialists |
| marketing | 📢 | Marketing Gateway | Routes to marketing specialists |


    15. Distributed Tracing

    15.1 Trace Structure

    Request ID: req_abc123
    │
    ├── span_001: PLT Gateway (gateway:plt)
    │   ├── span_002: VP Product (agent:vp-product)
    │   │   └── span_003: BizOps (sub-agent, consultation)
    │   ├── span_004: Dir PM (agent:director-product-management)
    │   ├── span_005: Dir PMM (agent:director-product-marketing)
    │   └── span_006: ProdOps (agent:product-operations)
    │
    └── Post-processing: ROI calculation, interaction logging
    

    15.2 Implementation

    Every request gets a trace context that propagates through sub-agent spawns:

    // lib/tracing.ts
    export interface TraceContext {
      requestId: string;
      spanId: string;
      parentSpanId?: string;
      userId: string;
      workspaceId: string;
      operation: string;
      startedAt: string;
      depth: number;
    }

export function createTraceContext(req: Request, operation: string): TraceContext {
  return {
    requestId: req.headers.get('X-Request-ID') || crypto.randomUUID(),
    spanId: crypto.randomUUID(),
    // headers.get() returns null when the header is absent; the interface expects undefined
    parentSpanId: req.headers.get('X-Parent-Span-ID') ?? undefined,
    userId: auth().userId!,
    workspaceId: req.headers.get('X-Workspace-ID')!,
    operation,
    startedAt: new Date().toISOString(),
    depth: 0,
  };
}

// Derive a child context for a sub-agent spawn: same request, new span, one level deeper
export function childSpan(parent: TraceContext, operation: string): TraceContext {
  return {
    ...parent,
    spanId: crypto.randomUUID(),
    parentSpanId: parent.spanId,
    operation,
    startedAt: new Date().toISOString(),
    depth: parent.depth + 1,
  };
}
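Because every logged invocation carries `spanId` and `parentSpanId`, the tree shown in 15.1 can be rebuilt from flat rows at query time. The sketch below assumes a reduced `SpanRow` shape and a hypothetical `buildSpanTree` helper; rows whose parent is missing from the result set are treated as roots.

```typescript
interface SpanRow {
  spanId: string;
  parentSpanId?: string;
  operation: string;
}

interface SpanNode extends SpanRow {
  children: SpanNode[];
}

// Rebuild the parent/child hierarchy from flat agent_invocations-style rows.
function buildSpanTree(rows: SpanRow[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  rows.forEach((r) => nodes.set(r.spanId, { ...r, children: [] }));

  const roots: SpanNode[] = [];
  nodes.forEach((node) => {
    const parent = node.parentSpanId ? nodes.get(node.parentSpanId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent, or parent not in this result set
  });
  return roots;
}
```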


    16. Vercel Deployment Configuration

    16.1 vercel.json

    {
      "$schema": "https://openapi.vercel.sh/vercel.json",
      "fluid": true,
      "regions": ["iad1"],
      "functions": {
        "app/api/plt/**": { "maxDuration": 120 },
        "app/api/gateway/**": { "maxDuration": 120 },
        "app/api/chat/**": { "maxDuration": 60 },
        "app/api/skill/**": { "maxDuration": 30 }
      }
    }
    

    16.2 Environment Variables

# Database
DATABASE_URL=postgresql://...@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require

# Auth
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_live_...
CLERK_SECRET_KEY=sk_live_...

# Billing
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...

# Storage
R2_ACCOUNT_ID=...
R2_ACCESS_KEY_ID=...
R2_SECRET_ACCESS_KEY=...
R2_BUCKET_NAME=project-saas-prompts

# Search
TYPESENSE_API_KEY=...
TYPESENSE_HOST=xxx.typesense.net

# Monitoring
SENTRY_DSN=https://...@sentry.io/...
NEXT_PUBLIC_POSTHOG_KEY=phc_...
BETTERSTACK_SOURCE_TOKEN=...

# Encryption
KMS_KEY_ARN=arn:aws:kms:us-east-1:...

# Cloud Storage OAuth
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...


    17. Entity Relationship Diagram (Updated for v3)

    users 1─────M workspace_memberships M──────1 workspaces
      │                                             │
      │                                    ┌────────┼─────────────────────────┐
      │                                    │        │                         │
      └──M user_api_keys           decisions    strategic_bets    feedback    │
                                       │            │              │          │
                                       │            │              │          │
                                ┌──────┼────────────┼──────────────┼──────┐  │
                                │                                          │  │
                          cross_references                          feedback_themes
                                │                                          │
                        assumptions    learnings    documents        feedback_theme_links
                                │
                          portfolio_state

    workspaces ──M connected_integrations ──1 integration_tokens
    workspaces ──M conversations ──M messages
    workspaces ──M agent_invocations
    workspaces ──M usage_events
    workspaces ──M roi_sessions
    workspaces ──M workspace_personalities ──1 team_personalities (NEW)

    agent_registry (global) ──1 prompt_templates (global)
    knowledge_packs (global)
    team_personalities (global)
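The many-to-many link between users and workspaces in the diagram can be illustrated with plain types. This is only a sketch: the field names (`id`, `userId`, `workspaceId`) and the `workspacesForUser` helper are assumptions, since the diagram specifies table names and cardinalities but not columns.

```typescript
// Hypothetical row shapes; only the table names come from the diagram.
interface User { id: string }
interface Workspace { id: string }
interface WorkspaceMembership { userId: string; workspaceId: string }

// Resolves users 1--M workspace_memberships M--1 workspaces in memory.
function workspacesForUser(
  userId: string,
  memberships: WorkspaceMembership[],
  workspaces: Workspace[],
): Workspace[] {
  const workspaceIds = new Set(
    memberships.filter((m) => m.userId === userId).map((m) => m.workspaceId),
  );
  return workspaces.filter((w) => workspaceIds.has(w.id));
}
```

In the real system this lookup would be a SQL join through `workspace_memberships`; the in-memory version only makes the cardinalities concrete.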


    18. Performance Targets

    Metric                           Target   Measurement
    Single agent response (p50)      <10s     Time to last token
    Single agent response (p95)      <30s     Time to last token
    PLT session (p50)                <30s     Time to formatted response
    PLT session (p95)                <60s     Time to formatted response
    Skill invocation (p50)           <5s      Time to completion
    Context recall query             <500ms   Database query + format
    Auto-context injection           <200ms   Topic extraction + query
    Time to first token (streaming)  <3s      First SSE event
    Cache hit rate (Anthropic)       >80%     Provider metadata tracking
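The p50 and p95 targets above can be checked against recorded latencies with a percentile calculation. The sketch below uses the nearest-rank method, which is an assumption; the plan does not say which percentile definition the monitoring stack uses. `percentile`, `evaluateTargets`, and `singleAgentTargets` are hypothetical names.

```typescript
interface LatencyTarget {
  metric: string;
  percentile: number; // e.g. 50 or 95
  maxSeconds: number; // target is "strictly below" this value
}

// Targets copied from the single-agent rows of the table.
const singleAgentTargets: LatencyTarget[] = [
  { metric: "Single agent response", percentile: 50, maxSeconds: 10 },
  { metric: "Single agent response", percentile: 95, maxSeconds: 30 },
];

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Compares observed percentiles against each target.
function evaluateTargets(samplesSeconds: number[], targets: LatencyTarget[]) {
  return targets.map((target) => {
    const observed = percentile(samplesSeconds, target.percentile);
    return { ...target, observed, pass: observed < target.maxSeconds };
  });
}
```

Feeding this the time-to-last-token values from `agent_invocations` would give a pass or fail per row; the same helper applies to the PLT and skill rows with different targets.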


    19. MVP Scope Checklist

    Feature                                  Status      Notes
    61 skills                                MVP         All skills available
    39 agents (13 OS + 26 Extension)         MVP         Full roster
    5 gateways                               MVP         @product, @plt, @design, @architecture, @marketing
    PLT Meeting Mode                         MVP         Parallel agents + synthesis
    Delegation Protocol (4 patterns)         MVP         Consultation, Delegation, Review, Debate
    Context layer (PostgreSQL)               MVP         All context tables
    Cross-reference graph                    MVP         Relational implementation
    Auto-context injection                   MVP         Topic-based, database-backed
    Team Personalities (infrastructure)      MVP         Data model + API, no UX
    Knowledge packs (22)                     MVP         Loaded from DB/R2
    Google Drive integration                 MVP         OAuth + file tools
    BYOT (Claude + OpenAI)                   MVP         Per-request key routing
    System prompt caching                    MVP         3-layer with Anthropic cache
    Interaction logging                      MVP         agent_invocations table
    Distributed tracing                      MVP         Request ID propagation
    Prompt versioning                        MVP         Database + feature flags
    SSE streaming                            MVP         Token-by-token + agent events
    $10/mo individual, $8/seat team pricing  MVP         Stripe integration (1-month trial, no free tier)
    OneDrive/Dropbox                         Growth      Phase 2
    Team collaboration                       Enterprise  Phase 3
    Hybrid CLI/Cloud sync                    Enterprise  Phase 3


    20. Risk Matrix

    Risk                                    Likelihood  Impact  Mitigation
    PLT exceeds 60s p95                     Medium      Low     Cap maxSteps=5; 300s Fluid Compute buffer
    Prompt compression degrades quality     Medium      Medium  A/B test; keep originals; tune iteratively
    Google OAuth verification delayed       Medium      High    Apply early; use test mode for beta
    BYOT key abuse (shared keys)            Low         Medium  Rate limit per key; abuse detection
    Cloud storage API latency spikes        Low         Medium  Per-tool timeouts; circuit breaker
    Cache hit rate below 70%                Low         Medium  Monitor; extend to 1-hour TTL
    Multi-agent token costs surprise users  Medium      Low     Cost estimator in UI; model routing
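The "circuit breaker" mitigation for cloud storage latency can be sketched as a small state holder wrapped around each storage tool. Everything here is illustrative: the class name, the default threshold of 3 failures, and the 30-second cooldown are assumptions, not values from the plan.

```typescript
// Minimal circuit breaker sketch with an injectable clock for testing.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold = 3, // assumed default
    cooldownMs = 30000, // assumed default
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Closed: always allow. Open: allow a trial call only after the cooldown.
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  // A successful call closes the circuit and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Consecutive failures at or above the threshold (re)open the circuit.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}
```

A storage tool would call `canRequest()` before each API call, record a failure when the per-tool timeout fires, and return a fast error to the agent while the circuit is open.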


    Document Status: Active (v3.1 — definitive architecture + conceptual platform model) Last Updated: 2026-02-18 Gate Owner: Chief Architect Next Review: Pre-development kickoff

    v3.1 Change Log: Added Section 0 (Conceptual Platform Architecture) with 4-layer model, defensibility assessment, compounding flywheel, and 3 external connections. Maps conceptual layers to implementation sections. Sourced from Platform Architecture Deck (23-slide presentation, Feb 2026).