Document Version: 3.0
Date: 2026-02-17
Owner: Director of Product Management
Status: Active
Product: Legionis
V2V Phase: Phase 3 — Strategic Commitments
Related Decisions: DR-2026-001 (Full-Capability Day 1), DR-2026-002 (Cloud Storage First), DR-2026-003 (Naming — "Legionis"), Pricing Model (Team-Based Modules)
Architecture Reference: See architecture-stack.md for verified pricing and technical specifications
Legionis is a general-purpose AI workforce platform — teams of autonomous AI agents that augment human work across every business function. Where other AI tools give you a single chatbot, Legionis gives you a full organizational workforce: product managers, marketers, architects, finance directors, legal counsel, and executives — 81 agents across 11 teams, each with distinct expertise, working together through structured collaboration.
One-line vision: "Your full AI workforce. Many acting as one."
Legionis is defensible through six interlocking differentiators that form a flywheel: users start with one team, agents collaborate to produce quality work, deliverables save to the user's cloud, memory compounds, trust deepens because users own everything, and they add more teams.
| # | Differentiator | What It Means | Why It Matters |
|---|---|---|---|
| 1 | Cloud Storage Connectivity | Connect to Google Drive (OneDrive, Dropbox planned). Deliverables save to user's folder structure. File IDs survive renames and reorganization. | Data ownership from day one. Work persists in the user's own cloud, not trapped in a SaaS black box. Stable file IDs enable compounding memory. |
| 2 | Bring Your Own Tokens (BYOT) | Users connect their own OpenAI, Anthropic, or Google API keys. Pay providers directly at their rates. | Zero markup. Full cost transparency. No vendor lock-in on the AI layer. Users control model selection and spend. |
| 3 | Modular Agent Provisioning | Users acquire teams by department. Start with one, expand as needed. Each team brings a lead, specialists, knowledge packs, and a coach. | No bloatware. Pay for what you use. Organic expansion as the user discovers more use cases. |
| 4 | Collaborative Agent Architecture | Agents don't just respond to the user. They consult each other, delegate sub-tasks to specialists, review each other's work, and debate genuine tradeoffs with structured arguments. Agents spawn other agents. | Multi-perspective quality that no single-agent chatbot can match. The user gets synthesis, not raw output. |
| 5 | Organizational Memory (Context Layer) | Decisions, interactions, assumptions, learnings, feedback, and document indexes are saved as markdown files in the user's own cloud storage. Cross-referenced and instantly recallable. | Memory compounds over time. The 100th decision is informed by the first 99. The user owns this data as portable files, not locked in a proprietary database. |
| 6 | Full Data Ownership | Tokens stay with the provider. Files stay in the user's cloud. Context stays in the user's workspace. Nothing passes through or is stored on Legionis servers. | Trust at the architecture level. Users can verify ownership, export anytime, and switch providers without losing their organizational memory. |
These six differentiators reinforce each other: cloud storage enables persistent memory, BYOT ensures cost control, modular provisioning drives expansion, collaborative architecture produces quality work worth remembering, and full data ownership builds the trust required for long-term adoption.
Deliver a complete AI workforce experience — 81 agents across 11 specialized teams, 9 team gateways, and 80+ skills — through a web interface that feels like commanding a real organization. Text input, voice activation, agent selection, structured team meetings — all backed by cloud storage, persistent memory, and multi-perspective collaboration. All capabilities available from Day 1.
Primary Target: Professionals who need a full team but don't have one — or need to augment the one they have.
| Persona | Description | Pain Point | Why Legionis |
|---|---|---|---|
| Power PM (Alex) | Senior PM who uses Claude Code or similar AI tools | Terminal context is messy, files scattered, no persistence across sessions | Web UI with cloud storage = organized workspace + persistent context. Full product team on demand. |
| Technical VP (Jordan) | VP Product with engineering background, comfortable with command-line | Single AI tools lack multi-perspective input; no PLT simulation | Full AI workforce with 81 agents, Meeting Mode, cross-functional team gateways |
| Product Consultant (Sam) | Fractional CPO serving multiple clients | Each client is a separate context nightmare, can't share work easily | Multi-workspace, client-by-client context isolation, export capabilities |
| Startup Founder (Riley) | Wears many hats — product, marketing, finance, ops | Can't afford to hire specialists for every function; needs a full team without headcount | 11 teams covering every business function: @marketing for campaigns, @finance for budgets, @legal for contracts, @operations for process |
| Marketing Lead (Morgan) | Runs content, campaigns, positioning for a growing company | Content creation is a bottleneck; needs strategic marketing support, not just copywriting | @marketing gateway with 16 specialists: content strategist, SEO, CRO, paid media, email, social, PR, growth |
| Finance Director (Pat) | Manages budgets, forecasting, investor materials | Financial modeling takes weeks; investor decks are always last-minute | @finance gateway with 9 specialists: FP&A, revenue analysis, investor relations, tax planning |
| Operations Manager (Casey) | Owns process improvement, vendor management, program coordination | Operational gaps between teams; processes undocumented | @operations gateway with 8 specialists: program management, procurement, process engineering, risk |
Secondary Target (lower priority for initial launch):
| Persona | Description | Why Later |
|---|---|---|
| Legal Teams | In-house counsel needing contract review, compliance, privacy | Requires domain-specific validation; @legal team is ready but trust-building needed |
| Executive Assistants | Supporting C-suite with strategy prep, board materials | Requires @executive team + cross-team orchestration polish |
| Corporate Development | M&A analysis, partnership evaluation, venture scouting | Niche use case; @corpdev team is ready for early adopters |
| Non-Technical PM | PM who's never used CLI | Requires more onboarding polish, guided workflows |
| Enterprise Team | Large product org with compliance needs | Requires SSO, audit logs, admin controls |
The problem is NOT "CLI is too hard." The real problems are:
Legionis solves these problems with a web-based workspace that:
We believe professionals will pay $10-25/user/mo for an AI workforce platform that:
Pricing (source of truth: Legionis Agent Catalog):
| Tier | Price | What's Included |
|---|---|---|
| Free | $0 | 500 ops/month, platform-provided Haiku, all agents accessible |
| Pro | $10/user/mo | Full model access via BYOT (Bring Your Own Tokens), all 11 teams |
| Team | $7/user/mo | Team-based pricing for organizations, shared workspaces |
| Add-on Module | +$5/user/mo | Premium extension team modules (individual team add-ons) |
| Full Org | $25/user/mo | Everything: all teams, all modules, priority support |
Assumptions:
| # | Assumption | Validation Method | Timeline |
|---|---|---|---|
| A-001 | Power users (familiar with AI tools) convert at >5% Free-to-Pro | Measure conversion rate by user segment | 6 months post-launch |
| A-002 | Context accumulation creates measurable switching costs after 3+ months | Track retention by tenure cohort | 9 months post-launch |
| A-003 | ROI tracking drives Free-to-Pro conversion | A/B test ROI visibility vs. hidden | 3 months post-launch |
| A-004 | Agent Orchestrator performs without quality/latency degradation | Load testing | Launch + 1 month |
| A-005 | "Your API Keys" model is preferred by target users (they have existing API contracts) | User interviews, signup funnel analysis | 3 months post-launch |
| A-006 | Cloud storage integration (vs. built-in storage) is preferred by target users | User interviews, signup funnel analysis | 3 months post-launch |
| A-007 | Voice input is used by >20% of power users | Feature flag usage tracking | 6 months post-launch |
| A-008 | Non-product personas (marketing, finance, ops) expand addressable market by 3x+ | Signup persona tracking, team selection data | 6 months post-launch |
| A-009 | Team-based pricing ($7/user) drives org adoption over individual ($10/user) | Conversion funnel analysis per tier | 6 months post-launch |
We'll know we're wrong when:
This section defines the technology stack and architecture decisions for Legionis. Each choice includes rationale and alternatives considered.
Choice: React 18 with TypeScript (strict mode)
Rationale:
Choice: Next.js 16 with App Router
Rationale:
Choice: Vercel AI SDK v6 (ai package) for all LLM interactions
Rationale:
- `streamText()` and `generateText()` — handle SSE natively
- `useChat()` React hook for frontend streaming integration
- `streamText()` — streaming agent responses to the client
- `generateText()` — non-streaming operations (context recall, document generation)
- `tool()` — define tools with Zod schemas for file operations, search, agent spawning
- `useChat()` — client-side hook for chat UI with automatic streaming
- `toUIMessageStreamResponse()` — convert stream to response format for `useChat`

Choice: Tailwind CSS with the @tailwindcss/typography plugin
Rationale:
- Theme configuration in `tailwind.config`

Choice: Radix UI (headless) + custom-styled components
Rationale:
- `@radix-ui/react-dialog` — Modals, sheets
- `@radix-ui/react-dropdown-menu` — Context menus, action menus
- `@radix-ui/react-tabs` — Panel switching
- `@radix-ui/react-tooltip` — Skill descriptions, hints
- `@radix-ui/react-scroll-area` — Chat scroll, file explorer

Choice: Zustand for client state, TanStack Query (React Query) for server state
Rationale:
Choice: react-markdown with remark-gfm, rehype-highlight, rehype-raw
Rationale:
- `mermaid` library for flowcharts in roadmaps and architecture docs
- `reveal.js` embedded for HTML presentation viewing
- `@react-pdf/renderer` for downloadable document exports

Choice: Tiptap (based on ProseMirror)
Rationale:
- Markdown shortcuts (`#` for heading)
- Mentions (`@agent`, `/skill`)

Choice: Vercel AI SDK v6 streaming via useChat() hook and toUIMessageStreamResponse()
Rationale:
- `streamText()` produces SSE-compatible streams that `useChat()` consumes automatically

Client → useChat() hook sends message via POST /api/chat
Server → streamText() generates response with tools
Server → toUIMessageStreamResponse() converts to stream
Client → useChat() renders tokens incrementally
Choice: All backend logic runs as Next.js API routes deployed on Vercel — no separate backend service.
Rationale:
| Route | Method | Description |
|---|---|---|
| `/api/chat` | POST | Main agent/skill execution — streams LLM responses |
| `/api/drive/callback` | GET | Google Drive OAuth callback |
| `/api/drive/files` | GET/POST | File CRUD via Google Drive API |
| `/api/webhooks/stripe` | POST | Stripe subscription/payment events |
| `/api/webhooks/clerk` | POST | User lifecycle events |
| `/api/usage` | GET | Usage metering and limits |
| `/api/billing` | GET/POST | Subscription management |
| `/api/health` | GET | Health check |
Choice: Node.js 22 (LTS) with TypeScript
Rationale:
- Native `fetch`, `structuredClone`, and improved performance
Architecture:
User Request
Prompt Assembler (compiles system prompt + agent persona + skills + context)
Vercel AI SDK streamText() / generateText()
| - Provider: @ai-sdk/anthropic, @ai-sdk/openai, or @ai-sdk/google
| - Model: selected per user's BYOT configuration
| - Tools: file ops, search, sub-agent spawn
Tool Executor (handles tool_use calls against cloud storage)
Stream Response to Client (via toUIMessageStreamResponse())
Token Source Model — BYOT (Default) + Legionis Tokens (Convenience):
Users choose how they pay for AI. BYOT is the default and recommended path. Legionis Tokens is a convenience alternative for teams that don't want to manage API keys.
| Tier | BYOT (Default) | Legionis Tokens (Alternative) |
|---|---|---|
| Free | Platform-provided (limited) | Haiku only, 500 ops/month cap |
| Pro ($10/user/mo) | User connects API key — zero markup | Platform-provided — 30% service fee on provider rates |
| Team ($7/user/mo) | User connects API key — zero markup | Platform-provided — 30% service fee on provider rates |
| Full Org ($25/user/mo) | User connects API key (any provider) | Platform-provided — 30% service fee on provider rates |
BYOT (Bring Your Own Tokens) — Recommended:
Three-position global setting controlling the cost-quality tradeoff:
| Setting | Behavior | Use Case |
|---|---|---|
| Maximum Quality | All agents use top-tier models (Opus/GPT-4o) regardless of SKILL.md default | Critical strategic work, high-stakes deliverables |
| Balanced (default) | Each agent uses its SKILL.md-specified model (sonnet for OS, opus for Extension Teams) | Daily operations — optimized by agent designers |
| Maximum Efficiency | All agents use fastest models (Haiku/GPT-4o-mini) | High-volume tasks, cost-sensitive teams, quick lookups |
Implementation: applyQualityPreference() intercepts the agent's model preference before resolveModel() in provider-factory.ts. Stored as quality_preference on the workspaces table. Per-agent model overrides available as a post-launch enhancement (Week 15).
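A minimal sketch of that interception as a pure function; the enum values and model IDs below are illustrative assumptions, not the shipped `provider-factory.ts` mapping:

```typescript
type QualityPreference = "maximum_quality" | "balanced" | "maximum_efficiency";

// Hypothetical tier aliases -- real model IDs come from provider-factory.ts.
const TOP_TIER = "claude-opus";
const FAST_TIER = "claude-haiku";

// Returns the model an agent should actually use, given the workspace-level
// quality preference and the agent's SKILL.md default.
function applyQualityPreference(pref: QualityPreference, skillDefault: string): string {
  switch (pref) {
    case "maximum_quality":
      return TOP_TIER; // override everything upward
    case "maximum_efficiency":
      return FAST_TIER; // override everything downward
    case "balanced":
      return skillDefault; // trust the agent designer's choice
  }
}
```

Because the preference is resolved before `resolveModel()`, per-agent overrides (the Week 15 enhancement) would slot in as one more branch without touching provider selection.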
Prompt Caching: Anthropic's prompt caching remains beneficial for all users — skill templates, rules, and agent personas get cached, reducing API costs whether BYOT or Managed.
The Prompt Assembler is the core translation layer between the compiled agent/skill definitions and the Vercel AI SDK.
+---------------------------------------------+
Prompt Assembler
1. DETECT intent (skill, agent, gateway)
2. LOAD compiled personas from JSON
- Agent persona files (81 agents)
- Skill definitions (80+ skills)
- Rule files (relevant subset)
- User's context files (@file refs)
3. ASSEMBLE system prompt
- Agent Identity Protocol (if agent)
- Skill template (if skill)
- Relevant rules (not all -- selected)
- User context (decisions, bets, etc.)
4. CONFIGURE tools via AI SDK tool()
- File operations (Read, Write, Edit)
- Search operations (Glob, Grep)
- Skill invocation (nested)
- Sub-agent spawn
5. INVOKE via AI SDK
- streamText() or generateText()
- Provider selected per BYOT config
- Tools enabled
6. PROCESS tool calls
- Execute against cloud storage
- Return results to model
- Continue until complete
7. STREAM response to client
- toUIMessageStreamResponse()
- Tool call notifications inline
- Final response with ROI
+---------------------------------------------+
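Steps 1-3 of the assembler can be sketched as a pure function; the input shape and section headings are illustrative assumptions, not the production format:

```typescript
interface AssemblyInput {
  agentPersona?: string;  // compiled from SKILL.md (if agent invocation)
  skillTemplate?: string; // if skill invocation
  rules: string[];        // relevant subset, not all
  contextFiles: string[]; // user's decisions, bets, etc.
}

// Concatenate only the sections that apply to this invocation, in a fixed order.
function assembleSystemPrompt(input: AssemblyInput): string {
  const sections: string[] = [];
  if (input.agentPersona) sections.push(`## Agent Identity\n${input.agentPersona}`);
  if (input.skillTemplate) sections.push(`## Skill\n${input.skillTemplate}`);
  if (input.rules.length) sections.push(`## Rules\n${input.rules.join("\n")}`);
  if (input.contextFiles.length) sections.push(`## Context\n${input.contextFiles.join("\n")}`);
  return sections.join("\n\n");
}
```

Keeping assembly pure (no I/O) makes the prompt deterministic for a given set of inputs, which is what prompt versioning and per-version metrics depend on.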
Skill Loading Strategy:
- Skill definitions are precompiled at build time (`compile-prompts.ts`)

The Agent Orchestrator manages agent spawning, tool execution, and response synthesis.
Single-Agent Orchestration
User: "@pm create a PRD for authentication"
Orchestrator detects @agent syntax
Constructs agent prompt via Prompt Assembler:
- Agent Identity Protocol (emoji, name, response rules)
- Agent persona (compiled from SKILL.md)
- Relevant skills available to agent
- User's context files
- Task description
Calls streamText() with agent's system prompt + tools
- Agent can use tools (Read, Write, Edit, Glob, Grep)
- Agent can invoke skills (nested skill execution)
Streams agent response back to user's chat
- With attribution (emoji + display name)
- With ROI display
Logs interaction to context/interactions/
Multi-Agent Orchestration (Parallel + Meeting Mode)
User: "@plt should we prioritize webhooks or SDK?"
Gateway skill detects PLT invocation
Determines agent composition (e.g., VP Product, Dir PM, Dir PMM, BizOps)
Spawns N agents IN PARALLEL
- Each agent gets same prompt + their persona
- Each runs independently via separate generateText() calls
- Promise.all() waits for all agents
Meeting Mode Assembly:
- Individual responses shown with attribution
- Points of alignment extracted
- Points of tension extracted
- Synthesis generated (by orchestrator using a final generateText() call)
Aggregate ROI calculated
Full Meeting Mode response streamed to client
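The parallel spawn and assembly steps above can be sketched as follows, with `callAgent` standing in for a per-agent `generateText()` call (the function name and shapes are illustrative):

```typescript
interface AgentReply {
  agent: string;
  text: string;
}

// Spawn all agents in parallel; each gets the same question plus its persona.
// The synthesis step would be one further LLM call over the transcript.
async function runMeeting(
  question: string,
  agents: string[],
  callAgent: (agent: string, q: string) => Promise<string>,
): Promise<{ replies: AgentReply[]; transcript: string }> {
  const replies = await Promise.all(
    agents.map(async (agent) => ({ agent, text: await callAgent(agent, question) })),
  );
  // Attributed responses, ready for alignment/tension extraction.
  const transcript = replies.map((r) => `[${r.agent}] ${r.text}`).join("\n");
  return { replies, transcript };
}
```

Injecting `callAgent` keeps the orchestration logic testable without a live provider, and mirrors the design decision that agents run independently before synthesis.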
Key Design Decisions:
Choice: Single Next.js application with colocated API routes
Structure:
legionis/
+-- src/
| +-- app/ # Next.js App Router pages
| | +-- (auth)/ # Auth pages (sign-in, sign-up)
| | +-- (dashboard)/ # Main app (workspace, chat, settings)
| | +-- api/ # API routes (the backend)
| | | +-- chat/ # Agent/skill execution endpoint
| | | +-- drive/ # Google Drive proxy routes
| | | +-- webhooks/ # Stripe, Clerk webhooks
| | | +-- usage/ # Usage metering
| | | +-- billing/ # Subscription management
| | | +-- health/ # Health check
| | +-- layout.tsx # Root layout
| +-- components/ # React components
| +-- lib/ # Core libraries
| | +-- agent/ # Agent runtime (AI SDK wrappers)
| | +-- db/ # Drizzle schema + queries
| | +-- drive/ # Google Drive client
| | +-- prompt/ # Prompt compilation + caching
| | +-- stripe/ # Stripe helpers
| | +-- utils/ # Shared utilities
| +-- tools/ # Custom AI SDK tool definitions
| +-- personas/ # Compiled agent personas (JSON)
| +-- middleware.ts # Clerk auth middleware
+-- os-source/ # Git submodules
| +-- product-org-os/ # PUBLIC: 13 agents, 61 skills
| +-- extension-teams/ # PRIVATE: 68 agents, 34 knowledge packs
+-- scripts/
| +-- compile-prompts.ts # SKILL.md -> compiled persona JSON
| +-- seed-skills.ts # Seed skill metadata into DB
| +-- migrate.ts # DB migration runner
+-- drizzle/ # Migration files
+-- package.json
Rationale:
Choice: Vercel Pro plan for the entire application
Rationale:
Choice: PostgreSQL 17 hosted on Neon
Rationale:
| Table | Purpose | Key Columns |
|---|---|---|
| `users` | Auth, profile | id, email, name, tier, created_at |
| `workspaces` | User/team workspace | id, owner_id, name, storage_path |
| `conversations` | Chat threads | id, workspace_id, title, created_at |
| `messages` | Chat messages | id, conversation_id, role, content, tokens_in, tokens_out |
| `documents` | Generated files | id, workspace_id, path, type, skill_used, created_at |
| `usage_events` | Billing metering | id, user_id, operation_type, model, tokens, cost, timestamp |
| `context_entries` | Decisions, bets, etc. | id, workspace_id, type, content, metadata |
| `api_keys` | Encrypted BYOT keys | id, user_id, provider, encrypted_key, created_at |
| `subscriptions` | Stripe subscription state | id, user_id, stripe_id, tier, status |
Choice: Cloudflare R2
Rationale:
workspaces/{workspace_id}/
+-- context/
| +-- decisions/
| +-- bets/
| +-- feedback/
| +-- learnings/
| +-- interactions/
| +-- portfolio/
| +-- documents/
+-- deliverables/ # PRDs, roadmaps, presentations
+-- uploads/ # User-uploaded files
+-- .metadata.json # Workspace metadata
Choice: Typesense Cloud
Rationale:
Choice: Clerk
Rationale:
Choice: Stripe Billing + Stripe Checkout
Rationale:
+------------------------------------------+
Billing Service
1. Subscription Management
- Free: No Stripe subscription
- Pro: $10/user/mo
- Team: $7/user/mo (per seat)
- Add-on: +$5/user/mo
- Full Org: $25/user/mo (per seat)
2. Usage Tracking
- Count operations per user/month
- Free tier: enforce 500 ops cap
3. Webhook Processing
- payment_succeeded -> activate
- payment_failed -> grace period
- subscription_canceled -> downgrade
+------------------------------------------+
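The free-tier cap from the billing flow above can be sketched as a guard evaluated before each LLM invocation; tier names mirror the pricing table, while the function name is hypothetical:

```typescript
const FREE_TIER_CAP = 500; // ops per month, per the Free tier definition

type Tier = "free" | "pro" | "team" | "full_org";

// Decide whether a new operation is allowed before calling the provider.
// Paid tiers are uncapped: cost flows through BYOT or the 30% service fee.
function canExecuteOperation(tier: Tier, opsThisMonth: number): boolean {
  if (tier === "free") return opsThisMonth < FREE_TIER_CAP;
  return true;
}
```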
Stripe Products Configured (test mode):
- `price_1T0lKDCzBUEHrjdq...`

Choice: Sentry (errors) + PostHog (product analytics) + Better Stack (logs/uptime)
| Tool | Purpose | Cost |
|---|---|---|
| Sentry | Error tracking, performance monitoring, session replay | $26/mo (Team) |
| PostHog | Product analytics, feature flags, A/B testing, session recordings | Free up to 1M events |
| Better Stack | Log aggregation, uptime monitoring, status page | $24/mo |
Custom Telemetry (built in-house):
Following the Architecture Team review (2026-01-29), the following 8 items are P0 blockers that must be addressed before M0 development begins.
Implementation Details: See architecture-spec-m0.md for full technical specifications, code examples, and validation criteria.

User-provided LLM API keys (for the BYOT model) must be encrypted at rest using envelope encryption with KMS. Plaintext keys must never be stored or logged.
Owner: Security Architect | Risk if unaddressed: Data breach
PostgreSQL RLS policies must enforce multi-tenant data isolation on all workspace-scoped tables. Cross-tenant access must be impossible at the database level.
Owner: Data Architect + Security Architect | Risk if unaddressed: Data leakage between tenants
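One possible shape for such a policy, sketched as a config fragment (table and session-setting names are illustrative, not the final schema):

```sql
-- Illustrative: isolate rows by workspace. The app sets the current
-- workspace per transaction before querying.
ALTER TABLE context_entries ENABLE ROW LEVEL SECURITY;

CREATE POLICY workspace_isolation ON context_entries
  USING (workspace_id = current_setting('app.workspace_id')::uuid);

-- Per request: SET LOCAL app.workspace_id = '<workspace uuid>';
```

With RLS enabled, a query that omits the workspace filter returns nothing rather than another tenant's rows, which is what "impossible at the database level" requires.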
Input sanitization, sandboxed context injection, output validation, and rate limiting must prevent prompt injection attacks via user-provided files and context.
Owner: Security Architect + AI Architect | Risk if unaddressed: System compromise
Neon PostgreSQL connection pooling must be configured with appropriate pool sizes per phase (20 -> 50 -> 100 connections) and timeout settings.
Owner: Chief Architect + Data Architect | Risk if unaddressed: Performance degradation under load
URL path versioning (/v1/resource) with 6-month deprecation windows for breaking changes. Critical for future Enterprise API tier.
Owner: API Architect | Risk if unaddressed: Breaking changes for API consumers
Streaming response events must have defined types for tokens, tool calls, agent lifecycle, errors, and completion. Clients must be able to distinguish event types.
Owner: API Architect | Risk if unaddressed: Frontend integration issues
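One way to model the event contract so clients can discriminate on `type`; the event names and fields below are assumptions, not a finalized spec:

```typescript
// Discriminated union: the `type` field tells the client how to render.
type StreamEvent =
  | { type: "token"; text: string }
  | { type: "tool_call"; tool: string; args: unknown }
  | { type: "agent_started"; agent: string }
  | { type: "agent_finished"; agent: string }
  | { type: "error"; message: string }
  | { type: "done"; usage: { tokensIn: number; tokensOut: number } };

// Serialize one event as a named SSE frame.
function toSSE(event: StreamEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```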
Skill templates, agent personas, and rules must support semantic versioning, A/B testing, instant rollback, and per-version metrics tracking.
Owner: AI Architect | Risk if unaddressed: Slow prompt iteration, no rollback capability
Request IDs must propagate across agent spawns to enable debugging of multi-agent sessions. Full trace trees must be queryable in logs.
Owner: Cloud Architect | Risk if unaddressed: Cannot debug PLT or multi-agent issues
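Propagation can be sketched as a small trace context handed to every spawned agent; the field names are illustrative:

```typescript
import { randomUUID } from "node:crypto";

interface TraceContext {
  requestId: string;    // constant across the whole multi-agent session
  spanId: string;       // unique per agent invocation
  parentSpanId?: string; // links child spans into a trace tree
}

// Root context for the incoming user request.
function newTrace(): TraceContext {
  return { requestId: randomUUID(), spanId: randomUUID() };
}

// Every spawned sub-agent inherits requestId and records its parent,
// so logs can be reassembled into a queryable trace tree.
function childOf(parent: TraceContext): TraceContext {
  return { requestId: parent.requestId, spanId: randomUUID(), parentSpanId: parent.spanId };
}
```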
Legionis provides 81 agents organized across 11 specialized teams, with 9 team gateways for automatic routing.
The core team, providing comprehensive product management capabilities. One of eleven equal department teams.
| Agent | Emoji | Role |
|---|---|---|
| Product Manager | 📝 | Requirements, PRDs, user stories, delivery |
| VP Product | 📈 | Vision, portfolio, strategic decisions |
| Director PM | 📋 | Roadmaps, prioritization, team management |
| Director PMM | 📣 | GTM, positioning, competitive response |
| Product Marketing Manager | 🎯 | Messaging, campaigns, launch execution |
| CPO | 👑 | Enterprise strategy, org design |
| BizOps | 🧮 | Financial analysis, KPIs, business cases |
| Competitive Intelligence | 🔭 | Competitor analysis, win/loss, market intel |
| Product Operations | ⚙️ | Launch readiness, process, tooling |
| UX Lead | 🎨 | User research, design, usability |
| Value Realization | 💰 | Customer outcomes, adoption, churn |
| BizDev | 🤝 | Partnerships, market expansion |
| Product Mentor | 🎓 | Career development, PM coaching |
Gateways: @product (single/multi-agent routing), @plt (Product Leadership Team meetings)
| Agent | Emoji | Role |
|---|---|---|
| Director of Design | 🎨 | Design leadership, systems, standards |
| UI Designer | 🖼️ | Components, interfaces, layouts |
| Visual Designer | 🎭 | Branding, graphics, visual identity |
| Interaction Designer | 👆 | UX flows, prototypes, patterns |
| User Researcher | 👤 | Research, usability testing, personas |
| Motion Designer | 🎬 | Animation, transitions, micro-interactions |
| Design Coach | 🖌️ | Design career development |
Gateway: @design
| Agent | Emoji | Role |
|---|---|---|
| Chief Architect | 🏗️ | System architecture, technical strategy |
| API Architect | 🔌 | API design, integrations, contracts |
| Data Architect | 📊 | Data modeling, schemas, databases |
| Security Architect | 🔐 | Security review, auth, threat modeling |
| Cloud Architect | ☁️ | Infrastructure, deployment, scaling |
| AI Architect | 🤖 | AI/ML architecture, LLM patterns, RAG |
| Architecture Coach | 🔧 | Architecture career development |
Gateway: @architecture
| Agent | Emoji | Role |
|---|---|---|
| CMO | 🎙️ | Marketing strategy, brand, org design |
| Director of Marketing | 📢 | Campaign strategy, team leadership |
| Content Strategist | ✍️ | Content strategy, editorial calendar |
| Copywriter | ✏️ | Messaging, landing pages, copy |
| SEO Specialist | 🔍 | Organic search, keywords |
| CRO Specialist | 📈 | Conversion optimization, A/B tests |
| Paid Media Manager | 💰 | Ads, media buying, campaigns |
| Email Marketer | 📧 | Email campaigns, sequences, automation |
| Social Media Manager | 📱 | Social media, community |
| Growth Marketer | 🚀 | Growth strategy, acquisition loops |
| Market Researcher | 🔬 | Market sizing, surveys, analysis |
| Video Producer | 🎥 | Video content, production |
| PR/Comms Specialist | 📣 | PR, communications, press |
| Presentation Designer | 📑 | Slide decks, pitch materials |
| Infographic Designer | 📊 | Data visualization, infographics |
| Marketing Coach | 🎤 | Marketing career development |
Gateway: @marketing
| Agent | Emoji | Role |
|---|---|---|
| CFO | 💰 | Financial strategy, capital allocation |
| Director of Finance | 💼 | Financial planning, budgets |
| FP&A Analyst | 📉 | Forecasting, variance analysis |
| Revenue Analyst | 💵 | Revenue modeling, unit economics |
| Investor Relations | 🏦 | Fundraising, investor materials |
| Financial Controller | 🧾 | Reporting, audit, controls |
| Treasury Analyst | 💳 | Cash flow, working capital |
| Tax Specialist | 📋 | Tax planning, R&D credits |
| Finance Coach | 🏫 | Finance career development |
Gateway: @finance
| Agent | Emoji | Role |
|---|---|---|
| General Counsel | ⚖️ | Legal strategy, risk governance |
| Director Legal Affairs | 📂 | Legal operations, team management |
| Contracts Counsel | 📜 | Contract review, vendor agreements |
| Privacy Counsel | 🛡️ | GDPR, CCPA, data protection |
| IP Counsel | 💡 | Patents, trade secrets, licensing |
| Compliance Officer | ✅ | SOC2, ISO, regulations |
| Employment Counsel | 👔 | Employment law, policies |
| Legal Coach | 📚 | Legal career development |
Gateway: @legal
| Agent | Emoji | Role |
|---|---|---|
| COO | 🏢 | Operational strategy, org design |
| Director of Operations | 🔄 | Process improvement, management |
| Program Manager | 📋 | Cross-functional programs |
| Project Manager | 📊 | Project planning, execution |
| Procurement Specialist | 🛒 | Vendor management, RFPs |
| Process Engineer | ⚙️ | Process mapping, automation |
| Risk Manager | 🛡️ | Enterprise risk, BCP/DR |
| Operations Coach | 🎓 | Operations career development |
Gateway: @operations
| Agent | Emoji | Role |
|---|---|---|
| CEO | 🎯 | Enterprise strategy, vision, board prep |
| Executive Coach | 🎓 | Leadership development |
Gateway: @executive (also routes to CPO, CFO, COO, CMO, CIO, General Counsel as needed)
| Agent | Emoji | Role |
|---|---|---|
| Head of Corp Dev | 🏛️ | Corp dev strategy, deal pipeline |
| M&A Analyst | 🔍 | Acquisitions, due diligence, valuation |
| Strategic Partnerships | 🤝 | Alliances, JVs, partnerships |
| Corporate Venture | 💎 | CVC, startup investments |
| Corp Dev Coach | 🎓 | Corp dev career development |
Gateway: @corpdev
| Agent | Emoji | Role |
|---|---|---|
| CIO | 💻 | IT strategy, COBIT governance |
| Director of IT | 🖥️ | IT operations, ITIL |
| IT Security Policy | 🔒 | NIST CSF, CIS Controls |
| Enterprise Systems | 🏢 | ERP/CRM, SaaS portfolio |
| Data Governance | 📊 | DAMA, data quality, MDM |
| IT Coach | 🎓 | IT career development |
Gateway: @it
| Metric | Count |
|---|---|
| Total Agents | 81 |
| Teams | 11 |
| Team Gateways | 9 |
| OS Gateways | 2 |
| Total Gateways | 11 |
| Skills | 80+ |
| Knowledge Packs | 34 |
Legionis uses a stone-based dark palette with warm amber accents:
| Token | Hex | Usage |
|---|---|---|
| `stone-900` | `#1c1917` | Primary background (dark mode) |
| `stone-800` | `#292524` | Card backgrounds, panels |
| `stone-700` | `#44403c` | Borders, dividers |
| `stone-600` | `#57534e` | Muted text, secondary UI |
| `stone-400` | `#a8a29e` | Body text |
| `stone-200` | `#e7e5e4` | Primary text |
| `stone-50` | `#fafaf9` | High-emphasis text |
| `amber-600` | `#d97706` | Primary accent (buttons, links, active states) |
| `gold-500` | `#f59e0b` | Secondary accent (highlights, badges) |
| Role | Font | Weight |
|---|---|---|
| Headings | Space Grotesk | 600 (Semi-Bold), 700 (Bold) |
| Body | Inter | 400 (Regular), 500 (Medium) |
| Code / Technical | JetBrains Mono | 400 (Regular) |
The Formation L — a geometric "L" mark representing formation, structure, and the legion concept. Used as favicon, app icon, and brand mark.
| Decision | Choice | Key Rationale |
|---|---|---|
| Monolith vs Microservices | Monolith (Next.js API routes) | Single deployment, shared types, zero API contract drift, lower ops cost |
| Serverless vs Containers | Serverless (Vercel functions) | Auto-scaling, no container management, pay-per-use at scale |
| Edge vs Origin | Origin (with CDN for static) | LLM API calls require server; edge unnecessary for launch scale |
| SSE vs WebSocket | AI SDK streaming (SSE under the hood), WebSocket for presence later | AI SDK handles streaming natively; WebSocket only when needed |
| SQL vs NoSQL | SQL (PostgreSQL) | Transactional integrity for billing, well-structured domain model |
| ORM vs Raw SQL | ORM (Drizzle) | Type safety with minimal abstraction, lightweight |
| Self-hosted vs Managed | Managed (Neon, R2, Typesense Cloud) | Minimize ops overhead for small team |
| LLM SDK | Vercel AI SDK v6 | Multi-provider support (BYOT), native streaming, built for Next.js |
Note: See architecture-stack.md for fully researched and verified pricing from official sources (January 2026).

| Service | Launch | Growth (1K users) | Scale (5K users) | Source |
|---|---|---|---|---|
| Vercel (app + API) | $20/mo | $20/mo | $20/mo | [Vercel Pricing](https://vercel.com/pricing) |
| Neon (database) | $19/mo | $19-69/mo | $69/mo | [Neon Pricing](https://neon.com/pricing) |
| Cloudflare R2 (storage) | <$1/mo | <$1/mo | $3/mo | [R2 Pricing](https://developers.cloudflare.com/r2/pricing/) |
| Typesense Cloud (search) | $40/mo | $60/mo | $100/mo | [Typesense Pricing](https://cloud.typesense.org/pricing) |
| Clerk (auth) | $25/mo | $25-100/mo | $25-825/mo | [Clerk Pricing](https://clerk.com/pricing) |
| Sentry | $29/mo | $29/mo | $29/mo | [Sentry Pricing](https://sentry.io/pricing/) |
| PostHog | $0/mo | $0/mo | $0/mo | [PostHog Pricing](https://posthog.com/pricing) |
| Better Stack | $29/mo | $29/mo | $29/mo | [Better Stack Pricing](https://betterstack.com/pricing) |
| Infrastructure Total | ~$162/mo | ~$280/mo | ~$475/mo | |
Key Insight: Eliminating the separate backend service (Railway, ~$50-200/mo) cuts total infrastructure cost by that amount vs. the v2.1 architecture. The Vercel monolith approach is significantly cheaper.
Cloud storage APIs (Google Drive, OneDrive, Dropbox) are free — they use quota/rate limit models, not pay-per-use. This is a major cost advantage for the cloud-storage-first architecture.
BYOT Cost Model: With BYOT, the platform has zero LLM API costs for paid tiers. Users pay their own LLM providers directly. This eliminates COGS risk entirely and makes unit economics straightforward: infrastructure cost / number of users.
Free Tier COGS: For free tier users (platform-provided Haiku), estimated COGS is ~$0.50/user/month at 500 ops cap with Haiku-only routing.
Legionis is cloud-storage-first — users connect their existing Google Drive, OneDrive, or Dropbox rather than uploading files to a proprietary storage system.
Rather than storing a path like `context/decisions/DR-2026-001.md`, we store the cloud file ID. Future references resolve via ID, not path: the user can rename or reorganize and all context references remain valid.

1. User signs up (email or social login via Clerk)
User clicks "Connect Storage" -> Google Drive (primary), OneDrive, or Dropbox
OAuth consent screen (we request minimal scopes: file read/write in a specific folder)
User selects or creates a "Legionis Workspace" folder
Backend creates workspace structure in that folder
User is ready to use skills and agents
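The stable-ID principle above can be sketched in a few lines. This is illustrative only: `FakeDrive` stands in for the real Google Drive API, and the `ContextRef` shape is an assumption, not the shipped schema.

```typescript
// Illustrative sketch: context references persist a stable cloud file ID,
// not a path, so renames/moves in the user's Drive never break them.
// FakeDrive is a stand-in for the real Google Drive API.

type DriveFile = { id: string; name: string; parent: string };

class FakeDrive {
  private files = new Map<string, DriveFile>();

  create(id: string, name: string, parent: string): string {
    this.files.set(id, { id, name, parent });
    return id; // the stable ID Legionis would persist
  }

  rename(id: string, newName: string): void {
    const f = this.files.get(id);
    if (f) f.name = newName;
  }

  // Analogous to Drive's files.get — resolution is by ID, never by path.
  get(id: string): DriveFile | undefined {
    return this.files.get(id);
  }
}

// A context entry (e.g., a decision record) keeps only the file ID.
type ContextRef = { decisionId: string; fileId: string };

function resolveRef(drive: FakeDrive, ref: ContextRef): DriveFile | undefined {
  return drive.get(ref.fileId);
}

const drive = new FakeDrive();
const fileId = drive.create("abc123", "DR-2026-001.md", "context/decisions");
const ref: ContextRef = { decisionId: "DR-2026-001", fileId };

// User reorganizes their workspace — the reference still resolves.
drive.rename(fileId, "DR-2026-001-cloud-storage-first.md");
const resolved = resolveRef(drive, ref);
```

The key property: the rename happens entirely on the user's side, and resolution afterward still succeeds because nothing path-dependent was stored.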
Legionis Workspace/ <- User-selected or created folder
+-- context/ <- Organizational memory
| +-- decisions/
| +-- bets/
| +-- feedback/
| +-- learnings/
| +-- interactions/
| +-- portfolio/
| +-- documents/
+-- deliverables/ <- Generated PRDs, roadmaps, etc.
+-- .workspace.json <- Metadata
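The tree above could be materialized by a small helper at workspace creation. A sketch, with folder names taken from the tree; the helper name and return shape are illustrative:

```typescript
// Sketch: derive the folder paths the backend would create inside the
// user-selected "Legionis Workspace" folder. Names mirror the tree above.

const CONTEXT_SUBFOLDERS = [
  "decisions", "bets", "feedback", "learnings",
  "interactions", "portfolio", "documents",
] as const;

function workspaceFolders(root: string): string[] {
  return [
    `${root}/context`,
    ...CONTEXT_SUBFOLDERS.map((s) => `${root}/context/${s}`),
    `${root}/deliverables`,
  ];
}

const folders = workspaceFolders("Legionis Workspace");
// .workspace.json (metadata) would be written as a file at the root;
// its exact shape (version, createdAt, provider, ...) is illustrative.
```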
| Provider | API Cost | Rate Limits |
|---|---|---|
| Google Drive | FREE | 12,000 queries/day (default) |
| Microsoft OneDrive | FREE | Throttled above 10K requests/10min |
| Dropbox | FREE | Rate limited per app |
All three providers use quota-based models, not pay-per-call. This is a significant cost advantage.
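Quota-based limits mean the failure mode is a retryable rejection, not a bill. A minimal retry sketch, assuming (as an illustration) that 403/429 are the retryable statuses and that errors carry a `status` field:

```typescript
// Sketch: quota-based providers signal overuse with HTTP 403/429 rather
// than billing per call, so the client retries with exponential backoff.
// The retryable status codes and error shape are assumptions.

function isRetryable(status: number | undefined): boolean {
  return status === 403 || status === 429;
}

// Deterministic upper bound for the sleep before retry `attempt` (0-based);
// real code would pick a random delay in [0, backoffCapMs(attempt)).
function backoffCapMs(attempt: number, baseMs = 200, maxMs = 5000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

async function withBackoff<T>(op: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const status = (err as { status?: number }).status;
      if (!isRetryable(status) || attempt >= maxAttempts - 1) throw err;
      const delay = Math.random() * backoffCapMs(attempt); // full jitter
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}
```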
Legionis agents can connect to external professional tools to perform real operations — creating Jira tickets, posting Slack messages, querying analytics, reviewing contracts, and more. This is a force multiplier: agents don't just advise, they act.
Users click "Connect [Tool]" → standard OAuth 2.0 flow → Legionis stores encrypted tokens → agents call tool APIs directly using stored credentials. The OS integration templates (15 templates, 47 platforms) inform what API calls agents make; the platform provides the authenticated HTTP client.
User clicks "Connect Jira" in Settings → Connections
OAuth 2.0 Authorization Code flow
Legionis stores encrypted tokens in tool_connections table
Agent needs to create Jira ticket →
Platform checks: user has Jira connected?
|--- YES → Execute API call with stored tokens → Return result to agent
|--- NO → Graceful fallback: "Here are the tickets to create manually: [list]"
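The YES/NO branch above can be sketched directly. The connection lookup, result shape, and ticket-key numbering are illustrative stubs, not the shipped implementation:

```typescript
// Sketch of the graceful-fallback branch: if the user has connected the
// tool, the agent executes the operation; otherwise it returns a manual
// checklist. The Jira call itself is stubbed out.

type Connection = { service: string; accessToken: string };

type ToolResult =
  | { mode: "executed"; message: string }
  | { mode: "manual"; message: string; items: string[] };

function createTickets(connections: Connection[], tickets: string[]): ToolResult {
  const jira = connections.find((c) => c.service === "jira");
  if (jira) {
    // Real code would POST to the Jira REST API using jira.accessToken.
    const keys = tickets.map((_, i) => `PROJ-${101 + i}`);
    return { mode: "executed", message: `Created ${keys.join(", ")} in your Jira backlog.` };
  }
  return {
    mode: "manual",
    message: "Here are the tickets to create manually:",
    items: tickets,
  };
}
```

Because the fallback path returns a complete deliverable rather than an error, zero connected tools is a fully working state.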
Why Hybrid OAuth (not MCP in production):
| Category | Template | Team(s) | Platforms | Auth Type |
|---|---|---|---|---|
| Project Management | project-management.md | Operations, Product | Jira, Asana, Monday.com, Linear | OAuth2 / API key |
| Process & Collaboration | process-collaboration.md | Operations | Miro, Lucidchart, Notion, Confluence | OAuth2 |
| Design | figma.md | Design | Figma | API key (PAT) |
| Code & Repository | github-arch.md | Architecture | GitHub | OAuth2 |
| Analytics | analytics.md | Marketing | GA4, Amplitude, Mixpanel, PostHog, Heap | Service acct / API key |
| SEO | seo-tools.md | Marketing | Ahrefs, Semrush, Google Search Console, Moz | API key / Service acct |
| Email Marketing | email-platform.md | Marketing | Mailchimp, SendGrid, HubSpot, Klaviyo, Customer.io | API key / OAuth2 |
| Accounting | accounting.md | Finance | Xero, QuickBooks Online, FreshBooks, Sage | OAuth2 (complex) |
| SaaS Metrics | saas-metrics.md | Finance | Stripe Billing, ChartMogul, Baremetrics, ProfitWell | API key |
| Contract Management | contract-management.md | Legal | DocuSign CLM, PandaDoc, Ironclad, Juro | OAuth2 / JWT |
| Compliance | compliance-platforms.md | Legal | Vanta, Drata, OneTrust, TrustArc | API key / OAuth2 |
| CRM & Deal Tracking | crm-deal-tracking.md | Corp Dev | Salesforce, HubSpot, Pipedrive, DealCloud | OAuth2 (complex) |
| Data & Research | data-research.md | Corp Dev | PitchBook, Crunchbase, CB Insights, Apollo.io, Hunter.io | API key |
| ITSM | itsm.md | IT Governance | ServiceNow, Jira Service Management, Freshservice | OAuth2 (complex) |
| Identity & Access | identity-access.md | IT Governance | Okta, Microsoft Entra ID, Google Workspace Admin | OAuth2 (admin consent) |
Every team benefits from at least one integration category:
| Team | Integration Categories | Key Workflows |
|---|---|---|
| Product | Project Mgmt, Communication | PRD → auto-create Jira stories; share decisions to Slack |
| Design | Figma | Design review with real component data; brand consistency checks |
| Architecture | GitHub | Security review against actual code; PR analysis; issue creation |
| Marketing | Analytics, SEO, Email, Communication | Campaign stats, funnel analysis, keyword research, A/B test results |
| Finance | Accounting, SaaS Metrics | Financial health review, revenue modeling, cash flow analysis |
| Legal | Contract Mgmt, Compliance | Contract review, audit preparation, compliance posture monitoring |
| Operations | Project Mgmt, Process & Collaboration | Sprint health, cross-project dependencies, SOP management |
| Corp Dev | CRM, Data & Research | M&A pipeline, comparable analysis, partnership tracking |
| IT Governance | ITSM, Identity & Access | Incident analysis, access reviews, security policy audit |
Agents ALWAYS work without integrations. Connections enhance, never gate, agent capability.
| Integration Connected | Agent Behavior |
|---|---|
| YES | Agent calls API directly: "I created PROJ-101, PROJ-102, and PROJ-103 in your Jira backlog." |
| NO | Agent produces text deliverable: "Here are the user stories ready for your project tracker: [table]. Next Steps (Manual): Create these in Jira." |
| Expired token | Agent detects 401, notifies: "Your Jira connection has expired. Reconnect in Settings → Connections." Falls back to text output. |
In-chat amber nudge when an unconnected tool would help: "I can create these Jira tickets directly if you connect Jira in Settings → Connections."
- Minimal OAuth scopes requested (e.g., read:jira-work, write:jira-work — not broad admin)
- Every tool operation recorded in an integration_audit_log table (timestamp, user_id, service, operation, status)
- Dedicated settings page (Settings → Connections) with marketplace-style grid
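The audit trail can be sketched with the fields named above. The in-memory array stands in for the integration_audit_log table, and the function name is illustrative:

```typescript
// Sketch: every tool operation is appended to an audit trail with the
// fields the PRD names (timestamp, user_id, service, operation, status).

type AuditEntry = {
  timestamp: string;
  userId: string;
  service: string;
  operation: string;
  status: "success" | "error";
};

const auditLog: AuditEntry[] = []; // stand-in for the database table

function logToolCall(
  userId: string,
  service: string,
  operation: string,
  status: AuditEntry["status"],
): AuditEntry {
  const entry: AuditEntry = {
    timestamp: new Date().toISOString(),
    userId,
    service,
    operation,
    status,
  };
  auditLog.push(entry);
  return entry;
}
```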
| Wave | Timeline | Integrations | Platforms |
|---|---|---|---|
| Wave 1 | Weeks 11-12 (post-launch) | Project Mgmt, Communication, Code | Jira, Linear, Slack, GitHub |
| Wave 2 | Week 12 | Design, Marketing | Figma, GA4, Mailchimp/SendGrid/HubSpot, Search Console |
| Wave 3 | Weeks 13-14 | Finance, Legal, IT, Corp Dev, Knowledge | Xero/QBO, Stripe, DocuSign/PandaDoc, Vanta/Drata, ServiceNow/JSM, Okta/Entra, Salesforce/HubSpot CRM, PitchBook/Crunchbase, Notion/Confluence, Miro |
Tool integrations are not a launch blocker. The graceful fallback pattern means launch works perfectly with zero connected tools.
0:00 - User lands on legionis.ai
0:30 - User clicks "Get Started Free"
1:00 - User signs up with Google (one-click via Clerk)
1:30 - User clicks "Connect Google Drive"
2:00 - User grants OAuth permissions (pre-scoped to single folder)
2:30 - User selects/creates workspace folder
3:00 - Workspace initialized with context structure
3:30 - User sees main interface (chat panel, file browser, team selector)
4:00 - User types /prd my first product idea
5:00 - PRD generated, saved to Drive, visible in file browser

TIME TO VALUE: ~5 minutes
| Day | Activity | Context Accumulated |
|---|---|---|
| 1 | Creates first PRD with @pm | 1 document, 1 interaction |
| 2 | Makes a pricing decision with /decision-record | 1 decision |
| 3 | Asks @marketing for positioning help | 1 marketing deliverable |
| 4 | Gets @finance to review budget assumptions | 1 financial analysis |
| 5 | Uses @plt for a strategic discussion | 1 multi-agent session |
| 7 | Uses /context-recall pricing — sees all past decisions | Aha moment: context persists across teams |
Beyond week one, usage expands to @legal for contract review and @architecture for tech decisions.

Legionis is designed for power users who appreciate CLI-style interaction — not a dumbed-down GUI that hides the system's capabilities.
The main input is a text field that accepts:
| Syntax | Action | Example |
|---|---|---|
| /skill-name | Invoke skill | /prd authentication feature |
| @agent-name | Spawn agent | @pm review this PRD |
| @team-gateway | Route to team | @marketing plan the launch campaign |
| @file.md | Reference file | @pm based on @research.md create a PRD |
| Natural language | Intelligent routing | "Help me decide on pricing" -> routes to BizOps |
Autocomplete: Typing / shows a searchable skill palette (Cmd+K style). Typing @ shows agents organized by team, with recent agents prioritized.
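The input syntax above amounts to classifying the first token. A sketch of that dispatch — the return shape and routing labels are illustrative, and real parsing would also distinguish agent mentions from @file.md references:

```typescript
// Sketch: classify the input bar's leading token into skill (/name),
// agent/team (@name), or natural language for intelligent routing.

type ParsedInput =
  | { kind: "skill"; name: string; args: string }
  | { kind: "agent"; name: string; args: string }
  | { kind: "natural"; text: string };

function parseInput(raw: string): ParsedInput {
  const text = raw.trim();
  // Dots allowed in names so @research.md tokenizes; a fuller parser
  // would then separate file references from agent mentions.
  const m = /^([\/@])([\w.-]+)\s*(.*)$/s.exec(text);
  if (!m) return { kind: "natural", text };
  const [, sigil, name, args] = m;
  return sigil === "/"
    ? { kind: "skill", name, args }
    : { kind: "agent", name, args };
}
```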
Voice input for hands-free operation:
Visual panel showing available teams and their agents:
+----------------------------------------------+
TEAMS
+----------------------------------------------+
> Product Org (13) > Design (7)
> Architecture (7) > Marketing (16)
> Finance (9) > Legal (8)
> Operations (8) > Executive (2+)
> Corp Dev (5) > IT Governance (6)
+----------------------------------------------+
PRODUCT ORG
📝 PM 📈 VP Product
📣 Dir PMM 📋 Dir PM
🧮 BizOps 🔭 Competitive Intel
🎨 UX Lead 💰 Value Realization
⚙️ ProdOps 🤝 BizDev
👥 PLT (Team) 🏛️ Product (Gateway)
+----------------------------------------------+
Clicking an agent inserts @agent in the input; clicking a team inserts its gateway (@marketing, @finance, etc.).

| Shortcut | Action |
|---|---|
| Cmd+K | Open skill palette |
| Cmd+Shift+A | Open agent/team selector |
| Cmd+Enter | Submit message |
| Cmd+/ | Focus input |
| Esc | Cancel / close panels |
| Cmd+1-9 | Switch between recent conversations |
/prd is faster than clicking through menus. Typing @marketing routes to the right specialist instantly.

As a professional using Legionis,
I want to invoke any of the 80+ skills via /skill-name syntax in a chat interface,
So that I can generate documents, analyses, and deliverables without using a CLI.
Acceptance Criteria:
- Given I type /prd authentication feature in the chat, when the system processes my request, then a complete PRD is generated and saved to my workspace

As a professional,
I want to interact with any of the 81 agents via @agent-name or /agent-name syntax,
So that I can get domain-specific advice from specialists across every business function.
Acceptance Criteria:
- Given I type @pm what do you think about this feature scope?, when the system responds, then Claude adopts the Product Manager persona (emoji, first person, conversational)
- Given I switch personas mid-conversation (@pm then @cfo), when the new persona responds, then the conversation context carries over but the persona changes
- Given I invoke a team gateway (@marketing, @finance), when the system processes, then it routes to the most appropriate specialist(s) for the request

As a professional, I want to save and recall organizational context (decisions, bets, feedback, learnings), So that my decisions compound over time and I never lose institutional knowledge.
Acceptance Criteria:
- Given I run /context-save with a decision, when it completes, then the decision is saved with proper ID format (DR-YYYY-NNN) and indexed
- Given I run /context-recall pricing, when it searches, then all pricing-related decisions, bets, and learnings are returned
- Given documents are saved, then they are registered in context/documents/index.md

As a professional, I want to create, update, and find documents using natural language, So that I can manage my documentation efficiently.
Acceptance Criteria:
- Given I type /prd authentication, when the system processes, then it creates a new PRD document (Create mode)
- Given I type update the authentication PRD to add MFA, when the system detects Update mode, then it modifies the existing PRD
- Given I type find all PRDs, when the system detects Find mode, then it lists all PRD documents in my workspace

As a professional, I want to browse, organize, and manage my workspace files through a visual file explorer, So that I can navigate my documentation intuitively.
Acceptance Criteria:
As a new user, I want to sign up, create a workspace, and start using agents immediately, So that I can evaluate the product with minimal friction.
Acceptance Criteria:
As a professional evaluating Legionis, I want to upgrade from Free to a paid tier seamlessly, So that I can unlock full model access and team features when I see value.
Acceptance Criteria:
As a paid tier user, I want to connect my own LLM API credentials, So that I can use my existing API contracts, control costs directly, and maintain data governance.
Acceptance Criteria:
Performance: Key validation < 5 seconds
As a user exploring Legionis's capabilities, I want to browse all 11 teams and their agent members in a visual selector, So that I can discover which agents are available and route my request to the right specialist.
Acceptance Criteria:
- Given I click an agent in the selector, then @agent-name is prefilled in the input bar
- Given I click a team, then @team-gateway is prefilled (e.g., @marketing)

As a user on any tier, I want to see input/output token counts for each interaction, So that I can understand my usage and manage API costs (for paid tiers with BYOT).
Acceptance Criteria:
As a professional using Legionis daily, I want to save, load, list, rename, and delete conversations, So that I can return to previous work and organize my interactions.
Acceptance Criteria:
As a user who doesn't want to manage API keys, I want to use Legionis-provided AI tokens with transparent pricing, So that I can start using Legionis immediately without external account setup.
Acceptance Criteria:
- Token usage is metered via usage_events reported to Stripe.

As a user who wants to control the cost-quality tradeoff, I want to set a global model quality preference, So that I can optimize for either the best output quality or the lowest cost.
Acceptance Criteria:
As a professional using Legionis alongside other work tools, I want to connect my Jira, Slack, GitHub, and other tools so agents can use them, So that agents can perform real operations in my existing systems instead of producing manual action items.
Acceptance Criteria:
Performance: OAuth flow round-trip < 10 seconds. Tool API calls < 5 seconds p95.
As a professional,
I want to delegate work to agents using @agent syntax,
So that agents can work autonomously (reading files, using skills) and return completed deliverables.
Acceptance Criteria:
- Given I type @pm create a PRD for authentication based on @research.md, when the system processes, then the PM agent spawns, reads research.md from my workspace, creates a PRD, and returns a conversational response
- Given I type @cfo review our Q3 budget projections, when the CFO agent responds, then it provides financial perspective consistent with its persona

As a professional, I want to ask questions without specifying which agent to use, So that the system routes my question to the right domain expert automatically.
Acceptance Criteria:
As a professional, I want to see a history of all agent interactions, skill invocations, and their outcomes, So that I can track what my AI workforce has done and reference past work.
Acceptance Criteria:
As a VP of Product,
I want to invoke @plt should we prioritize webhooks or SDK? and get multi-perspective input from VP Product, Director PM, Director PMM, and BizOps,
So that I can make strategic decisions with cross-functional input without scheduling a meeting.
Acceptance Criteria:
- Given I invoke @plt, when the system processes, then it spawns 3-4 agents in parallel based on the topic

As a professional,
I want to invoke @marketing plan the product launch campaign or @finance build the revenue model and have the system route to the right specialists,
So that complex cross-functional work is orchestrated automatically across any business function.
Acceptance Criteria:
- Given I invoke any team gateway (@product, @marketing, @finance, @legal, @operations, @architecture, @design, @executive, @corpdev, @it), when the system analyzes the request, then it determines which agents from that team are needed
- Given I type @product launch the freemium tier, when it processes, then it routes to PM, PMM, and ProdOps based on RACI

As a PM agent working on a PRD, I want to spawn a UX Lead sub-agent for user research insights or a Security Architect for threat modeling, So that I can incorporate cross-domain expertise into my deliverables.
Acceptance Criteria:
As a user viewing a multi-agent response, I want to see each agent's perspective clearly separated and attributed, So that I can understand individual viewpoints before reading the synthesis.
Acceptance Criteria:
| Operation | Target (p95) | Maximum (p99) | Degraded | Blocker |
|---|---|---|---|---|
| Skill execution | <5s | <8s | >8s | >15s |
| Inline agent response | <10s | <15s | >15s | >30s |
| Agent delegation | <15s | <20s | >20s | >45s |
| PLT session (3 agents) | <60s | <90s | >90s | >120s |
| File explorer load | <1s | <2s | >2s | >5s |
| Document preview render | <500ms | <1s | >1s | >3s |
| Search results | <200ms | <500ms | >500ms | >2s |
| Auth (login/signup) | <3s | <5s | >5s | >10s |
| Requirement | Implementation | Phase |
|---|---|---|
| Data isolation | Workspace-scoped storage paths, row-level security in PostgreSQL | M0 |
| Encryption at rest | R2 default encryption (AES-256), Neon default encryption | M0 |
| Encryption in transit | TLS 1.3 for all connections | M0 |
| Authentication | Clerk with JWT validation on every API route | M0 |
| Authorization | Tier-based access control (model gating, operation caps) | M0 |
| RBAC | Clerk Organizations (Owner/Editor/Viewer) | Launch |
| SSO/SAML | Clerk Enterprise SSO | Enterprise tier |
| Audit logs | All user actions logged with timestamp, user, action, resource | Launch |
| Penetration testing | Annual third-party pen test | Post-launch |
| SOC 2 Type II | Vanta compliance automation | Enterprise tier |
| Dimension | Launch | Growth (1K users) | Scale (5K users) |
|---|---|---|---|
| Concurrent users | 50 | 500 | 2,000 |
| Operations/day | 500 | 5,000 | 20,000 |
| Storage per workspace | 50MB-500MB | 500MB-5GB | 5GB-50GB |
| Total storage | 25GB | 2.5TB | 100TB |
| API requests/second | 10 | 50 | 200 |
| Database connections | 20 | 50 | 100 |
| Requirement | Standard | Phase |
|---|---|---|
| Keyboard navigation | WCAG 2.1 AA | Launch |
| Screen reader support | ARIA labels on all interactive elements | Launch |
| Color contrast | 4.5:1 minimum (WCAG AA) | Launch |
| Focus indicators | Visible focus rings on all interactive elements | Launch |
| Responsive design | Functional on 768px+ viewport (tablet and desktop) | Launch |
| Mobile experience | Read-only document viewing on mobile | Post-launch |
| Metric | Target |
|---|---|
| Uptime (monthly) | >99.5% (launch), >99.9% (scale) |
| Data durability | 99.999999999% (11 nines, R2 default) |
| Recovery Point Objective (RPO) | <1 hour |
| Recovery Time Objective (RTO) | <4 hours |
| Backup frequency | Daily (database), continuous (object storage) |
| Metric | Target | Red Flag |
|---|---|---|
| All 80+ skills functional | 100% | <95% |
| All 81 agents functional | 81/81 | <81 |
| All 11 gateways functional | 11/11 | <11 |
| Skill execution time (p95) | <5s | >8s |
| Agent delegation time (p95) | <15s | >20s |
| PLT session time (p95) | <60s | >90s |
| Context save/recall reliability | >99% | <95% |
| Document generation success rate | >95% | <90% |
| Agent spawn success rate | >95% | <90% |
| Meeting Mode attribution accuracy | 100% | <100% |
| Onboarding completion rate | >40% | <25% |
| Metric | Target | Red Flag |
|---|---|---|
| Free-to-Pro conversion | >5% | <3% |
| Monthly churn (Pro tier) | <8% | >12% |
| NPS | >40 | <30 |
| Free users | 2,000 | <1,000 |
| Pro users ($10/mo) | 250 | <100 |
| Team users ($7/mo) | 100 | <30 |
| Full Org users ($25/mo) | 25 | <10 |
| MRR | $5K+ | <$2K |
| Extension team usage (non-product) | >40% of sessions | <20% |
| Metric | Target | Red Flag |
|---|---|---|
| Free tier COGS per user | <$0.50/mo | >$1/mo |
| Infrastructure cost per paid user | <$5/mo | >$10/mo |
| BYOT: API key setup success rate | >95% | <85% |
| Managed Tokens: gross margin on token revenue | >25% (after Stripe fees) | <20% |
| Managed Tokens: average token spend per user | [TBD — monitor post-launch] | <$5/mo (too low to sustain) |
| BYOT vs Managed adoption split | 60/40 BYOT-favored | <30% BYOT (undermines USP story) |
| Quality/Efficiency toggle adoption | >50% users change from Balanced | <10% (feature unused) |
Risk: Users face 80+ skills and 81 agents on Day 1, feel overwhelmed, and churn before discovering value.
Probability: High | Impact: High
Mitigations:
- Nudge users toward team gateways like @marketing or @finance — let the gateway route to the right agent

Risk: Users resist providing their own API keys, expecting the platform to handle LLM costs.
Probability: Medium | Impact: Medium (mitigated by Legionis Tokens)
Mitigations:
Risk: Building the orchestrator (especially parallel spawning + Meeting Mode) takes longer than planned and delays launch.
Probability: High | Impact: High
Mitigations:
Risk: LLM API rate limits prevent concurrent agent spawning, especially for PLT sessions with 3-4 parallel agents.
Probability: Medium | Impact: High
Mitigations:
Risk: Anthropic, OpenAI, or others build similar multi-agent workforce features directly into their platforms.
Probability: Low-Medium | Impact: High
Mitigations:
Risk: Users create multiple free accounts to avoid paying, or bots abuse free tier operations.
Probability: Medium | Impact: Medium
Mitigations:
| # | Question | Owner | Needed By | Impact if Unresolved |
|---|---|---|---|---|
| OQ-001 | ~~Should we build the POC in-house or partner with a dev shop?~~ RESOLVED: Building in-house (Yohay + Claude Code + Cursor) | VP Product | Pre-launch | Resolved |
| OQ-002 | What is the beta strategy — closed or open beta? | Dir PMM | Pre-launch | Affects user acquisition plan |
| OQ-003 | ~~How do we handle CLI plugin updates — does the SaaS automatically get plugin updates?~~ RESOLVED: Git submodules (product-org-os + extension-teams) with compile step | Dir PM | Pre-launch | Resolved |
| OQ-004 | What happens when a user exceeds fair use on paid tiers? | BizOps | Post-launch | Revenue vs. UX tradeoff |
| OQ-005 | Should we support offline/local mode for Enterprise customers? | VP Product | Enterprise tier | Enterprise deal blocker |
| OQ-006 | ~~How do we handle prompt injection attacks in user-provided context files?~~ RESOLVED: See Section 2.4.3 | Security Architect | Pre-launch | Resolved |
| OQ-007 | ~~What is the Team tier pricing and when does it ship?~~ RESOLVED: $7/user/mo at launch | BizOps | Pre-launch | Resolved |
| OQ-008 | Which extension team modules are included in base Pro vs. require add-on? | BizOps | Pre-launch | Affects perceived value |
| OQ-009 | ~~Should we offer platform-provided tokens as an alternative to BYOT?~~ RESOLVED: Yes — Legionis Tokens at 30% markup as opt-in convenience option. BYOT remains default. | PM | Pre-launch | Resolved |
| OQ-010 | ~~How should users control model quality vs cost?~~ RESOLVED: 3-tier global toggle (Quality/Balanced/Efficiency). Per-agent override in Week 15. | PM | Week 5 | Resolved |
| OQ-011 | ~~How do tool integrations work in the SaaS product (MCP vs OAuth vs hybrid)?~~ RESOLVED: Hybrid OAuth — standard OAuth flows, direct API calls, OS templates inform operations. See Section 2.10. | PM | Post-launch | Resolved |
| OQ-012 | Metered billing vs pre-paid credits for Legionis Tokens? | BizOps | Week 5 | Affects UX and cash flow. Recommendation: metered billing for MVP (Stripe native), pre-paid credits post-launch. |
| OQ-013 | Which integrations should be available at launch vs post-launch? | Dir PM | Pre-launch | Current plan: all post-launch (Weeks 11-14). Graceful fallback works without any. |
| Dependency | Provider | Risk Level | Mitigation |
|---|---|---|---|
| Claude API availability | Anthropic | Medium | Retry logic, graceful degradation, status page monitoring |
| Claude API pricing changes | Anthropic | Medium | Model routing flexibility, multi-model support ready via AI SDK |
| Clerk service availability | Clerk | Low | Session caching, JWT validation works offline for 24h |
| Stripe webhook reliability | Stripe | Low | Idempotent webhook handlers, manual reconciliation fallback |
| R2 storage availability | Cloudflare | Low | Multi-region replication, local cache for hot files |
| Typesense Cloud availability | Typesense | Low | PostgreSQL full-text search as fallback |
| Vercel function limits | Vercel | Low | Monitor function duration limits (60s on Pro); upgrade to Enterprise if PLT sessions hit ceiling |
| Dependency | Owner | Needed By | Status |
|---|---|---|---|
| Product Org OS plugin v3 stable release | Plugin team | M0 | Complete |
| Skill template finalization (80+ skills) | Dir PM | M0 | Complete |
| Agent persona files (81 agents) | Dir PM | M0 | Complete |
| Rule files finalized | Dir PM | M0 | Complete |
| Extension Teams (68 agents, 9 teams) | Dir PM | M0 | Complete |
| Compiled persona JSON | Build script | M0 | Complete (compile-prompts.ts) |
| Brand guidelines and design system | Design | M0 | Complete (stone + amber palette) |
| GTM launch plan | Dir PMM | Pre-launch | In progress |
All features available at launch — no phased rollout:
Core Platform:
- Chat interface with /skill and @agent syntax, autocomplete
- 80+ skills invocable via /skill-name
- 81 agents invocable via @agent-name
- 9 extension team gateways (@marketing, @finance, @legal, @operations, @architecture, @design, @executive, @corpdev, @it) + 2 product gateways (@product, @plt)
- @agent syntax triggers autonomous agent spawning
- PLT sessions (@plt with 3-4 agent composition)
- Product gateway (@product with RACI-based routing)

MUST be true before development begins:
MUST be true to launch:
- @agent syntax triggers successful spawning
- @plt triggers multi-agent Meeting Mode correctly
- Team gateways (@marketing, @finance, etc.) route to correct specialists

| Layer | Technology | Version |
|---|---|---|
| Frontend Framework | Next.js (React 18+) | 16 |
| Language | TypeScript | 5.x (strict) |
| AI SDK | Vercel AI SDK | v6 |
| Styling | Tailwind CSS | 4.x |
| Components | Radix UI (headless) | Latest |
| State (client) | Zustand | 5.x |
| State (server) | TanStack Query | 5.x |
| Markdown | react-markdown + rehype/remark | Latest |
| Editor | Tiptap (ProseMirror) | 2.x |
| Streaming | Vercel AI SDK (SSE under the hood) | Native |
| Runtime | Node.js | 22 LTS |
| API Layer | Next.js API Routes | (integrated) |
| API Spec | OpenAPI | 3.1 |
| Database | PostgreSQL (Neon) | 17 |
| ORM | Drizzle | Latest |
| Object Storage | Cloudflare R2 | S3-compat |
| Search | Typesense Cloud | Latest |
| Auth | Clerk | Latest |
| Payments | Stripe Billing | Latest |
| LLM | Anthropic / OpenAI / Google (via AI SDK) | Multi-provider |
| Hosting | Vercel | Pro |
| Error Tracking | Sentry | Latest |
| Analytics | PostHog | Latest |
| Logs/Uptime | Better Stack | Latest |
| Term | Definition |
|---|---|
| V2V | Vision to Value — the 6-phase operating system for product organizations |
| Skill | A templated AI capability (e.g., /prd, /decision-record) |
| Agent | An AI persona with domain expertise (e.g., PM, CFO, CMO) |
| Team | A group of related agents organized by business function (e.g., Marketing Team, Finance Team) |
| Gateway | A multi-agent routing protocol (@product, @plt, @marketing, @finance, etc.) |
| Meeting Mode | Presentation format showing individual agent voices before synthesis |
| Context Layer | Persistent organizational memory (decisions, bets, feedback, learnings) |
| Operation | A single skill invocation, agent spawn, or gateway session |
| BYOT | Bring Your Own Tokens — users connect their own LLM API keys |
| COGS | Cost of Goods Sold — primarily LLM API costs (zero for paid tiers with BYOT) |
| Prompt Assembler | Server-side component that compiles LLM prompts from skill templates, rules, and context |
| Agent Orchestrator | Server-side component that manages agent spawning, tool execution, and response synthesis |
| Extension Team | Specialist agent teams beyond the core Product Org (Design, Marketing, Finance, etc.) |
Document Status: Active
Next Review: Pre-launch
Gate Escalation: If any gate fails twice, escalate to VP Product for strategic review
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-01-28 | Dir PM | Initial PRD release |
| 1.1 | 2026-01-29 | Dir PM | Added Section 2.4 (P0 Architecture Requirements) based on Architecture Team review. Added Pre-M0 Gate. Resolved OQ-006. Renumbered sections 2.5->2.6, 2.6->2.7. |
| 1.2 | 2026-01-29 | Dir PM | Refactored Section 2.4 to requirements-only format (implementation details moved to architecture-spec-m0.md). Added US-015 (Bring Your Own Tokens) for Phase 1. |
| 2.0 | 2026-01-30 | Dir PM | Major strategic update: (1) Reframed target user as power users, not non-CLI users. (2) Added cloud storage integration as core feature (Section 2.8). (3) Added user journey (Section 2.9). (4) Added input methods / CLI-style UX (Section 2.10). (5) Updated infrastructure costs with verified pricing. (6) Updated COGS estimate. (7) Added assumptions A-006, A-007. |
| 2.1 | 2026-01-30 | Dir PM | Simplified structure: (1) Removed milestone references — full capability at launch. (2) Changed to Your API Keys model. (3) Collapsed phased delivery into single launch. (4) Updated cost model to reflect Your API Keys. |
| 3.0 | 2026-02-17 | Dir PM | Positioning & architecture update: (1) Reframed from "Product Org OS web platform" to general-purpose AI workforce platform. (2) Expanded agent roster from 13 to 81 agents across 11 teams with 9 team gateways. (3) Updated pricing to team-based model: Pro $10/user, Team $7/user, Add-on +$5/user, Full Org $25/user. (4) Eliminated separate Hono backend — all API logic runs as Next.js API routes on Vercel. (5) Updated Next.js 14 -> 16, added Vercel AI SDK v6 throughout. (6) Removed Railway from infrastructure (~$50-200/mo savings). (7) Added 4 new target personas (Startup Founder, Marketing Lead, Finance Director, Operations Manager). (8) Added Section 2.5 (Agent Roster) with full 81-agent breakdown by team. (9) Added Section 2.6 (Design System & Brand) with stone/amber palette. (10) Added US-016 (Team Selector), US-017 (Token Transparency), US-018 (Conversation Persistence). (11) Updated all success metrics, launch checklist, and feature summary to reflect 81 agents / 11 gateways / 80+ skills. (12) Resolved OQ-001, OQ-003, OQ-007. |