Documentation

Install once.
Every AI tool gets CogOS.

Run cog init and CogOS registers as an MCP server. Your AI tool discovers CogOS and gets instant access to 55 domain expert modules. No API key needed — the AI tool IS the LLM, CogOS just makes it smarter.

MCP Integration — Automatic Setup

CogOS registers itself as an MCP (Model Context Protocol) server. This means your AI tool discovers CogOS automatically — no config files to edit, no instructions to paste, no existing files overwritten.

    # Install and register: one command
    $ pip install -e .
    $ cog init
    Registered CogOS MCP server with:
      - claude (~/.claude.json)
      - codex (~/.codex/config.toml)
      - gemini (~/.gemini/settings.json)
    CogOS is ready. Your AI tool will discover it automatically via MCP.

After registration, your AI tool sees these CogOS tools:

| MCP Tool | What it does |
|---|---|
| cog_run | Returns relevant expert knowledge from matching modules. The AI uses this expertise to complete the task correctly. |
| cog_chat | Ask follow-up questions about domain topics |
| cog_status | Show active modules, tools, and provider info |
| cog_modules | List available domain modules and capabilities |

Supported AI tools

| Tool | How it discovers CogOS | Config location |
|---|---|---|
| Claude Code | Auto: registers in ~/.claude.json | ~/.claude.json |
| OpenAI Codex CLI | Auto: registers in ~/.codex/config.toml | ~/.codex/config.toml |
| Gemini CLI | Auto: registers in ~/.gemini/settings.json | ~/.gemini/settings.json |
| opencode | Auto: registers in config, reads provider | ~/.config/opencode/opencode.json |
| Cursor | Auto: creates .cursor/mcp.json | .cursor/mcp.json |
| VS Code / Cline / Roo | Auto: creates .vscode/mcp.json | .vscode/mcp.json |
| Goose | Auto: registers in config | ~/.config/goose/config.yaml |

Manual registration: Run cog register at any time to (re)register CogOS with every detected AI tool. To add it by hand instead, configure an MCP server whose command is python -m cog.mcp_server, using the stdio transport.
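For the by-hand route, the entry usually amounts to a small JSON fragment. The sketch below assumes a client whose JSON config keeps servers under an mcpServers key (check your tool's documentation, since TOML- and YAML-based tools use other layouts); the entry name cogos and the helper add_cogos_server are hypothetical, and only the command itself comes from the text above.

```python
import json

def add_cogos_server(config: dict, server_name: str = "cogos") -> dict:
    """Return a copy of an MCP client config with a CogOS stdio entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # The command is taken from the docs; the entry name is an assumption.
    servers[server_name] = {"command": "python", "args": ["-m", "cog.mcp_server"]}
    merged["mcpServers"] = servers
    return merged

# Existing entries are preserved; the input dict is not modified.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_cogos_server(existing), indent=2))
```

Building a copy rather than editing in place makes it easy to diff the result against the file on disk before writing anything.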

AGENTS.md — Auto-Instructions

When you run cog init, CogOS also writes a small instruction block to your project's AGENTS.md file. This tells AI tools (Claude Code, Codex, etc.) how to use CogOS without reading its source code.

How it works

  • No AGENTS.md exists — CogOS creates one with just the CogOS instruction block
  • AGENTS.md already exists — CogOS prepends its block above your existing content (nothing is overwritten)
  • AGENTS.md already has CogOS block — CogOS skips it (idempotent — safe to run cog init multiple times)

    # First run: creates AGENTS.md
    $ cog init
    Created AGENTS.md

    # Second run: skips (already has CogOS block)
    $ cog init
    AGENTS.md already has CogOS section (skipped)

    # With existing AGENTS.md that has other content
    $ cog init
    Prepended CogOS info to AGENTS.md

The CogOS block is wrapped in <!-- cogos:start --> / <!-- cogos:end --> markers so it can be detected and updated independently. Your existing project instructions, framework configs, or any other content in AGENTS.md is never touched.
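The three cases above (create, prepend, skip) can be sketched in a few lines. This is an illustration of the marker-based idempotency idea, not CogOS's actual implementation; the function name upsert_cogos_block and the returned action labels are hypothetical.

```python
START = "<!-- cogos:start -->"
END = "<!-- cogos:end -->"

def upsert_cogos_block(existing, block_body):
    """Return (new_text, action) for the three AGENTS.md cases.

    existing is the current file content, or None if the file is missing.
    """
    block = f"{START}\n{block_body}\n{END}\n"
    if existing is None:
        return block, "created"
    if START in existing and END in existing:
        # Markers already present: leave the file exactly as it is.
        return existing, "skipped"
    # Prepend above the user's content without altering it.
    return block + "\n" + existing, "prepended"

text, action = upsert_cogos_block("# My project\n", "Call cog_run for expertise.")
print(action)
print(text)
```

Because the second call finds the markers and returns the text unchanged, running it repeatedly is safe.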

What the block contains

The prepended block tells AI tools three things:

  1. What CogOS is — a modular cognitive runtime with modules, tools, and multi-agent orchestration
  2. How to use it — call cog_run instead of reading CogOS source code
  3. Available tools — a table of cog_run, cog_chat, cog_status, cog_modules

Why this matters: Without AGENTS.md instructions, AI tools default to reading CogOS source code (~2500 lines) to understand how it works. The AGENTS.md block short-circuits that — they see the instructions and immediately call cog_run instead. Faster, cheaper, and more reliable.

How It Works — Zero Config

CogOS uses a simple but powerful architecture: the AI tool is the LLM. When Claude Code, Cursor, or any MCP-compatible AI calls cog_run(), CogOS finds relevant expert modules and returns their knowledge. The AI then uses that expertise to complete the task.

  1. You run cog init — Registers CogOS as MCP server with your AI tools. Auto-detects your LLM provider from environment, opencode config, or Ollama.
  2. Your AI calls cog_run(task) — CogOS queries a chunk-level index built from all 7,275+ prompt extensions across 55 modules.
  3. CogOS returns targeted expertise — The highest-scoring individual knowledge chunks are returned, capped to ~1,500 tokens, with session deduplication so the AI never receives the same knowledge twice.
  4. Your AI completes the task — Using the expert context, the AI executes with deep domain knowledge it didn't have before — without burning through your token budget.

No API key needed for MCP usage. The AI tool is already authenticated and running. CogOS enriches it with expertise. Provider config is only needed for standalone CLI usage.

Token Efficiency — Built In

CogOS holds 7,275+ prompt extensions across 55 modules. Without safeguards, a single cog_run call could push 10,000–15,000 tokens of expertise into your context window before your AI writes a single line of code. That would make CogOS a liability, not an asset.

Three mechanisms work together automatically to prevent this. No configuration is required — they are active from the moment you install CogOS.

1. Chunk-level indexing

The old approach returned every extension from the top-matching modules as a block dump. The current approach treats each of the 7,275+ extensions as an individually scored document. When your AI calls cog_run("build a REST API with FastAPI"), CogOS scores every extension in the library against that specific task and returns only the highest-relevance ones — regardless of which module they came from.

A typical call now delivers ~1,500 tokens of precisely targeted knowledge instead of a broad module dump. The AI gets what it actually needs, not everything that's loosely related.
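To make the per-chunk idea concrete, here is a toy ranking pass. The docs do not describe the real scoring function, so plain word overlap stands in for it as an assumption; tokenize, top_chunks, and the sample library are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def top_chunks(task, chunks, k=3):
    """Rank (module, text) chunks by word overlap with the task.

    Every chunk is scored on its own, so results can mix modules.
    """
    task_words = tokenize(task)
    scored = []
    for module, text in chunks:
        score = len(task_words & tokenize(text))
        if score > 0:
            scored.append((score, module, text))
    # Stable sort: ties keep their original library order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(module, text) for _, module, text in scored[:k]]

library = [
    ("cog-code-python", "Use FastAPI dependency injection for REST API auth."),
    ("cog-infra-docker", "Pin base image digests in every Dockerfile."),
    ("cog-db-postgres", "Index foreign keys used by list endpoints of an API."),
]
for module, text in top_chunks("build a REST API with FastAPI", library, k=2):
    print(module, "->", text)
```

The Docker chunk shares no words with the task and is dropped entirely, which is the behavior that keeps loosely related material out of the response.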

2. Session deduplication

Every chunk of expertise returned to your AI is fingerprinted with a content hash. On every subsequent cog_run or cog_chat call in the same session, already-returned chunks are excluded from the results. The response includes a chunks_skipped_dedup field so your AI knows its prior context is still valid and can build on it.

This compounds throughout a session. By the third or fourth call on the same project, CogOS is only returning knowledge that is genuinely new — not re-injecting context the AI already has. The AI's context window grows in value without growing proportionally in size.
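A minimal sketch of content-hash deduplication follows. The choice of SHA-256 and the SessionDedup class are assumptions for illustration; the docs only state that chunks are fingerprinted with a content hash and that a skipped count is reported.

```python
import hashlib

class SessionDedup:
    """Remember which chunks were already returned in this session."""

    def __init__(self):
        self._seen = set()

    def filter(self, chunks):
        """Return (fresh_chunks, skipped_count) for one call."""
        fresh, skipped = [], 0
        for text in chunks:
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if digest in self._seen:
                skipped += 1  # would be reported as chunks_skipped_dedup
            else:
                self._seen.add(digest)
                fresh.append(text)
        return fresh, skipped

session = SessionDedup()
print(session.filter(["use type hints", "pin dependencies"]))
print(session.filter(["pin dependencies", "add health checks"]))
```

Hashing the content rather than tracking chunk IDs means two identical chunks from different modules are also treated as duplicates.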

3. Character budget

Total expertise per call is capped at a configurable character limit. The default is 6,000 characters (~1,500 tokens). Chunks are collected greedily in relevance order — the most important content first — so the budget cut always removes the lowest-value tail, not random content.

The budget is tunable for projects that genuinely need richer context per call:

    # cog.yaml
    max_expertise_chars: 6000   # default, ~1,500 tokens per call

    # Or via environment variable
    export COG_MAX_EXPERTISE_CHARS=10000
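The greedy collection described above can be sketched as follows. The helper names are hypothetical, and stopping at the first chunk that does not fit (rather than searching for a smaller one further down) is an assumption chosen to match the statement that the cut removes the lowest-value tail.

```python
import os

DEFAULT_MAX_CHARS = 6000  # documented default, roughly 1,500 tokens

def max_expertise_chars(env=None):
    """Read the budget from COG_MAX_EXPERTISE_CHARS, else use the default."""
    env = os.environ if env is None else env
    return int(env.get("COG_MAX_EXPERTISE_CHARS", DEFAULT_MAX_CHARS))

def apply_budget(ranked_chunks, max_chars):
    """Keep chunks in relevance order until the next one would exceed the cap."""
    kept, used = [], 0
    for text in ranked_chunks:
        if used + len(text) > max_chars:
            break  # everything after this point is lower-ranked tail
        kept.append(text)
        used += len(text)
    return kept, used

ranked = ["x" * 3000, "y" * 2500, "z" * 1000, "w" * 200]
kept, used = apply_budget(ranked, max_expertise_chars({}))
print(len(kept), used)
```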

What the AI sees

Every cog_run response includes metadata so your AI can reason about what it received and what's been deduplicated:

    {
      "chunks_returned": 12,
      "chunks_skipped_dedup": 8,
      "total_chars": 5840,
      "modules_contributing": [
        { "name": "cog-code-python", "chunks": 7 },
        { "name": "cog-infra-docker", "chunks": 5 }
      ],
      "expertise": "..."
    }

When all relevant chunks for a task have already been returned this session, CogOS says so explicitly rather than returning an empty response — the AI knows to proceed using the expertise already in its context.
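A client that wants to act on this metadata might read it along these lines. The field names come from the sample response above; the summarize_response helper and the keys of its return value are hypothetical.

```python
import json

def summarize_response(raw):
    """Pull the decision-relevant fields out of a cog_run response."""
    data = json.loads(raw)
    returned = data.get("chunks_returned", 0)
    skipped = data.get("chunks_skipped_dedup", 0)
    return {
        "new_knowledge": returned > 0,
        # Skipped chunks mean earlier context is still valid and reusable.
        "reuse_prior_context": skipped > 0,
        "modules": [m["name"] for m in data.get("modules_contributing", [])],
    }

raw = json.dumps({
    "chunks_returned": 12,
    "chunks_skipped_dedup": 8,
    "total_chars": 5840,
    "modules_contributing": [
        {"name": "cog-code-python", "chunks": 7},
        {"name": "cog-infra-docker", "chunks": 5},
    ],
    "expertise": "...",
})
print(summarize_response(raw))
```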

The goal: CogOS should make your AI measurably better at complex technical tasks — without meaningfully increasing your token bill. These three mechanisms are our commitment to that. If you observe unexpected token spikes, run cog status to see session chunk counts and current budget settings.

Host AI Mode — The Default

By default, CogOS uses host AI mode: the AI tool you are already running is the LLM. CogOS does not need its own model, API key, or separate configuration.

This is the setting you get out of the box. In cog.yaml it looks like this:

provider: host # use whatever AI is already running cog

When your AI calls cog_run(), CogOS finds the relevant expert modules and returns their knowledge as context. Your AI — Claude Code, Cursor, Gemini CLI, opencode, or any other MCP-compatible tool — then uses that context to complete the task. No second API key. No second model. No extra cost.

You never need to touch cog.yaml for this to work. Host AI mode is automatic. The only reason to create a cog.yaml is to point specific internal agents at a cheaper model — which is entirely optional.

Per-Agent Model Configuration — Optional Cost Savings

CogOS has several internal agents that work on different parts of a task. Some of them — like the planner that breaks a task into steps, or the document writer that generates summaries — don't need your premium model. You can point them at a smaller, cheaper model while your main AI handles the actual execution.

Create a cog.yaml in your project root and add an agents: block:

    # provider: host = use my current AI (the default)
    provider: host

    # Per-agent overrides: each role can use its own model
    agents:
      planner:
        provider: openai
        model: gpt-4o-mini        # cheap model for task decomposition
        api_key: YOUR_KEY_HERE
        # base_url: optional, for custom / self-hosted endpoints
      document_writer:
        provider: openai
        model: gpt-4o-mini        # cheap model for doc generation
        api_key: YOUR_KEY_HERE

      # executor - writes and runs code
      # executor:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE

      # researcher - web search and analysis
      # researcher:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # coder - writes implementation code
      # coder:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE

      # reviewer - reviews and critiques code
      # reviewer:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # critic - finds flaws and improvements
      # critic:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # tester - tests and validates output
      # tester:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # documenter - writes inline documentation
      # documenter:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # optimizer - optimizes performance
      # optimizer:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # security - security analysis
      # security:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE

      # architect - system design decisions
      # architect:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE

    memory_backend: sqlite
    modules_path: modules
    memory_path: cog_memory.db
    log_level: INFO
    max_agent_iterations: 20

Any agent role you leave out inherits the global provider. With provider: host, that means your current AI handles it at no extra cost.
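The inheritance rule can be pictured as a simple lookup with a fallback. This sketch works on a config that has already been parsed into a dict; resolve_agent is a hypothetical helper, not part of the CogOS API.

```python
def resolve_agent(config, role):
    """Return the settings a role would use under a cog.yaml-style config."""
    overrides = (config.get("agents") or {}).get(role)
    if overrides:
        return dict(overrides)  # the role has its own provider and model
    # No override: fall back to the global provider (host by default).
    return {"provider": config.get("provider", "host")}

config = {
    "provider": "host",
    "agents": {"planner": {"provider": "openai", "model": "gpt-4o-mini"}},
}
print(resolve_agent(config, "planner"))
print(resolve_agent(config, "executor"))
```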

Known agent roles

| Role | What it does | Recommendation |
|---|---|---|
| planner | Decomposes tasks into ordered steps | Small model: short structured prompts |
| document_writer | Generates docs, summaries, and reports | Small model: templated output |
| executor | Writes and runs code, calls tools | Leave unset: use your best model (host AI) |
| researcher | Web search and multi-source analysis | Small model or leave unset |
| coder | Writes implementation code | Leave unset: use your best model |
| reviewer | Reviews and critiques code | Small model or leave unset |
| critic | Finds flaws and suggests improvements | Small model or leave unset |
| tester | Tests and validates output | Small model or leave unset |
| documenter | Writes inline documentation | Small model: templated output |
| optimizer | Optimizes performance | Small model or leave unset |
| security | Security analysis and review | Small model or leave unset |
| architect | System design decisions | Leave unset: use your best model |

Works with any OpenAI-compatible endpoint. Set base_url on any agent to point at DeepSeek, Zhipu/GLM, OpenRouter, Ollama, LM Studio, or any self-hosted model — mix and match per role.

Advanced: Standalone Provider Configuration

The following options are for CLI usage (cog run, cog chat) or for embedding CogOS as a library. For MCP usage — Claude Code, Cursor, Gemini CLI, etc. — host AI mode handles everything automatically.

There are four ways to configure a standalone provider:

  1. Pass-through — hand an existing provider object to CogOS
  2. String-based — provide model name + API key to the constructor
  3. Environment variables — CogOS reads standard env vars
  4. Config file — set agents.executor in cog.yaml

Key principle: CogOS never stores your API keys and never sends them anywhere except the LLM endpoint you configure. They live in your environment, your config file (which is gitignored), or your host tool. CogOS only uses them to make LLM calls on your behalf.

Pass-Through Mode (Recommended)

When you're already using an AI tool (Claude Code, Codex CLI, Gemini CLI, etc.), that tool already has an LLM client configured. Instead of duplicating configuration, just pass the provider directly:

    from cog import CogOS
    from cog.providers.openai_provider import OpenAIProvider

    # Your host tool already built this provider
    provider = OpenAIProvider(model="gpt-4o", api_key="sk-...")

    # Hand it to CogOS: no second config needed
    cog = CogOS(provider=provider)
    result = cog.run("Analyze this codebase for security issues")
    print(result["output"])

With pass-through mode:

  • No API keys are duplicated or stored by CogOS
  • The host tool's provider, model, and base_url are used directly
  • Works with any provider that implements the LLMProvider interface
  • Zero additional configuration required

Ideal for: AI tools that embed CogOS as a library. The host tool manages the LLM connection; CogOS provides the cognitive runtime (modules, agents, tools, memory).

For MCP usage, pass-through happens automatically — the AI tool IS the LLM.

String-Based Initialization

For standalone scripts or quick prototyping, provide the model name and API key directly:

    from cog import CogOS

    # OpenAI
    cog = CogOS(llm="gpt-4o", api_key="sk-...")

    # Anthropic (auto-detected from model name)
    cog = CogOS(llm="claude-sonnet-4-20250514", api_key="sk-ant-...")

    # Any OpenAI-compatible endpoint
    cog = CogOS(llm="glm-4", api_key="...", base_url="https://open.bigmodel.cn/api/paas/v4")

The provider is auto-detected from the model name:

  • Model contains "claude" → Anthropic provider
  • Everything else → OpenAI-compatible provider (works with any endpoint via base_url)
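The detection rule is small enough to show in full. This is a sketch of the rule as stated, not the library's code; detect_provider is a hypothetical name, and matching case-insensitively is an assumption.

```python
def detect_provider(model):
    """Map a model name to a provider family using the documented rule."""
    # Assumption: the match ignores case; the docs only say "contains".
    if "claude" in model.lower():
        return "anthropic"
    return "openai"  # any OpenAI-compatible endpoint, selected via base_url

for name in ("gpt-4o", "claude-sonnet-4-20250514", "glm-4"):
    print(name, "->", detect_provider(name))
```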

Environment Variables

Set standard environment variables and CogOS will detect them automatically. This is the easiest way to configure CogOS without modifying code:

    # CogOS-specific (highest priority)
    export COG_PROVIDER=openai
    export COG_MODEL=gpt-4o
    export COG_API_KEY=sk-...
    export COG_BASE_URL=https://api.openai.com/v1   # optional

    # Or use standard provider env vars (auto-detected)
    export OPENAI_API_KEY=sk-...          # → provider=openai
    export OPENAI_BASE_URL=https://...    # → custom endpoint
    export ANTHROPIC_API_KEY=sk-ant-...   # → provider=anthropic

Use CogOS.from_env() to create an instance from environment variables:

    from cog import CogOS

    cog = CogOS.from_env()
    result = cog.run("Deploy this to AWS")

Resolution order

When multiple sources are set, CogOS uses this priority (highest wins):

  1. COG_* environment variables
  2. OPENAI_* / ANTHROPIC_* environment variables
  3. cog.yaml config file (located by walking up from the current working directory)
  4. Hardcoded defaults (all None — no silent fallback)
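The priority list above can be read as layers applied from lowest to highest, each overriding the one before. The sketch below merges per field, which is an assumption, as is letting OPENAI_API_KEY win when both provider keys are set (the docs do not say which is preferred). resolve_settings is a hypothetical helper.

```python
FIELDS = ("provider", "model", "api_key", "base_url")

def resolve_settings(env, file_config):
    """Merge settings per field; later layers override earlier ones."""
    # 4. Hardcoded defaults: everything starts as None.
    settings = {field: None for field in FIELDS}

    # 3. cog.yaml values (already parsed into a dict by the caller).
    for field in FIELDS:
        if file_config.get(field) is not None:
            settings[field] = file_config[field]

    # 2. Standard provider variables (OpenAI-over-Anthropic is assumed).
    if env.get("ANTHROPIC_API_KEY"):
        settings.update(provider="anthropic", api_key=env["ANTHROPIC_API_KEY"])
    if env.get("OPENAI_API_KEY"):
        settings.update(provider="openai", api_key=env["OPENAI_API_KEY"])
    if env.get("OPENAI_BASE_URL"):
        settings["base_url"] = env["OPENAI_BASE_URL"]

    # 1. COG_* variables have the final say.
    for field in FIELDS:
        value = env.get(f"COG_{field.upper()}")
        if value:
            settings[field] = value
    return settings

print(resolve_settings({"COG_MODEL": "gpt-4o", "OPENAI_API_KEY": "sk-test"}, {"model": "glm-4"}))
```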

Config File (cog.yaml)

For MCP usage (Claude Code, Cursor, Gemini CLI, etc.), no config file is needed at all. The AI tool you're already running IS the LLM — CogOS uses it automatically in host AI mode. cog.yaml is only needed if you want to assign specific agents to cheaper models, or for standalone CLI usage.

The default cog.yaml generated by cog init reflects the host AI default and the per-agent pattern:

    # cog.yaml (never committed to git)
    # "host" means: use whatever AI is already running cog
    provider: host

    # Optional: point specific agents at cheaper models
    agents:
      planner:
        provider: openai
        model: gpt-4o-mini
        api_key: YOUR_KEY_HERE
      document_writer:
        provider: openai
        model: gpt-4o-mini
        api_key: YOUR_KEY_HERE
      # executor:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE
      # researcher:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # coder:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE
      # reviewer:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # critic:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # tester:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # documenter:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # optimizer:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # security:
      #   provider: openai
      #   model: gpt-4o-mini
      #   api_key: YOUR_KEY_HERE
      # architect:
      #   provider: openai
      #   model: gpt-4o
      #   api_key: YOUR_KEY_HERE

    memory_backend: sqlite
    modules_path: modules
    memory_path: cog_memory.db
    log_level: INFO
    max_agent_iterations: 20

Generate one with cog init:

    $ cog init
    Created cog.yaml with host AI mode defaults. Edit agents: to assign specific roles to cheaper models.

Security: cog.yaml is listed in .gitignore by default. Never commit this file if it contains API keys. For team projects, use environment variables or a secrets manager instead.

CLI Usage

The CLI reads from the config file and environment variables automatically:

    # Create config
    $ cog init

    # Run a task
    $ cog run "Create a REST API with authentication"

    # Override provider/model per command
    $ cog run "Debug this function" --provider openai --model gpt-4o

    # Interactive chat
    $ cog chat

    # Show status (provider, model, modules, tools)
    $ cog status

Both cog and cogos commands work identically.

Creating Custom Modules

Use cog create to scaffold a new module with the correct structure:

    # Scaffold a new module
    $ cog create infra-pulumi --description "Pulumi infrastructure as code"
    Created modules/cog-infra-pulumi/
      modules/cog-infra-pulumi/manifest.json
      modules/cog-infra-pulumi/module.py

    # Or use the full cog- prefix
    $ cog create cog-lang-rust --description "Rust language module"

This generates two files:

manifest.json — module metadata

    {
      "name": "cog-infra-pulumi",
      "version": "0.1.0",
      "description": "Pulumi infrastructure as code",
      "capabilities": ["pulumi_operations"],
      "requires": [],
      "permissions": ["shell.execute"],
      "entrypoint": "module.py"
    }

module.py — your tools, verifiers, and domain knowledge

The scaffolded module.py includes a working Tool, Verifier, and CogModule class with TODO comments. Each module can provide:

| Component | What it does | Method |
|---|---|---|
| Tools | Actions CogOS can take (run commands, call APIs, etc.) | register_tools() |
| Verifiers | Health checks (is the tool installed? are creds valid?) | register_verifiers() |
| Prompt extensions | Domain expertise injected into the LLM context | get_prompt_extensions() |
| Capabilities | Tags for task routing (e.g. s3_operations) | get_capabilities() |
| Lifecycle hooks | Setup/teardown logic (on_load, pre_execute, etc.) | Override on CogModule |

Module naming convention

Module names follow the pattern cog-{category}-{topic}:

    # Category examples
    cog-cloud-aws            # cloud providers
    cog-lang-rust            # programming languages
    cog-infra-docker         # infrastructure / DevOps
    cog-db-postgres          # databases
    cog-framework-nextjs     # web frameworks
    cog-testing-playwright   # testing tools
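A name check along these lines mirrors how cog create accepts either the short or the fully prefixed form. The regular expression is an assumption (lowercase letters, digits, and hyphens), since the docs give only the cog-{category}-{topic} pattern; normalize_module_name is a hypothetical helper.

```python
import re

# Assumed character set: lowercase letters and digits, hyphen-separated.
NAME_PATTERN = re.compile(r"^cog-[a-z0-9]+-[a-z0-9]+(?:-[a-z0-9]+)*$")

def normalize_module_name(name):
    """Add the cog- prefix if missing, then validate the full name."""
    full = name if name.startswith("cog-") else f"cog-{name}"
    if not NAME_PATTERN.match(full):
        raise ValueError(f"expected cog-<category>-<topic>, got {full!r}")
    return full

print(normalize_module_name("infra-pulumi"))
print(normalize_module_name("cog-lang-rust"))
```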

Sharing modules

Modules are just directories with manifest.json + module.py. To share:

  1. Git repo — push your module directory, users clone into their modules/
  2. Pip package — wrap the module in a Python package with a modules/ entry point
  3. Registry — publish to a CogOS module registry (coming soon)

Tip: The requires field in manifest.json lets you declare dependencies on other modules (e.g. "requires": ["tool-core"]). CogOS resolves activation order automatically.
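Resolving activation order from requires is a topological sort. The sketch below uses Python's standard graphlib module (3.9+) as a stand-in; how CogOS actually implements this is not documented here, and activation_order is a hypothetical helper.

```python
from graphlib import TopologicalSorter

def activation_order(manifests):
    """Order module names so every dependency comes before its dependents."""
    # Map each module to the set of modules it requires.
    graph = {m["name"]: set(m.get("requires", [])) for m in manifests}
    # static_order() raises graphlib.CycleError on circular requirements.
    return list(TopologicalSorter(graph).static_order())

manifests = [
    {"name": "cog-infra-pulumi", "requires": ["tool-core"]},
    {"name": "tool-core", "requires": []},
]
print(activation_order(manifests))
```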

Supported Providers

Two provider classes are included. The OpenAI provider works with any OpenAI-compatible endpoint via the base_url parameter:

| Provider | Class | Env vars | Notes |
|---|---|---|---|
| OpenAI | OpenAIProvider | OPENAI_API_KEY | GPT-4o, GPT-4, etc. |
| Anthropic | AnthropicProvider | ANTHROPIC_API_KEY | Claude Sonnet, Haiku, etc. |
| DeepSeek | OpenAIProvider | COG_API_KEY + COG_BASE_URL | OpenAI-compatible endpoint |
| Zhipu / GLM | OpenAIProvider | COG_API_KEY + COG_BASE_URL | OpenAI-compatible endpoint |
| OpenRouter | OpenAIProvider | COG_API_KEY + COG_BASE_URL | OpenAI-compatible endpoint |
| Local (Ollama, etc.) | OpenAIProvider | COG_BASE_URL | base_url=http://localhost:11434/v1 |
| Any custom | LLMProvider subclass | Custom | Implement complete() and get_model_name() |

Integrating With AI Tools

CogOS is designed to be embedded by other AI tools. Here's how different tools would integrate:

Claude Code / Anthropic tools

    from cog import CogOS
    from cog.providers.anthropic_provider import AnthropicProvider

    provider = AnthropicProvider(model="claude-sonnet-4-20250514", api_key="sk-ant-...")
    cog = CogOS(provider=provider)

OpenAI Codex / GPT tools

    from cog import CogOS
    from cog.providers.openai_provider import OpenAIProvider

    provider = OpenAIProvider(model="gpt-4o", api_key="sk-...")
    cog = CogOS(provider=provider)

Custom / OpenAI-compatible endpoints

    from cog import CogOS
    from cog.providers.openai_provider import OpenAIProvider

    # Works with DeepSeek, Zhipu, OpenRouter, Ollama, vLLM, etc.
    provider = OpenAIProvider(
        model="deepseek-chat",
        api_key="...",
        base_url="https://api.deepseek.com/v1",
    )
    cog = CogOS(provider=provider)

Custom provider class

Implement the LLMProvider interface for any LLM not covered by the built-in providers:

    from cog import CogOS
    from cog.providers.base import LLMProvider, LLMResponse, LLMMessage

    class MyProvider(LLMProvider):
        def complete(self, messages, tools=None, temperature=0.1,
                     max_tokens=4096, system=None) -> LLMResponse:
            # Call your LLM here
            ...

        def get_model_name(self) -> str:
            return "my-custom-model"

    cog = CogOS(provider=MyProvider())

Summary: CogOS is provider-agnostic by design. Pass your existing provider in, and CogOS adds 55 domain modules, multi-agent orchestration, memory, caching, and 70 tools on top. No lock-in, no duplicated config.