Documentation

Install once.
Every AI tool gets CogOS.

Run cog init and CogOS registers as an MCP server. Your AI tool discovers CogOS and gets instant access to 55 domain expert modules. No API key needed — the AI tool IS the LLM, CogOS just makes it smarter.

MCP Integration — Automatic Setup

CogOS registers itself as an MCP (Model Context Protocol) server. This means your AI tool discovers CogOS automatically — no config files to edit, no instructions to paste, no existing files overwritten.

# Install and register
$ pip install -e .
$ cog init
Registered CogOS MCP server with:
  - claude (~/.claude.json)
  - codex (~/.codex/config.toml)
  - gemini (~/.gemini/settings.json)
CogOS is ready. Your AI tool will discover it automatically via MCP.

After registration, your AI tool sees these CogOS tools:

MCP Tool	What it does
`cog_run`	Returns relevant expert knowledge from matching modules. The AI uses this expertise to complete the task correctly.
`cog_chat`	Ask follow-up questions about domain topics
`cog_status`	Show active modules, tools, and provider info
`cog_modules`	List available domain modules and capabilities

Supported AI tools

Tool	How it discovers CogOS	Config location
Claude Code	Auto — registers in `~/.claude.json`	`~/.claude.json`
OpenAI Codex CLI	Auto — registers in `~/.codex/config.toml`	`~/.codex/config.toml`
Gemini CLI	Auto — registers in `~/.gemini/settings.json`	`~/.gemini/settings.json`
opencode	Auto — registers in config, reads provider	`~/.config/opencode/opencode.json`
Cursor	Auto — creates `.cursor/mcp.json`	`.cursor/mcp.json`
VS Code / Cline / Roo	Auto — creates `.vscode/mcp.json`	`.vscode/mcp.json`
Goose	Auto — registers in config	`~/.config/goose/config.yaml`

Manual registration: Run cog register at any time to (re)register with all detected AI tools. To add the server by hand instead, configure your tool to launch python -m cog.mcp_server over the stdio transport.
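For tools that read a JSON MCP configuration, a hand-written entry might look like the output of the sketch below. The mcpServers layout follows the convention used by Cursor's .cursor/mcp.json; the server name cogos is an assumption, and other tools may expect a different file shape, so check your tool's MCP documentation before copying it.

```python
import json


def cogos_mcp_entry(server_name: str = "cogos") -> dict:
    """Build a stdio MCP server entry that launches `python -m cog.mcp_server`.

    The server name is a hypothetical label chosen for this sketch.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": ["-m", "cog.mcp_server"],
            }
        }
    }


if __name__ == "__main__":
    # Print the entry so it can be pasted into a JSON config file.
    print(json.dumps(cogos_mcp_entry(), indent=2))
```

If the target file already lists other servers, merge the cogos key into the existing mcpServers object rather than replacing the file.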

AGENTS.md — Auto-Instructions

When you run cog init, CogOS also writes a small instruction block to your project's AGENTS.md file. This tells AI tools (Claude Code, Codex, etc.) how to use CogOS without reading its source code.

How it works

No AGENTS.md exists — CogOS creates one with just the CogOS instruction block
AGENTS.md already exists — CogOS prepends its block above your existing content (nothing is overwritten)
AGENTS.md already has CogOS block — CogOS skips it (idempotent — safe to run cog init multiple times)

# First run — creates AGENTS.md
$ cog init
Created AGENTS.md

# Second run — skips (already has CogOS block)
$ cog init
AGENTS.md already has CogOS section (skipped)

# With existing AGENTS.md that has other content
$ cog init
Prepended CogOS info to AGENTS.md

The CogOS block is wrapped in a pair of start and end markers so it can be detected and updated independently. Your existing project instructions, framework configs, and any other content in AGENTS.md are never touched.
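The three cases above (create, prepend, skip) can be sketched as a single pure function. This is an illustration of the idempotent pattern, not CogOS's actual code: the marker strings START and END and the function name are hypothetical, and the real markers CogOS writes may differ.

```python
from typing import Optional, Tuple

# Hypothetical marker strings, used only for this sketch.
START = "<!-- cogos:start -->"
END = "<!-- cogos:end -->"


def upsert_cogos_block(existing: Optional[str], block_body: str) -> Tuple[str, str]:
    """Return (new_content, action) for an AGENTS.md file.

    `existing` is None when the file does not exist yet.
    """
    block = f"{START}\n{block_body}\n{END}\n"
    if existing is None:
        return block, "created"
    if START in existing and END in existing:
        # Block already present: leave the file exactly as it is.
        return existing, "skipped"
    # Put the block first and keep the user's content untouched below it.
    return block + "\n" + existing, "prepended"
```

Because the function returns the existing text unchanged when both markers are found, running it any number of times gives the same file as running it once.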

What the block contains

The prepended block tells AI tools three things:

What CogOS is — a modular cognitive runtime with modules, tools, and multi-agent orchestration
How to use it — call cog_run instead of reading CogOS source code
Available tools — a table of cog_run, cog_chat, cog_status, cog_modules

Why this matters: Without AGENTS.md instructions, AI tools default to reading CogOS source code (~2500 lines) to understand how it works. The AGENTS.md block short-circuits that — they see the instructions and immediately call cog_run instead. Faster, cheaper, and more reliable.

How It Works — Zero Config

CogOS uses a simple but powerful architecture: the AI tool is the LLM. When Claude Code, Cursor, or any MCP-compatible AI calls cog_run(), CogOS finds relevant expert modules and returns their knowledge. The AI then uses that expertise to complete the task.

1. You run cog init. It registers CogOS as an MCP server with your AI tools and auto-detects your LLM provider from the environment, the opencode config, or Ollama.
2. Your AI calls cog_run(task). CogOS queries a chunk-level index built from all 7,275+ prompt extensions across 55 modules.
3. CogOS returns targeted expertise. The highest-scoring individual knowledge chunks are returned, capped at ~1,500 tokens, with session deduplication so the AI never receives the same knowledge twice.
4. Your AI completes the task. Using the expert context, the AI executes with domain knowledge it didn't have before, without burning through your token budget.

No API key needed for MCP usage. The AI tool is already authenticated and running. CogOS enriches it with expertise. Provider config is only needed for standalone CLI usage.

Token Efficiency — Built In

CogOS holds 7,275+ prompt extensions across 55 modules. Without safeguards, a single cog_run call could push 10,000–15,000 tokens of expertise into your context window before your AI writes a single line of code. That would make CogOS a liability, not an asset.

Three mechanisms work together automatically to prevent this. No configuration is required — they are active from the moment you install CogOS.

1. Chunk-level indexing

The old approach returned every extension from the top-matching modules as a block dump. The current approach treats each of the 7,275+ extensions as an individually scored document. When your AI calls cog_run("build a REST API with FastAPI"), CogOS scores every extension in the library against that specific task and returns only the highest-relevance ones — regardless of which module they came from.

A typical call now delivers ~1,500 tokens of precisely targeted knowledge instead of a broad module dump. The AI gets what it actually needs, not everything that's loosely related.
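To make the idea concrete, here is a toy sketch of per-chunk ranking. The scoring function (word overlap with the task) is a stand-in chosen for brevity; the documentation does not say how CogOS scores chunks, and the Chunk type, function names, and sample texts are all assumptions.

```python
import re
from typing import List, NamedTuple, Set


class Chunk(NamedTuple):
    module: str
    text: str


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(task: str, chunk: Chunk) -> float:
    """Toy relevance: the fraction of task words that appear in the chunk."""
    task_words = tokenize(task)
    if not task_words:
        return 0.0
    return len(task_words & tokenize(chunk.text)) / len(task_words)


def top_chunks(task: str, library: List[Chunk], k: int = 2) -> List[Chunk]:
    """Rank every chunk individually, ignoring which module it belongs to."""
    ranked = sorted(library, key=lambda c: score(task, c), reverse=True)
    return [c for c in ranked[:k] if score(task, c) > 0]


# Made-up sample chunks for illustration.
library = [
    Chunk("cog-code-python", "FastAPI: declare a REST API route with a path decorator"),
    Chunk("cog-code-python", "Use virtual environments to isolate dependencies"),
    Chunk("cog-infra-docker", "Build a slim image for a FastAPI REST service"),
    Chunk("cog-db-postgres", "Create indexes for frequent query filters"),
]

best = top_chunks("build a REST API with FastAPI", library)
```

In this sample the two selected chunks come from different modules, while the second cog-code-python chunk is left out even though its module contains the best match. That is the difference between chunk-level selection and a module dump.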

2. Session deduplication

Every chunk of expertise returned to your AI is fingerprinted with a content hash. On every subsequent cog_run or cog_chat call in the same session, already-returned chunks are excluded from the results. The response includes a chunks_skipped_dedup field so your AI knows its prior context is still valid and can build on it.

This compounds throughout a session. By the third or fourth call on the same project, CogOS is only returning knowledge that is genuinely new — not re-injecting context the AI already has. The AI's context window grows in value without growing proportionally in size.
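A minimal sketch of content-hash deduplication, assuming SHA-256 as the fingerprint and an in-memory set as the session store. Neither detail is confirmed by the documentation, and the Session class is hypothetical; only the chunks_skipped_dedup field name comes from the text above.

```python
import hashlib
from typing import Dict, List, Set


class Session:
    """Tracks which chunks have already been returned in this session."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def filter_new(self, chunks: List[str]) -> Dict[str, object]:
        """Drop chunks whose content hash was seen before and count the skips."""
        fresh: List[str] = []
        skipped = 0
        for text in chunks:
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if digest in self._seen:
                skipped += 1
                continue
            self._seen.add(digest)
            fresh.append(text)
        return {"chunks": fresh, "chunks_skipped_dedup": skipped}
```

Hashing the content rather than an ID means a chunk is recognised as a repeat even if it is reached through a different module or query.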

3. Character budget

Total expertise per call is capped at a configurable character limit. The default is 6,000 characters (~1,500 tokens). Chunks are collected greedily in relevance order — the most important content first — so the budget cut always removes the lowest-value tail, not random content.

The budget is tunable for projects that genuinely need richer context per call:

# cog.yaml
max_expertise_chars: 6000   # default — ~1,500 tokens per call

# Or via environment variable
export COG_MAX_EXPERTISE_CHARS=10000
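The greedy collection described above might look like the following sketch. It assumes collection stops at the first chunk that would overflow the budget, which matches "removes the lowest-value tail", but the exact rule CogOS applies is not documented here. The function name and the (score, text) tuple shape are hypothetical.

```python
from typing import List, Tuple


def apply_budget(scored: List[Tuple[float, str]], max_chars: int = 6000) -> List[str]:
    """Collect chunks in descending score order until the next one would exceed max_chars."""
    selected: List[str] = []
    used = 0
    for _score, text in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if used + len(text) > max_chars:
            break  # everything from here on is the lower-scoring tail
        selected.append(text)
        used += len(text)
    return selected
```

Stopping at the first overflow, rather than skipping ahead to smaller chunks, keeps the result a strict prefix of the relevance ranking.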

What the AI sees

Every cog_run response includes metadata so your AI can reason about what it received and what's been deduplicated:

{
  "chunks_returned": 12,
  "chunks_skipped_dedup": 8,
  "total_chars": 5840,
  "modules_contributing": [
    { "name": "cog-code-python", "chunks": 7 },
    { "name": "cog-infra-docker", "chunks": 5 }
  ],
  "expertise": "..."
}

When all relevant chunks for a task have already been returned this session, CogOS says so explicitly rather than returning an empty response — the AI knows to proceed using the expertise already in its context.
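As a small illustration of reading this metadata, the sketch below parses the example response shown earlier and produces a one-line summary. The summarize helper is hypothetical; only the field names come from the example.

```python
import json

# The example response from above, with the expertise text elided as in the original.
raw = """
{
  "chunks_returned": 12,
  "chunks_skipped_dedup": 8,
  "total_chars": 5840,
  "modules_contributing": [
    { "name": "cog-code-python", "chunks": 7 },
    { "name": "cog-infra-docker", "chunks": 5 }
  ],
  "expertise": "..."
}
"""


def summarize(response: dict) -> str:
    """Describe how much new and deduplicated expertise a response carries."""
    modules = ", ".join(
        f"{m['name']} ({m['chunks']})" for m in response["modules_contributing"]
    )
    return (
        f"{response['chunks_returned']} new chunks, "
        f"{response['chunks_skipped_dedup']} skipped as already seen, "
        f"{response['total_chars']} chars from {modules}"
    )


response = json.loads(raw)
summary = summarize(response)
```

In the example the per-module chunk counts (7 and 5) add up to chunks_returned (12), and total_chars stays under the default 6,000-character budget.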

The goal: CogOS should make your AI measurably better at complex technical tasks — without meaningfully increasing your token bill. These three mechanisms are our commitment to that. If you observe unexpected token spikes, run cog status to see session chunk counts and current budget settings.

Host AI Mode — The Default

By default, CogOS uses host AI mode: the AI tool you are already running is the LLM. CogOS does not need its own model, API key, or separate configuration.

This is the setting you get out of the box. In cog.yaml it looks like this:

provider: host # use whatever AI is already running cog

When your AI calls cog_run(), CogOS finds the relevant expert modules and returns their knowledge as context. Your AI — Claude Code, Cursor, Gemini CLI, opencode, or any other MCP-compatible tool — then uses that context to complete the task. No second API key. No second model. No extra cost.

You never need to touch cog.yaml for this to work. Host AI mode is automatic. The only reason to create a cog.yaml is to point specific internal agents at a cheaper model — which is entirely optional.

Per-Agent Model Configuration — Optional Cost Savings

CogOS has several internal agents that work on different parts of a task. Some of them — like the planner that breaks a task into steps, or the document writer that generates summaries — don't need your premium model. You can point them at a smaller, cheaper model while your main AI handles the actual execution.

Create a cog.yaml in your project root and add an agents: block:

# provider: host = use my current AI (the default)
provider: host

# Per-agent overrides — each role can use its own model
agents:
  planner:
    provider: openai
    model: gpt-4o-mini        # cheap model for task decomposition
    api_key: YOUR_KEY_HERE
    # base_url: optional, for custom / self-hosted endpoints

  document_writer:
    provider: openai
    model: gpt-4o-mini        # cheap model for doc generation
    api_key: YOUR_KEY_HERE

  # executor — writes and runs code
  # executor:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

  # researcher — web search and analysis
  # researcher:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # coder — writes implementation code
  # coder:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

  # reviewer — reviews and critiques code
  # reviewer:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # critic — finds flaws and improvements
  # critic:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # tester — tests and validates output
  # tester:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # documenter — writes inline documentation
  # documenter:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # optimizer — optimizes performance
  # optimizer:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # security — security analysis
  # security:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # architect — system design decisions
  # architect:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

memory_backend: sqlite
modules_path: modules
memory_path: cog_memory.db
log_level: INFO
max_agent_iterations: 20

Any agent role you leave out inherits the global provider. With provider: host, that means your current AI handles it at no extra cost.
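The inheritance rule can be sketched as a lookup with a fallback. The configuration is written as a plain dictionary mirroring the YAML above so the example needs no YAML parser; the function name is hypothetical and this is not CogOS's own resolution code.

```python
from typing import Any, Dict


def effective_agent_config(config: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return the settings a role would run with: its override, else the global provider."""
    # `agents:` may be missing or empty when every role is commented out.
    override = (config.get("agents") or {}).get(role)
    if override:
        return dict(override)
    return {"provider": config.get("provider", "host")}


config = {
    "provider": "host",
    "agents": {
        "planner": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "api_key": "YOUR_KEY_HERE",
        },
    },
}

planner = effective_agent_config(config, "planner")    # uses the override
executor = effective_agent_config(config, "executor")  # falls back to host
```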

Known agent roles

Role	What it does	Recommendation
`planner`	Decomposes tasks into ordered steps	Small model — short structured prompts
`document_writer`	Generates docs, summaries, and reports	Small model — templated output
`executor`	Writes and runs code, calls tools	Leave unset — use your best model (host AI)
`researcher`	Web search and multi-source analysis	Small model or leave unset
`coder`	Writes implementation code	Leave unset — use your best model
`reviewer`	Reviews and critiques code	Small model or leave unset
`critic`	Finds flaws and suggests improvements	Small model or leave unset
`tester`	Tests and validates output	Small model or leave unset
`documenter`	Writes inline documentation	Small model — templated output
`optimizer`	Optimizes performance	Small model or leave unset
`security`	Security analysis and review	Small model or leave unset
`architect`	System design decisions	Leave unset — use your best model

Works with any OpenAI-compatible endpoint. Set base_url on any agent to point at DeepSeek, Zhipu/GLM, OpenRouter, Ollama, LM Studio, or any self-hosted model — mix and match per role.

Advanced: Standalone Provider Configuration

The following options are for CLI usage (cog run, cog chat) or for embedding CogOS as a library. For MCP usage — Claude Code, Cursor, Gemini CLI, etc. — host AI mode handles everything automatically.

There are four ways to configure a standalone provider:

Pass-through — hand an existing provider object to CogOS
String-based — provide model name + API key to the constructor
Environment variables — CogOS reads standard env vars
Config file — set agents.executor in cog.yaml

Key principle: CogOS never stores your API keys or sends them anywhere except the provider endpoint you configure. They live in your environment, your config file (which is gitignored), or your host tool, and CogOS uses them only to make LLM calls on your behalf.

Pass-Through Mode (Recommended)

When you're already using an AI tool (Claude Code, Codex CLI, Gemini CLI, etc.), that tool already has an LLM client configured. Instead of duplicating configuration, just pass the provider directly:

from cog import CogOS
from cog.providers.openai_provider import OpenAIProvider

# Your host tool already built this provider
provider = OpenAIProvider(model="gpt-4o", api_key="sk-...")

# Hand it to CogOS — no second config needed
cog = CogOS(provider=provider)

result = cog.run("Analyze this codebase for security issues")
print(result["output"])

With pass-through mode:

No API keys are duplicated or stored by CogOS
The host tool's provider, model, and base_url are used directly
Works with any provider that implements the LLMProvider interface
Zero additional configuration required

Ideal for: AI tools that embed CogOS as a library. The host tool manages the LLM connection; CogOS provides the cognitive runtime (modules, agents, tools, memory).

For MCP usage, pass-through happens automatically — the AI tool IS the LLM.

String-Based Initialization

For standalone scripts or quick prototyping, provide the model name and API key directly:

from cog import CogOS

# OpenAI
cog = CogOS(llm="gpt-4o", api_key="sk-...")

# Anthropic (auto-detected from model name)
cog = CogOS(llm="claude-sonnet-4-20250514", api_key="sk-ant-...")

# Any OpenAI-compatible endpoint
cog = CogOS(llm="glm-4", api_key="...", base_url="https://open.bigmodel.cn/api/paas/v4")

The provider is auto-detected from the model name:

Model contains "claude" → Anthropic provider
Everything else → OpenAI-compatible provider (works with any endpoint via base_url)
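The two rules above amount to a single substring check. In this sketch the match is case-insensitive, which is an assumption, and the function name and the returned provider labels are illustrative rather than part of the CogOS API.

```python
def detect_provider(model: str) -> str:
    """Pick a provider label from the model name using the two rules above."""
    return "anthropic" if "claude" in model.lower() else "openai"


# Model names taken from the examples in this section.
examples = {
    name: detect_provider(name)
    for name in ("gpt-4o", "claude-sonnet-4-20250514", "glm-4")
}
```

Because everything that is not a Claude model falls through to the OpenAI-compatible provider, a model such as glm-4 still needs base_url to reach the right endpoint.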

Environment Variables

Set standard environment variables and CogOS will detect them automatically. This is the easiest way to configure CogOS without modifying code:

# CogOS-specific (highest priority)
export COG_PROVIDER=openai
export COG_MODEL=gpt-4o
export COG_API_KEY=sk-...
export COG_BASE_URL=https://api.openai.com/v1   # optional

# Or use standard provider env vars (auto-detected)
export OPENAI_API_KEY=sk-...            # → provider=openai
export OPENAI_BASE_URL=https://...      # → custom endpoint
export ANTHROPIC_API_KEY=sk-ant-...     # → provider=anthropic

Use CogOS.from_env() to create an instance from environment variables:

from cog import CogOS

cog = CogOS.from_env()
result = cog.run("Deploy this to AWS")

Resolution order

When multiple sources are set, CogOS resolves them in this order, highest priority first:

COG_* environment variables
OPENAI_* / ANTHROPIC_* environment variables
cog.yaml config file (walks up from CWD)
Hardcoded defaults (all None — no silent fallback)
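One way to picture this order is as layers merged from lowest to highest priority. The sketch below makes two assumptions the documentation does not confirm: values are merged field by field, so a COG_MODEL can combine with an OPENAI_API_KEY, and OpenAI wins when both vendor keys are set. The function name is hypothetical, and cog.yaml is passed in as an already-parsed mapping.

```python
from typing import Dict, Mapping, Optional

FIELDS = ("provider", "model", "api_key", "base_url")


def resolve_settings(
    env: Mapping[str, str],
    file_config: Mapping[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Merge configuration layers so higher-priority sources overwrite lower ones."""
    # Lowest priority: hardcoded defaults, all None (no silent fallback).
    resolved: Dict[str, Optional[str]] = {field: None for field in FIELDS}

    # Next: values read from cog.yaml.
    layers = [{f: file_config.get(f) for f in FIELDS}]

    # Next: standard vendor variables. Letting OpenAI win when both are set
    # is an assumption of this sketch.
    vendor: Dict[str, Optional[str]] = {}
    if "ANTHROPIC_API_KEY" in env:
        vendor = {"provider": "anthropic", "api_key": env["ANTHROPIC_API_KEY"]}
    if "OPENAI_API_KEY" in env:
        vendor = {
            "provider": "openai",
            "api_key": env["OPENAI_API_KEY"],
            "base_url": env.get("OPENAI_BASE_URL"),
        }
    layers.append(vendor)

    # Highest priority: COG_PROVIDER, COG_MODEL, COG_API_KEY, COG_BASE_URL.
    layers.append({f: env.get(f"COG_{f.upper()}") for f in FIELDS})

    for layer in layers:  # applied lowest priority first
        for field, value in layer.items():
            if value is not None:
                resolved[field] = value
    return resolved
```

With no environment variables and an empty config file, every field stays None, which corresponds to the "no silent fallback" default.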

Config File (cog.yaml)

For MCP usage (Claude Code, Cursor, Gemini CLI, etc.), no config file is needed at all. The AI tool you're already running IS the LLM — CogOS uses it automatically in host AI mode. cog.yaml is only needed if you want to assign specific agents to cheaper models, or for standalone CLI usage.

The default cog.yaml generated by cog init reflects the host AI default and the per-agent pattern:

# cog.yaml — never committed to git
# "host" means: use whatever AI is already running cog
provider: host

# Optional: point specific agents at cheaper models
agents:
  planner:
    provider: openai
    model: gpt-4o-mini
    api_key: YOUR_KEY_HERE
  document_writer:
    provider: openai
    model: gpt-4o-mini
    api_key: YOUR_KEY_HERE

  # executor:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

  # researcher:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # coder:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

  # reviewer:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # critic:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # tester:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # documenter:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # optimizer:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # security:
  #   provider: openai
  #   model: gpt-4o-mini
  #   api_key: YOUR_KEY_HERE

  # architect:
  #   provider: openai
  #   model: gpt-4o
  #   api_key: YOUR_KEY_HERE

memory_backend: sqlite
modules_path: modules
memory_path: cog_memory.db
log_level: INFO
max_agent_iterations: 20

Generate one with cog init:

$ cog init
Created cog.yaml with host AI mode defaults.
Edit agents: to assign specific roles to cheaper models.

Security: cog.yaml is listed in .gitignore by default. Never commit this file if it contains API keys. For team projects, use environment variables or a secrets manager instead.

CLI Usage

The CLI reads from the config file and environment variables automatically:

# Create config
$ cog init

# Run a task
$ cog run "Create a REST API with authentication"

# Override provider/model per command
$ cog run "Debug this function" --provider openai --model gpt-4o

# Interactive chat
$ cog chat

# Show status (provider, model, modules, tools)
$ cog status

Both cog and cogos commands work identically.

Creating Custom Modules

Use cog create to scaffold a new module with the correct structure:

# Scaffold a new module
$ cog create infra-pulumi --description "Pulumi infrastructure as code"
Created modules/cog-infra-pulumi/
  modules/cog-infra-pulumi/manifest.json
  modules/cog-infra-pulumi/module.py

# Or use the full cog- prefix
$ cog create cog-lang-rust --description "Rust language module"

This generates two files:

manifest.json — module metadata

{
  "name": "cog-infra-pulumi",
  "version": "0.1.0",
  "description": "Pulumi infrastructure as code",
  "capabilities": ["pulumi_operations"],
  "requires": [],
  "permissions": ["shell.execute"],
  "entrypoint": "module.py"
}

module.py — your tools, verifiers, and domain knowledge

The scaffolded module.py includes a working Tool, Verifier, and CogModule class with TODO comments. Each module can provide:

Component	What it does	Method
Tools	Actions CogOS can take (run commands, call APIs, etc.)	`register_tools()`
Verifiers	Health checks (is the tool installed? are creds valid?)	`register_verifiers()`
Prompt extensions	Domain expertise injected into the LLM context	`get_prompt_extensions()`
Capabilities	Tags for task routing (e.g. `s3_operations`)	`get_capabilities()`
Lifecycle hooks	Setup/teardown logic (`on_load`, `pre_execute`, etc.)	Override on `CogModule`

Module naming convention

Module names follow the pattern cog-{category}-{topic}:

# Category examples
cog-cloud-aws        # cloud providers
cog-lang-rust        # programming languages
cog-infra-docker     # infrastructure / DevOps
cog-db-postgres      # databases
cog-framework-nextjs # web frameworks
cog-testing-playwright # testing tools
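Since cog create accepts names with or without the cog- prefix, the normalization it performs could resemble the sketch below. The regular expression (lowercase letters, digits, and hyphens, with at least a category and a topic) is an assumption about what counts as a valid name, and the function is hypothetical.

```python
import re

# Assumed shape: cog-{category}-{topic}, where the topic may contain further hyphens.
NAME_PATTERN = re.compile(r"^cog-[a-z0-9]+-[a-z0-9]+(?:-[a-z0-9]+)*$")


def normalize_module_name(name: str) -> str:
    """Add the cog- prefix when missing and check the category-topic shape."""
    full = name if name.startswith("cog-") else f"cog-{name}"
    if not NAME_PATTERN.match(full):
        raise ValueError(f"expected cog-{{category}}-{{topic}}, got {full!r}")
    return full
```

For example, normalize_module_name("infra-pulumi") and normalize_module_name("cog-infra-pulumi") both return "cog-infra-pulumi", while a bare "pulumi" is rejected because it has no category.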

Sharing modules

Modules are just directories with manifest.json + module.py. To share:

Git repo — push your module directory, users clone into their modules/
Pip package — wrap the module in a Python package with a modules/ entry point
Registry — publish to a CogOS module registry (coming soon)

Tip: The requires field in manifest.json lets you declare dependencies on other modules (e.g. "requires": ["tool-core"]). CogOS resolves activation order automatically.
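Resolving activation order from requires is a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9+) to show the idea; it is not CogOS's resolver, and the dependency graph is invented for illustration.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def activation_order(requires: Dict[str, List[str]]) -> List[str]:
    """Order modules so that each one comes after everything it requires.

    Raises graphlib.CycleError if the requirements form a cycle.
    """
    return list(TopologicalSorter(requires).static_order())


# Hypothetical dependencies: module name -> contents of its "requires" list.
requires = {
    "cog-infra-pulumi": ["cog-cloud-aws", "tool-core"],
    "cog-cloud-aws": ["tool-core"],
}

order = activation_order(requires)
```

Here tool-core is activated first, then cog-cloud-aws, then cog-infra-pulumi. Modules that appear only as dependencies are added to the order automatically.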

Supported Providers

Two provider classes are included. The OpenAI provider works with any OpenAI-compatible endpoint via the base_url parameter:

Provider	Class	Env vars	Notes
OpenAI	`OpenAIProvider`	`OPENAI_API_KEY`	GPT-4o, GPT-4, etc.
Anthropic	`AnthropicProvider`	`ANTHROPIC_API_KEY`	Claude Sonnet, Haiku, etc.
DeepSeek	`OpenAIProvider`	`COG_API_KEY` + `COG_BASE_URL`	OpenAI-compatible endpoint
Zhipu / GLM	`OpenAIProvider`	`COG_API_KEY` + `COG_BASE_URL`	OpenAI-compatible endpoint
OpenRouter	`OpenAIProvider`	`COG_API_KEY` + `COG_BASE_URL`	OpenAI-compatible endpoint
Local (Ollama, etc.)	`OpenAIProvider`	`COG_BASE_URL`	`base_url=http://localhost:11434/v1`
Any custom	`LLMProvider` subclass	Custom	Implement `complete()` and `get_model_name()`

Integrating With AI Tools

CogOS is designed to be embedded by other AI tools. Here's how different tools would integrate:

Claude Code / Anthropic tools

from cog import CogOS
from cog.providers.anthropic_provider import AnthropicProvider

provider = AnthropicProvider(model="claude-sonnet-4-20250514", api_key="sk-ant-...")
cog = CogOS(provider=provider)

OpenAI Codex / GPT tools

from cog import CogOS
from cog.providers.openai_provider import OpenAIProvider

provider = OpenAIProvider(model="gpt-4o", api_key="sk-...")
cog = CogOS(provider=provider)

Custom / OpenAI-compatible endpoints

from cog import CogOS
from cog.providers.openai_provider import OpenAIProvider

# Works with DeepSeek, Zhipu, OpenRouter, Ollama, vLLM, etc.
provider = OpenAIProvider(
    model="deepseek-chat",
    api_key="...",
    base_url="https://api.deepseek.com/v1",
)
cog = CogOS(provider=provider)

Custom provider class

Implement the LLMProvider interface for any LLM not covered by the built-in providers:

from cog.providers.base import LLMProvider, LLMResponse, LLMMessage

class MyProvider(LLMProvider):
    def complete(self, messages, tools=None, temperature=0.1,
                max_tokens=4096, system=None) -> LLMResponse:
        # Call your LLM here
        ...

    def get_model_name(self) -> str:
        return "my-custom-model"

cog = CogOS(provider=MyProvider())

Summary: CogOS is provider-agnostic by design. Pass your existing provider in, and CogOS adds 55 domain modules, multi-agent orchestration, memory, caching, and 70 tools on top. No lock-in, no duplicated config.
