Core Concepts¶

Understand the fundamentals of ORC's competitive orchestration model.

The Mapping Table¶

ORC gives each technical concept a fantasy-themed name:

ORC Theme	AI Concept	What It Does
Warrior	Agent	An AI assistant with an LLM, system prompt, and claimed domains
Elder	Judge	Evaluates warrior submissions and determines winners
Warchief	Leader	The current champion of a domain (the "warlord" in the core API and the flow diagram); holds power until defeated
The Arena	Orchestrator	Manages competition, tracks reputation, executes trials
Trial	Head-to-head evaluation	Two warriors compete on the same task; Elder judges
Reputation	Performance score	Tracked per domain; affects challenge probability
Domain	Capability area	A category of tasks (e.g., "backend", "data analysis")
Succession	Leadership change	When a challenger beats the Warchief and takes the throne

The Flow¶

Here's what happens when a task enters The Arena:

┌─────────────────────────────────────────────┐
│ 1. TASK ARRIVES                             │
│    "Optimize database connection pooling"   │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 2. DOMAIN INFERRED                          │
│    Task likely belongs to "backend" domain  │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 3. WARLORD IDENTIFIED                       │
│    Who currently rules "backend"?           │
│    Let's say: Grok (reputation: 0.85)       │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 4. WARLORD ATTEMPTS TASK                    │
│    Grok processes: "Optimize..."            │
│    Returns: TaskResult with response        │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 5. CHALLENGE DECISION                       │
│    Does any other warrior challenge?        │
│    Probability = challenge_probability      │
│    (adjusted by reputation gap)             │
└────────────┬────────────────────────────────┘
        YES  │  NO
            v      v
    ┌──────────┐  ┌──────────────────────┐
    │ 6. TRIAL │  │ Warlord keeps domain │
    └──┬───────┘  └──────────────────────┘
       │
       v
┌─────────────────────────────────────────────┐
│ 7. CHALLENGER EMERGES                       │
│    Thrall (reputation: 0.72) challenges     │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 8. PARALLEL EXECUTION                       │
│    Both attempt same task:                  │
│    Grok   → TaskResult {response, duration} │
│    Thrall → TaskResult {response, duration} │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 9. ELDER JUDGES                             │
│    Evaluates both submissions               │
│    Returns: Verdict with winner             │
└────────────┬────────────────────────────────┘
             │
             v
┌─────────────────────────────────────────────┐
│ 10. DEFENSE (if warlord won)                │
│    No change: Grok stays warlord            │
│    Reputation updated:                      │
│    - Grok: +0.05 (defense bonus)            │
│    - Thrall: -0.02 (loss penalty)           │
│                                             │
│ OR SUCCESSION (if challenger won)           │
│    Thrall becomes new warlord of "backend"  │
│    Reputation updated:                      │
│    - Thrall: +0.1 (victory bonus)           │
│    - Grok: -0.05 (loss penalty)             │
└────────────┬────────────────────────────────┘
             │
             v
      TASK COMPLETE
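
To make step 10 concrete, the sketch below applies the deltas shown in the diagram to the example reputations (Grok 0.85, Thrall 0.72). The helper `apply_outcome` and the dictionary layout are illustrative assumptions, not part of ORC's API; only the numbers come from the diagram.

```python
# Illustrative only: apply_outcome is a hypothetical helper, not an ORC function.
# The deltas are the ones shown in step 10 of the diagram.
DEFENSE_DELTAS = {"warlord": +0.05, "challenger": -0.02}     # warlord wins
SUCCESSION_DELTAS = {"warlord": -0.05, "challenger": +0.10}  # challenger wins


def apply_outcome(reputation, warlord, challenger, challenger_won):
    """Return (updated reputations, name of the warlord after the trial)."""
    deltas = SUCCESSION_DELTAS if challenger_won else DEFENSE_DELTAS
    updated = dict(reputation)  # leave the input untouched
    updated[warlord] = round(updated[warlord] + deltas["warlord"], 2)
    updated[challenger] = round(updated[challenger] + deltas["challenger"], 2)
    return updated, (challenger if challenger_won else warlord)


backend = {"Grok": 0.85, "Thrall": 0.72}
defended, leader_after_defense = apply_outcome(
    backend, "Grok", "Thrall", challenger_won=False
)
succeeded, leader_after_succession = apply_outcome(
    backend, "Grok", "Thrall", challenger_won=True
)
print(defended, leader_after_defense)
print(succeeded, leader_after_succession)
```

After a defense Grok rises to 0.90 and Thrall drops to 0.70. After a succession Thrall holds the domain at 0.82 and Grok drops to 0.80.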

Warriors¶

A Warrior is an AI agent in The Arena.

from orc import Warrior

warrior = Warrior(
    name="Grok",
    llm_client="gpt-4o",           # LLM to use (string or provider instance)
    system_prompt="You are a...",  # Defines expertise
    temperature=0.7,               # LLM temperature
    capabilities=["code_review"],  # What this warrior can do
    domains=["backend"],           # Domains this warrior claims
)

Properties:

name — Unique identifier
llm_client — The LLM (OpenAI, Anthropic, Ollama, or "mock")
system_prompt — Expertise definition (system role)
temperature — Sampling temperature (0 is near-deterministic; higher values give more varied output)
capabilities — Skills (descriptive)
domains — Claimed expertise areas (triggers competitions)

Key insight: Warriors don't request permission. They claim domains. If another warrior claims the same domain, competition happens.

Domains and Overlaps¶

Domains are how ORC triggers competition.

Example:

# Grok claims "backend" and "python"
grok = Warrior(..., domains=["backend", "python"])

# Thrall also claims "backend" — competition!
thrall = Warrior(..., domains=["backend", "infrastructure"])

# Sylvanas claims "python" — also competition!
sylvanas = Warrior(..., domains=["python", "devops"])
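
To see which domains are contested in the example above, you can group the claims by domain. This is a standalone sketch: the `claims` dictionary and the `contested_domains` helper are hypothetical and do not use ORC itself.

```python
from collections import defaultdict

# Hypothetical stand-in for the warriors' domain claims shown above.
claims = {
    "Grok": ["backend", "python"],
    "Thrall": ["backend", "infrastructure"],
    "Sylvanas": ["python", "devops"],
}


def contested_domains(claims):
    """Map each domain claimed by two or more warriors to its claimants."""
    by_domain = defaultdict(list)
    for warrior, domains in claims.items():
        for domain in domains:
            by_domain[domain].append(warrior)
    # Keep only domains with more than one claimant.
    return {d: sorted(w) for d, w in by_domain.items() if len(w) > 1}


overlaps = contested_domains(claims)
print(overlaps)
```

Only "backend" and "python" have more than one claimant, so only those two domains can produce trials.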

When a task relates to a domain:

1. The current Warlord for that domain handles it
2. Other warriors claiming that domain might challenge
3. The chance of a challenge is set by the challenge_probability setting in ArenaConfig
4. If a challenge happens, a Trial is executed
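
The challenge decision can be pictured as a single random draw against `challenge_probability`. The sketch below is an assumption about the general shape, not ORC's implementation: `pick_challenger` is a hypothetical helper, the rival is chosen uniformly at random, and the reputation-gap adjustment is left out because the formula is not given here.

```python
import random


def pick_challenger(claimants, warlord, challenge_probability, rng):
    """Return a challenger name, or None if no challenge happens this time."""
    rivals = [name for name in claimants if name != warlord]
    if not rivals or rng.random() >= challenge_probability:
        return None
    return rng.choice(rivals)  # assumption: uniform choice among rivals


rng = random.Random(42)  # seeded so the run is repeatable
rounds = 10_000
challenges = sum(
    pick_challenger(["Grok", "Thrall"], "Grok", 0.3, rng) is not None
    for _ in range(rounds)
)
challenge_rate = challenges / rounds
print(f"challenge rate: {challenge_rate:.3f}")
```

Over many tasks the observed challenge rate settles near the configured probability, so a value of 0.3 means roughly three in ten tasks in a contested domain lead to a trial.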

Trials¶

A Trial is head-to-head combat: same task, two warriors, Elder judges.

from orc.arena.trial import Trial

trial = Trial(
    task="Optimize database queries",
    domain="backend",
    warlord=grok,
    challenger=thrall,
    judge=elder.judge,
    timeout=300,  # seconds
    parallel=True,  # Run attempts in parallel
)

# Top-level await is shown for brevity; run this inside an async function
result = await trial.execute()
print(f"Winner: {result.winner}")

Trial process:

1. Both warriors receive the same task
2. Both attempt to solve it (in parallel or sequentially)
3. The Elder receives both TaskResults
4. The Elder evaluates them against its criteria (accuracy, speed, clarity, etc.)
5. The Elder declares a winner (or a tie)
6. Reputations are updated
7. If the challenger won, it becomes the new Warlord
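
The trial process can be sketched with plain asyncio. Everything here is a stand-in: `attempt` sleeps instead of calling an LLM, `fastest_wins` is a toy judge, and `run_trial` is a hypothetical function rather than ORC's `Trial` class.

```python
import asyncio


async def attempt(name, task, delay):
    """Stand-in for a warrior's LLM call; sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return {"warrior": name, "response": f"{name}: {task}", "duration": delay}


def fastest_wins(a, b):
    """Toy judge: the faster attempt wins. A real Elder would weigh quality too."""
    return a["warrior"] if a["duration"] <= b["duration"] else b["warrior"]


async def run_trial(task, warlord, challenger, timeout=5.0):
    # Both warriors get the same task and run concurrently.
    both = asyncio.gather(
        attempt(warlord, task, delay=0.02),
        attempt(challenger, task, delay=0.01),
    )
    # One timeout covers the whole trial.
    warlord_result, challenger_result = await asyncio.wait_for(both, timeout)
    # The judge sees both results and names a winner.
    winner = fastest_wins(warlord_result, challenger_result)
    # Leadership changes only if the challenger won.
    return {"winner": winner, "succession": winner == challenger}


outcome = asyncio.run(run_trial("Optimize database queries", "Grok", "Thrall"))
print(outcome)
```

Running both attempts with `asyncio.gather` keeps the trial's wall-clock time close to the slower of the two attempts rather than their sum.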

Reputation System¶

Reputation tracks performance per domain, per warrior.

# Check a warrior's reputation
rep = arena.get_reputation("Grok", "backend")
print(f"Grok's backend reputation: {rep}")

# Get full leaderboard for a domain
leaderboard = arena.get_leaderboard("backend", limit=10)
for entry in leaderboard:
    print(f"{entry['agent']}: {entry['reputation']:.2f}")
    if entry['is_warlord']:
        print("  ^ Current Warlord")
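
The loop above relies on each leaderboard entry carrying the keys `agent`, `reputation`, and `is_warlord`. As a rough picture of that shape, the hypothetical `build_leaderboard` below produces the same kind of entries from a plain dictionary. It is a sketch, not ORC's implementation, and the Sylvanas score is made up for the example.

```python
def build_leaderboard(reputations, warlord, limit=10):
    """Rank agents by reputation, highest first, and flag the current warlord."""
    ranked = sorted(reputations.items(), key=lambda item: item[1], reverse=True)
    return [
        {"agent": name, "reputation": rep, "is_warlord": name == warlord}
        for name, rep in ranked[:limit]
    ]


# Hypothetical reputations for the "backend" domain.
backend = {"Thrall": 0.72, "Sylvanas": 0.50, "Grok": 0.85}
board = build_leaderboard(backend, warlord="Grok", limit=2)
for entry in board:
    marker = "  <- warlord" if entry["is_warlord"] else ""
    print(f"{entry['agent']}: {entry['reputation']:.2f}{marker}")
```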

Reputation mechanics:

Starting: Default value (0.5)
On trial win: +0.1 reputation
On trial loss: -0.05 reputation
Defense bonus: Additional +0.05 per defense (if warlord wins)
Decay: 0.01 per hour without defending (optional)
Forced rotation: After N consecutive defenses, warlord is rotated (prevents stagnation)
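
Decay and forced rotation are the two mechanics that act on a warlord over time. The sketch below assumes decay is linear (0.01 subtracted per idle hour), that reputation never drops below zero, and that rotation triggers once the defense count reaches the limit. All three are assumptions made for illustration, and both helpers are hypothetical.

```python
def decayed_reputation(reputation, idle_hours, decay_per_hour=0.01, floor=0.0):
    """Linear decay for an idle warlord; the floor is an assumption."""
    return max(floor, round(reputation - decay_per_hour * idle_hours, 4))


def must_rotate(consecutive_defenses, max_consecutive_defenses=10):
    """True once the warlord has reached the consecutive-defense limit."""
    return consecutive_defenses >= max_consecutive_defenses


after_five_hours = decayed_reputation(0.85, idle_hours=5)
print(after_five_hours)
print(must_rotate(9), must_rotate(10))
```

Under these assumptions, a warlord at 0.85 that goes five hours without defending ends up at 0.80, the same cost as losing one trial.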

The Elder (Judge)¶

An Elder evaluates trial outcomes.

ORC provides three judges:

MetricsJudge (No LLM)¶

Uses numeric metrics:

from orc.judges import MetricsJudge
from orc import Elder

judge = MetricsJudge(weights={
    "accuracy": 0.5,
    "speed": 0.3,
    "clarity": 0.2,
})

elder = Elder(judge=judge)

Best for: Testing, local development, deterministic evaluation.
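
A weighted-metrics judge comes down to a weighted sum per submission. The sketch below uses the weights from the example above. The per-metric scores are made up, and `weighted_score` is a hypothetical helper rather than MetricsJudge's actual scoring code.

```python
WEIGHTS = {"accuracy": 0.5, "speed": 0.3, "clarity": 0.2}


def weighted_score(metrics, weights=WEIGHTS):
    """Weighted sum of per-metric scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * metrics.get(name, 0.0) for name in weights)


# Hypothetical metric scores for two submissions.
grok_metrics = {"accuracy": 0.9, "speed": 0.6, "clarity": 0.8}
thrall_metrics = {"accuracy": 0.8, "speed": 0.9, "clarity": 0.7}

scores = {
    "Grok": weighted_score(grok_metrics),
    "Thrall": weighted_score(thrall_metrics),
}
winner = max(scores, key=scores.get)
print({name: round(score, 2) for name, score in scores.items()}, "->", winner)
```

Grok scores 0.79 and Thrall 0.81: Thrall's speed advantage outweighs Grok's edge in accuracy and clarity under these weights.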

LLMJudge (LLM-powered)¶

Uses an LLM to evaluate:

from orc.judges import LLMJudge
from orc import Elder
from dynabots_core.providers import OllamaProvider

llm = OllamaProvider(model="qwen2.5:72b")

judge = LLMJudge(
    llm,
    criteria=["accuracy", "completeness", "efficiency"],
)

elder = Elder(judge=judge)

Best for: Production, nuanced evaluation, subjective criteria.

ConsensusJudge (Multiple judges)¶

Combines several judges and decides by vote:

from orc.judges import ConsensusJudge, LLMJudge, MetricsJudge
from orc import Elder

# llm1 and llm2 are LLM provider instances created beforehand
judge = ConsensusJudge([
    LLMJudge(llm1, criteria=["accuracy"]),
    MetricsJudge(weights={"speed": 1.0}),
    LLMJudge(llm2, criteria=["clarity"]),
])

elder = Elder(judge=judge)

Best for: High-stakes decisions, reducing bias.
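
One simple way to combine several verdicts is a majority vote. The sketch below assumes each judge returns the winner's name and that a split at the top counts as a tie. How ConsensusJudge actually aggregates is not described here, so treat `majority_vote` as a hypothetical illustration.

```python
from collections import Counter


def majority_vote(votes, tie_result="tie"):
    """Return the name with the most votes, or tie_result if the top two are level."""
    if not votes:
        return tie_result
    ranked = Counter(votes).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return tie_result
    return ranked[0][0]


three_judges = majority_vote(["Grok", "Thrall", "Grok"])
two_judges = majority_vote(["Grok", "Thrall"])
print(three_judges)
print(two_judges)
```

An odd number of judges avoids most ties, which is one reason the example above combines three.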

The Arena¶

The Arena is the orchestration engine.

from orc import Arena, ArenaConfig

arena = Arena(
    agents=[grok, thrall, sylvanas],
    judge=elder.judge,
    config=ArenaConfig(
        challenge_probability=0.3,  # 30% chance to challenge
        min_reputation_to_challenge=0.2,  # Min rep to initiate challenge
        challenge_cooldown_seconds=300,  # Cooldown after losing
        max_consecutive_defenses=10,  # Force rotation after 10 wins
    ),
)

# Process a task (top-level await shown for brevity; run inside an async function)
result = await arena.process("Your task here")

Or use the themed wrapper:

from orc import TheArena

arena = TheArena(
    warriors=[grok, thrall, sylvanas],
    elder=elder,
    challenge_probability=0.3,
)

# Same thing, themed API
result = await arena.battle("Your task here")

Configuration¶

Control arena behavior with ArenaConfig:

Setting	Default	Meaning
`challenge_probability`	0.3	Base probability of challenge on overlap
`min_reputation_to_challenge`	0.2	Minimum reputation to initiate challenge
`challenge_cooldown_seconds`	300	Cooldown after losing a challenge
`min_trials_for_leadership`	1	Trials needed to become warlord
`leadership_decay_rate`	0.01	Reputation decay per hour idle
`max_consecutive_defenses`	10	Force rotation after N defenses
`trial_timeout_seconds`	300	Timeout for a single trial
`parallel_trial_execution`	True	Run trial attempts in parallel
`default_reputation`	0.5	Starting reputation for new agents
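
Two of these settings gate who may challenge at all: `min_reputation_to_challenge` and `challenge_cooldown_seconds`. The sketch below shows how such a gate could work. `GateSettings` and `can_challenge` are hypothetical names that mirror the table's defaults, not ORC's ArenaConfig class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateSettings:
    """Hypothetical mirror of two rows of the settings table."""
    min_reputation_to_challenge: float = 0.2
    challenge_cooldown_seconds: float = 300


def can_challenge(settings, reputation, seconds_since_last_loss=None):
    """Check the reputation floor, then the post-loss cooldown.

    seconds_since_last_loss is None for a warrior that has never lost.
    """
    if reputation < settings.min_reputation_to_challenge:
        return False
    if (
        seconds_since_last_loss is not None
        and seconds_since_last_loss < settings.challenge_cooldown_seconds
    ):
        return False
    return True


defaults = GateSettings()
print(can_challenge(defaults, reputation=0.5))
print(can_challenge(defaults, reputation=0.1))
print(can_challenge(defaults, reputation=0.5, seconds_since_last_loss=120))
print(can_challenge(GateSettings(challenge_cooldown_seconds=60), 0.5, 120))
```

Shortening the cooldown lets a recently defeated warrior re-enter sooner, which makes leadership in a contested domain change hands more often.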

Warchiefs¶

A Warchief is the warrior that currently leads a domain. It holds the title until a challenger beats it in a trial.

from orc import TheArena

arena = TheArena(warriors=[grok, thrall], elder=elder)

# Get the current warchief for a domain
warchief = arena.get_warchief("backend")
print(f"Backend Warchief: {warchief.name}")
print(f"Reputation: {warchief.reputation}")
print(f"Warband size: {len(warchief.warband)}")

A Warchief has:

name — Warrior name
domain — Domain they control
reputation — Performance score
warband — Defeated warriors they command

Summary¶

Concept	Role
Warrior	Competes for domain leadership
Domain	Category of tasks; overlap triggers competition
Trial	Head-to-head evaluation; Elder judges
Warchief	Current domain leader
Reputation	Performance metric; affects challenge probability
Elder	Judge; evaluates quality
The Arena	Engine; orchestrates everything
Succession	Leadership change; new warchief crowned

Next: Model Showdown — See it in action with real LLMs.