Acta fori · the engineering record

How the forum was built — and proven

E disputatione veritas: from debate, truth

AGORA was not merely designed by the method it embodies — it was built by it. This is the record: how a council of diverse minds shaped the system, the sessions that caught real flaws before they shipped, a live worked example auditing a real security project, and how to wield the boost.

For: Evgeny, author of the concept
Date: 8 June 2026
State: P0–P8 in service
Verdict: GO for cross-LLM trials

Quid et cui · the aim

A universal council, not an industry point-solution

The widest possible use: any team connects its own AI and convenes a forum to think harder.

AGORA is a self-hosted MCP server: any team connects its own AI over MCP and convenes a forum of diverse-model personas to debate a task — better coding, review and architecture, research, project and strategic decisions, incident analysis. Claims become verifiable predictions, the system learns from real verified outcomes, it builds memory — and deliberately does not surveil people.

The idea is simple and universal: one model's answer is fragile; truth is born in argument between different minds. AGORA is for everyone who wants to build more effectively with their own AI — and for everyone who believes that E disputatione veritas.

Cui bono

For whom

Any development team with its own AI; researchers; anyone who decides hard questions and believes truth comes from debate.

Exemplum

An adopter

payneteasy connects its AI to AGORA. We took their demanding domain (payments, AML/KYC/PCI) as a stress test: if AGORA passes payments, it passes anything.

Principium

The principle

E disputatione veritas. Diversity preserves dissent; a verifiable outcome outweighs an opinion.

The consensus model (Evgeny): MCP vs A2A. The boundary is where inference passes: through your own audited gateway, or past it, inside someone else's agent. The v1 decision is MCP (all inference under audit, in any environment); A2A is reserved for genuinely external agents later.
Ratio · the method

Collective mind

We built the council with councils. Every contested decision was deliberated before it was built.

7 council sessions
4 review dimensions
27 MCP tools
9 spec phases, P0–P8

Hub ↔ node council

Two independent legates debate a question; truth is settled neither by rhetoric nor by a vote, but by a deterministic probe on a real box (probe-falsify). A working prototype of AGORA itself.
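A minimal sketch of the probe-falsify idea, assuming a claim can be written as a predicate over a measured observation. The names `Claim`, `falsify` and `fake_probe` are hypothetical illustrations, not AGORA's API, and the observed values are invented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Claim:
    """A prediction stated so that a measurement can refute it (hypothetical shape)."""
    author: str
    statement: str
    predicate: Callable[[dict], bool]  # True if the observation agrees with the claim


def falsify(claim: Claim, probe: Callable[[], dict]) -> str:
    """Run the probe once and return a verdict; no model is consulted."""
    observation = probe()
    return "upheld" if claim.predicate(observation) else "refuted"


def fake_probe() -> dict:
    # Stand-in for a measurement on a real box; the values are invented.
    return {"port_8443_open": False, "tls_version": "1.3"}


claims = [
    Claim("legate-a", "port 8443 is closed", lambda o: o["port_8443_open"] is False),
    Claim("legate-b", "TLS is older than 1.3", lambda o: o["tls_version"] < "1.3"),
]
verdicts = {c.author: falsify(c, fake_probe) for c in claims}
print(verdicts)
```

The verdict depends only on the observation, so the more persuasive legate gains nothing from being persuasive.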

Multi-agent review

Parallel reviewers per dimension — security · code · architecture · operations — followed by synthesis.

The principle built into the product (the forum's very first verdict): LLM consensus is for judgement quality, not for the safety gate. The irreversible is gated by deterministic, non-LLM checks — never by a model's vote.
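One way to picture this principle, as a sketch under assumptions: the gate receives the model votes only so they can be audited, and its decision reads nothing but deterministic checks. The function names and the two example checks are hypothetical, chosen for illustration.

```python
from typing import Callable, Dict, List

Check = Callable[[dict], bool]


def gate_irreversible(action: dict, checks: List[Check], model_votes: Dict[str, bool]) -> bool:
    """Allow an irreversible action only if every deterministic check passes.

    model_votes is accepted so it can be logged for audit, but it is never
    read when deciding: a unanimous forum cannot open the gate.
    """
    return all(check(action) for check in checks)


def has_rollback_plan(action: dict) -> bool:
    # Hypothetical check: the action must carry a non-empty rollback plan.
    return bool(action.get("rollback"))


def targets_staging(action: dict) -> bool:
    # Hypothetical check: only staging may be touched.
    return action.get("target") == "staging"


votes = {"faber": True, "scepticus": True, "architectus": True}
checks = [has_rollback_plan, targets_staging]

risky = {"target": "production", "rollback": None}
safe = {"target": "staging", "rollback": "restore snapshot"}

print(gate_irreversible(risky, checks, votes))  # unanimous approval, still blocked
print(gate_irreversible(safe, checks, votes))
```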
Structura · architecture

What was built

A four-layer control plane; inference external; a closed learning loop.

Agents — Claude Code · scripts · people — speak over MCP
↓ MCP
AGORA — forum · roles · executor · 3 registries · memory · routing (27 tools)
↓ keyed · budgeted · traced
LiteLLM → OpenRouter gateway — 400+ models across families, under audit
Postgres — state, registries (isolated schema)
Circulus · the learning loop

Learning from verified reality, not from "likes"

I. Forum: debate across families
II. Verifiable claims: falsify verdict, no LLM
III. Deferred outcomes: tests / returned-broken
IV. Re-appraisal: the dissenter rehabilitated
V. Model posteriors: route by proven fact

↑ the outcome returns to the forum — the system learns on verified reality, not on approval.

The hard spec principles are honored in code and schema: verifiable>opinion (P1), relevance≠truth with two separate appraisals (P2), protected dissent (P3), exploration with a probability floor (P4), "improve the system, do not police people" (P5), canon freshness and compliance (P6–P7).
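A sketch of how model posteriors and the probability floor of P4 might fit together. The source names a bandit and a floor but not the algorithm, so everything here is an assumption: a Beta posterior per model, updated only on verified outcomes, with routing weights that mix a uniform floor with the posterior means. `Posterior` and `routing_probs` are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass
class Posterior:
    """Beta(alpha, beta) belief about how often a model's claims verify."""
    alpha: float = 1.0
    beta: float = 1.0

    def update(self, verified: bool) -> None:
        # Only a verified outcome moves the belief, never an approval.
        if verified:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def routing_probs(posteriors: dict, floor: float) -> dict:
    """Mix a uniform floor with mean-proportional weights.

    Every model keeps at least `floor` probability, so a model with a poor
    record is still sampled occasionally. Requires floor * n <= 1.
    """
    n = len(posteriors)
    if floor * n > 1:
        raise ValueError("floor too large for this many models")
    total = sum(p.mean for p in posteriors.values())
    spare = 1.0 - floor * n
    return {name: floor + spare * p.mean / total for name, p in posteriors.items()}


posteriors = {"model-a": Posterior(), "model-b": Posterior(), "model-c": Posterior()}
for _ in range(20):
    posteriors["model-a"].update(True)   # twenty verified claims
    posteriors["model-c"].update(False)  # twenty refuted claims

probs = routing_probs(posteriors, floor=0.05)
choice = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print(probs, choice)
```

The floor is what keeps exploration alive: even after twenty refuted claims, model-c is still routed to at least 5% of the time and can earn its way back.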

Concilia · the councils

What was debated, and decided

Each session caught a real defect before it was built — the method in action.

Executor topology for claims about a payment node
A probe refuted a convenient assumption: "read-only delegation" flowed through a mode-blind gateway — a permitted caller could send a mutating mode.
Decision: AGORA never touches the payment box directly — only transit through the hub-relay; it stays outside the PCI perimeter.
Prompt storage vs PCI / GDPR
"Opt-in by task type" does not bound PII — the task classifies the operation, not the pasted text.
Decision: full-prompt capture off by default; observability self-hosted in-region, not cloud.
Compliance gate (PCI) · DELTA
Saved from "compliance theater": a detector-as-primary-barrier drags the whole forum into the PCI perimeter and gives a false-negative on an obfuscated PAN.
Decision: the primary control is tokenization above AGORA's boundary (the card never reaches the gate → the forum is outside the perimeter). The gate is a defense-in-depth tripwire.
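To illustrate why a detector suits the role of tripwire and not of primary barrier, here is a minimal sketch assuming a regex candidate match followed by a Luhn checksum. It is not AGORA's gate; the pattern and function names are hypothetical, and the card number is the widely used test value 4111 1111 1111 1111.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def tripwire(text: str) -> bool:
    """Return True if the text appears to contain a card number."""
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            return True
    return False


plain = "card 4111 1111 1111 1111 was declined"
obfuscated = "card 4111.1111.1111.1111 was declined"

print(tripwire(plain))
print(tripwire(obfuscated))
```

The second input slips past this detector because dots are not among the separators it recognizes. Any fixed pattern has such blind spots, which is the false negative the council named and the reason tokenization sits upstream.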
Readiness for cross-LLM trials
The learning loop risks "learning that persuasive garbage wins" when weak models are present.
Decision: during trials the bandit runs in shadow mode (logs, does not update) until human-labeled ground truth exists.
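Shadow mode can be sketched as a single flag in front of the update step. This is an illustration under assumptions, not the shipped bandit: `ShadowBandit` and its fields are hypothetical names, and the counts stand in for whatever routing state the real system keeps.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowBandit:
    """Success/failure counts per model, with updates gated by a shadow flag."""
    shadow: bool = True
    counts: dict = field(default_factory=dict)  # model -> [successes, failures]
    log: list = field(default_factory=list)     # every outcome, applied or not

    def record(self, model: str, success: bool) -> None:
        self.log.append((model, success, "shadow" if self.shadow else "applied"))
        if self.shadow:
            return  # observe only: routing state must not move
        wins_losses = self.counts.setdefault(model, [0, 0])
        wins_losses[0 if success else 1] += 1


bandit = ShadowBandit()
bandit.record("model-a", True)  # logged, not learned
bandit.shadow = False           # human-labeled ground truth now exists
bandit.record("model-a", True)  # logged and learned
print(bandit.counts, len(bandit.log))
```

Because the log keeps every outcome, the trial period leaves a full record to review before any of it is allowed to influence routing.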
Sandbox / cross-box relay (deferred pieces)
Docker is not a boundary against hostile code; an untyped relay is a confused-deputy across the boundary.
Decision: the design is vetted (isolated runtime + a narrow typed RPC); the build is its own session.
End-to-end review (4 dimensions). Parallel reviewers found an RCE-class vulnerability in our own executor and adjacent holes. Every critical one was fixed and re-verified (attacks rejected, legitimate probes work). This is literally AGORA's thesis — the forum catches what a single author misses — proven on its own construction.
Disputatio viva · live debate

A real convening of the forum

Genuine disagreement across model families, then a synthesis. Motion: "parallel or debate mode by default for engineering decisions in a payments organization?"

FABER · GPT-4o · OpenAI · the builder

For parallel mode: fast independent contributions, then a merge.

SCEPTICUS · Hermes-405B · Nous · the skeptic · dissenter

Against parallel: it forfeits deep working-through; in payments, reliability outweighs speed.

ARCHITECTUS · Claude-Sonnet-4.6 · Anthropic · the architect

For debate mode: asymmetric risk — the cost of a missed objection is many times the gain from speed.

MODERATOR · Gemini · Google · synthesis · the facilitator

Consensus: payment decisions carry asymmetric risk. Live disagreement: builder for speed, architect+skeptic for depth. A recommendation, with its single biggest risk named.

The product per spec: not one confident answer, but preserved disagreement across families plus a governed synthesis. All under audit, persisted in the database.

Panel · the assembly

Professors and masters

Atop a premium core, free models as extra voices — more uncorrelated families at zero cost.

🎓 Professors — premium, the reliable core

GPT-4o · OpenAI  •  Hermes-405B · Nous  •  Claude-Sonnet-4.6 · Anthropic  •  Gemini · Google

📜 Masters — free, best-effort

Nemotron · Nvidia  •  Kimi · Moonshot  •  GLM · Zhipu  •  Dolphin · Mistral  •  Qwen · Alibaba

A wide panel: up to 8 voices from 9 families. Masters add diversity when they answer; if one declines (for example, on a rate limit) the round does not break, and the synthesis proceeds on the active voices. "Masters under the professors": they help, they do not gate the panel. The added cost is zero.
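A sketch of "they help, they do not gate the panel", assuming each voice is a callable that may decline. The `Declined` exception, the `convene` function and the stand-in answers are hypothetical. Letting a professor's failure propagate is also an assumption, since the source only says that masters do not gate the round.

```python
from typing import Callable, Dict


class Declined(Exception):
    """Raised by a voice that cannot answer, for example when rate-limited."""


def convene(professors: Dict[str, Callable[[str], str]],
            masters: Dict[str, Callable[[str], str]],
            task: str) -> Dict[str, str]:
    """Collect answers; professors are required, masters are best-effort."""
    answers: Dict[str, str] = {}
    for name, ask in professors.items():
        answers[name] = ask(task)   # a failure here propagates to the caller
    for name, ask in masters.items():
        try:
            answers[name] = ask(task)
        except Declined:
            continue                # skip the voice, keep the round
    return answers


def rate_limited(task: str) -> str:
    # Stand-in for a free model that refuses under load.
    raise Declined("429")


panel = convene(
    professors={"faber": lambda t: "parallel", "architectus": lambda t: "debate"},
    masters={"qwen": lambda t: "debate", "kimi": rate_limited},
    task="parallel or debate by default?",
)
print(sorted(panel))
```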

Exitus · the verdict

What the system concluded

What is in service, what is proven, and the decision on the next step.

GO
Ready for multi-LLM council trials, with the fixes in place: executor vulnerabilities closed, the forum robust to weak/flaky models, learning in shadow mode (weak models cannot poison the system), a compliance gate on every egress.

In service and proven

27 MCP tools, P0–P8: gateway with budgets, forum, verifiable claims, the outcome + re-appraisal loop, bandit, compliance gate, memory, aggregate observability. Network and DB isolation, secrets-in-files.

Invariants

AGORA outside the PCI perimeter; the payment node untouched; the forum is decision-support, not a safety gate; only verified outcomes move the routing.

Conclusion. The AGORA specification is implemented in full and runs. The collective-mind method not only built the system, it caught real flaws before they became problems — confirming AGORA's central thesis on its own construction. The next step is the first labeled debates to accumulate ground truth, after which the learning loop comes out of shadow mode.
Sequitur · what follows

See the method at work

The worked audits and the usage guide now live on their own pages.