Overview

What consensus.tools is and why it exists.

The problem

Your AI agent just made a decision. Nobody reviewed it. It's irreversible.

  • Fintech — Cleared a $340K wire to an OFAC-sanctioned shell company. FinCEN enforcement starts at $500K.
  • Legal — Signed a $2.4M vendor contract with unlimited liability in section 14.3. Your client data models now belong to the vendor.
  • Healthcare — Pulled psychiatric records for a cardiology recommendation. No treatment relationship. HIPAA fine: $1.5M per category.
  • Customer support — Promised "full refund and legal fees" on a $50K dispute. Auto-approve limit is $5K.

One model. One guess. No review.

Every agentic system today works this way. There's no built-in way to say "wait — let someone else check before this executes."

What consensus.tools does

consensus.tools is a decision firewall for AI agents. Before any high-stakes action executes, multiple specialist agents evaluate it — and they have skin in the game.

  • Guard system — 7 built-in guard types (send_email, code_merge, publish, support_reply, agent_action, deployment, permission_escalation) score risk and vote on whether the action should proceed
  • Four possible decisions — ALLOW, BLOCK, REWRITE, or REQUIRE_HUMAN
  • Policy-driven evaluation — Guards evaluate proposals using configurable policies. Each guard votes independently, and the consensus engine aggregates votes deterministically to produce an auditable decision.
  • Reputation tracking — Over time, the system learns which guards are reliable. Good guards carry more weight. Bad guards lose influence.
  • Full audit trail — Every vote, every risk score, every reason. Queryable months later when someone asks "why did we approve that?"
  • 9 consensus policies — From speed-first (FIRST_SUBMISSION_WINS) to reputation-weighted voting (WEIGHTED_REPUTATION). Pick the algorithm that matches your risk tolerance.

The core idea: trust emerges from independent evaluation, not intent. When three independent specialists all say "block this," you block it. When a guard consistently makes bad calls, their votes stop mattering.
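The reputation mechanic above can be sketched in a few lines. This is an illustrative model, not the actual @consensus-tools API: the type names, the `Vote` shape, and the aggregation rule are assumptions chosen to show how a WEIGHTED_REPUTATION-style policy lets reliable guards carry more weight.

```typescript
type Decision = "ALLOW" | "BLOCK" | "REWRITE" | "REQUIRE_HUMAN";

interface Vote {
  guardId: string;
  decision: Decision;
}

// Illustrative WEIGHTED_REPUTATION-style aggregation: each guard's vote
// counts in proportion to its reputation score, so guards that
// consistently make bad calls lose influence over the outcome.
function weightedDecision(
  votes: Vote[],
  reputation: Map<string, number>,
): Decision {
  const tally = new Map<Decision, number>();
  for (const v of votes) {
    const weight = reputation.get(v.guardId) ?? 1.0; // unknown guards weigh 1.0
    tally.set(v.decision, (tally.get(v.decision) ?? 0) + weight);
  }
  // Pick the decision with the highest total weight; with no votes at
  // all, fail safe to REQUIRE_HUMAN.
  let best: Decision = "REQUIRE_HUMAN";
  let bestWeight = -1;
  for (const [decision, weight] of tally) {
    if (weight > bestWeight) {
      best = decision;
      bestWeight = weight;
    }
  }
  return best;
}
```

Because the tally depends only on the votes and reputation passed in, a well-reputed legal guard voting BLOCK can outweigh a low-reputation guard voting ALLOW.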

How it works in practice

A customer threatens legal action. Your CS agent drafts a response. Before it sends:

  1. Legal reviewer evaluates → BLOCK (unauthorized commitment to cover legal fees)
  2. Compliance officer evaluates → BLOCK (refund exceeds auto-approve limit)
  3. CX analyst evaluates → REWRITE (tone is good, but promises need to be removed)
  4. Consensus engine resolves → BLOCKED. Escalate to human with full audit trail.

The email never sends. The legal exposure never happens. Three months later, when compliance asks "what happened with that escalation?", every vote and reason is in the audit trail.

How it works technically

Agent drafts action ──▶ Guards evaluate in parallel

                 ┌───────────┬───────────┐
                 ▼           ▼           ▼
               Legal     Compliance     CX
               BLOCK       BLOCK     REWRITE
                 │           │           │
                 └───────────┼───────────┘
                             ▼
                     Consensus Policy
                      (APPROVAL_VOTE)
                             │
                             ▼
          ALLOW / BLOCK / REWRITE / REQUIRE_HUMAN
               ┌─────────────┴─────────────┐
               ▼                           ▼
        Execute action             Escalate to human
          (if ALLOW)          (if BLOCK / REQUIRE_HUMAN)

  1. An action is proposed (email draft, PR merge, deployment, permission grant)
  2. Multiple guard personas evaluate it in parallel using configurable policies
  3. A consensus policy aggregates their votes into a final decision
  4. The action executes, gets blocked, gets rewritten, or escalates to a human
  5. Reputation settles — guards that made the right call gain influence
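Steps 1-4 can be sketched as a single evaluation loop. This is a minimal sketch, not the real library surface: the `Proposal`/`Guard` types and the simple APPROVAL_VOTE-style precedence (any BLOCK wins, then REQUIRE_HUMAN, then REWRITE, else ALLOW) are assumptions made for illustration.

```typescript
type Decision = "ALLOW" | "BLOCK" | "REWRITE" | "REQUIRE_HUMAN";

interface Proposal { kind: string; content: string; }
interface Vote { guardId: string; decision: Decision; riskScore: number; reason: string; }
type Guard = (p: Proposal) => Promise<Vote>;

// Steps 1-3: run every guard against the proposal in parallel, then
// aggregate votes with a strictest-decision-wins precedence.
async function evaluate(
  proposal: Proposal,
  guards: Guard[],
): Promise<{ decision: Decision; votes: Vote[] }> {
  const votes = await Promise.all(guards.map((g) => g(proposal)));
  let decision: Decision = "ALLOW";
  if (votes.some((v) => v.decision === "BLOCK")) decision = "BLOCK";
  else if (votes.some((v) => v.decision === "REQUIRE_HUMAN")) decision = "REQUIRE_HUMAN";
  else if (votes.some((v) => v.decision === "REWRITE")) decision = "REWRITE";
  // The full vote list is returned alongside the decision; persisting it
  // is what gives you the queryable audit trail (step 4's escalation
  // path carries these votes with it).
  return { decision, votes };
}
```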

Deterministic by design

Same input → same decision → same audit trail. Every time. No hidden state, no randomness. Policy functions are pure.
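What "pure" means here can be shown with the simplest policy the document names, FIRST_SUBMISSION_WINS. The function signature below is an assumption for illustration, not the actual policy interface; the point is that the decision is a function of the votes alone.

```typescript
type Decision = "ALLOW" | "BLOCK" | "REWRITE" | "REQUIRE_HUMAN";

// A pure policy function: no clock, no randomness, no hidden state.
// An illustrative FIRST_SUBMISSION_WINS: the earliest vote decides;
// with no votes, fail safe to REQUIRE_HUMAN.
function firstSubmissionWins(votes: { decision: Decision }[]): Decision {
  return votes.length > 0 ? votes[0].decision : "REQUIRE_HUMAN";
}
```

Because the function is pure, replaying the recorded votes months later reproduces the exact decision that was made, which is what makes the audit trail independently verifiable.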

What you can guard

Guard type              What it protects             Example hard-block
send_email              Outbound communications      Unauthorized pricing guarantees
code_merge              Pull request merges          SQL injection in diff, failing CI
publish                 Content publishing           PII exposure, blocked-word lists
support_reply           Customer service responses   Unauthorized refund commitments
agent_action            Autonomous agent actions     High-risk API calls, data mutations
deployment              Release deployments          Missing rollback plan, schema incompatibility
permission_escalation   Privilege grants             Off-hours admin access, scope violations
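A hard-block differs from an ordinary vote: it fails the proposal outright before any consensus runs. The sketch below shows what such a rule might look like for the support_reply guard type. The field names, the $5K limit (taken from the support example earlier in this page), and the phrase check are all illustrative assumptions, not the real configuration schema.

```typescript
// Illustrative hard-block rules for a support_reply guard. A non-null
// return value is a hard block with the reason that lands in the audit
// trail; null means the reply proceeds to normal guard voting.
const AUTO_APPROVE_LIMIT_USD = 5_000; // assumed limit, per the support example

interface SupportReply {
  refundAmountUsd: number;
  body: string;
}

function hardBlockReason(reply: SupportReply): string | null {
  if (reply.refundAmountUsd > AUTO_APPROVE_LIMIT_USD) {
    return `refund $${reply.refundAmountUsd} exceeds auto-approve limit $${AUTO_APPROVE_LIMIT_USD}`;
  }
  if (/legal fees/i.test(reply.body)) {
    return "unauthorized commitment to cover legal fees";
  }
  return null;
}
```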

Who it's for

  • Teams shipping AI agents that make decisions with real consequences — sending emails, merging code, deploying services, granting access
  • Compliance-heavy industries (fintech, healthcare, legal) where every AI decision needs an audit trail
  • Platform builders who want consensus-based decision-making as an infrastructure primitive
  • Anyone who's been burned by an AI agent doing something irreversible without asking first

Deployment modes

As a library

Install packages from npm and use the LocalBoard API directly in your application:

pnpm add @consensus-tools/core @consensus-tools/policies

Full stack (server + CLI)

Clone the monorepo, run the local board server on port 9888, and use the CLI:

git clone https://github.com/consensus-tools/consensus-tools.git
cd consensus-tools && pnpm install && pnpm build
pnpm --filter @consensus-tools/local-board dev

MCP (for Claude Code)

Use 29 consensus tools directly inside Claude Code via the Model Context Protocol:

{
  "mcpServers": {
    "consensus-tools": {
      "command": "npx",
      "args": ["@consensus-tools/mcp"]
    }
  }
}

Hosted service (coming soon)

A hosted platform with managed infrastructure and persistent state.

Next steps

Build a working guard system in 5 minutes: Zero to Hero.