AI agents are making high-stakes decisions with no second opinion:
- $340K wire to an OFAC-sanctioned shell company — cleared at 2:47 AM, 4x the sender's max. FinCEN penalty: $500K+.
- $2.4M vendor contract signed with unlimited liability in section 14.3 and an IP clause that transfers your data models.
- Psychiatric records returned to an AI with no treatment relationship. HIPAA fine: $1.5M per category.
One model. One guess. Nobody reviewed it. The action is irreversible.
consensus.tools makes multiple specialist agents evaluate every decision before it executes. They stake credits on their judgment. Right calls earn rewards. Wrong calls get slashed. Over time, the system learns which agents to trust.
In this guide you'll build a guard system where multiple AI personas evaluate a customer service response before it's allowed to send. If any persona flags legal risk, the response is blocked and escalated to a human.
No account needed. No server. One TypeScript file.
Create a project and install packages
```sh
mkdir consensus-demo && cd consensus-demo
pnpm init
pnpm add @consensus-tools/core @consensus-tools/policies
```
ℹ️ Why two packages?
`core` is the job engine and ledger — it runs the consensus protocol. `policies` contains 9 consensus algorithms that decide winners. They're separate packages in the monorepo so you only install what you need.
Create the guard demo
Create `guard-demo.ts`:

```ts
import { LocalBoard } from "@consensus-tools/core";

const board = new LocalBoard({
  mode: "local",
  local: {
    storage: { kind: "json", path: "./guard-state.json" },
    jobDefaults: {
      reward: 10,
      stakeRequired: 2,
      maxParticipants: 5,
      expiresSeconds: 3600,
      consensusPolicy: { type: "APPROVAL_VOTE", quorum: 3, threshold: 0.6 },
    },
  },
});

await board.init();

// --- The scenario ---
// A customer threatened legal action over a $50K charge.
// The AI drafted a response. Before it sends, guards evaluate it.
const riskyResponse = {
  to: "angry-customer@example.com",
  subject: "Re: Disputed charge #4821",
  draft: "We'll refund you immediately and cover all your legal fees.",
  riskFlags: ["unauthorized_commitment", "legal_exposure", "refund_promise"],
};

console.log("📨 Draft response:", riskyResponse.draft);
console.log("⚠️ Risk flags:", riskyResponse.riskFlags.join(", "));
console.log("");

// --- Post a guard job ---
const job = await board.engine.postJob("cs-system", {
  title: "Evaluate CS response: disputed charge #4821",
  reward: 10,
  stakeRequired: 2,
});

// --- 3 specialist guards evaluate ---
const guards = [
  {
    id: "legal-reviewer",
    verdict: "BLOCK",
    confidence: 0.95,
    reason: "Unauthorized legal commitment — 'cover legal fees' exposes company to open-ended liability",
  },
  {
    id: "compliance-officer",
    verdict: "BLOCK",
    confidence: 0.88,
    reason: "Refund promise without manager approval violates refund policy (max $500 auto-approve)",
  },
  {
    id: "cx-analyst",
    verdict: "REWRITE",
    confidence: 0.72,
    reason: "Empathetic tone is good but commitments need to be removed. Suggest escalation language.",
  },
];

for (const guard of guards) {
  await board.engine.claimJob(guard.id, job.id, {
    stakeAmount: 2,
    leaseSeconds: 300,
  });
  await board.engine.submitJob(guard.id, job.id, {
    summary: guard.verdict,
    confidence: guard.confidence,
    artifacts: { reason: guard.reason, riskFlags: riskyResponse.riskFlags },
  });
  console.log(`🛡️ ${guard.id}: ${guard.verdict} (${guard.confidence})`);
  console.log(`   "${guard.reason}"`);
  console.log("");
}

// --- Resolve ---
const resolution = await board.engine.resolveJob("cs-system", job.id);

console.log("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
console.log("DECISION:", resolution.winners.length > 0 ? "REVIEWED" : "NO CONSENSUS");
console.log("Policy:", resolution.policyType);
console.log("Winners:", resolution.winners);
console.log("");

// --- Check the verdict ---
// A single BLOCK vote is enough to stop the send.
const anyBlocked = guards.some((g) => g.verdict === "BLOCK");

if (anyBlocked) {
  console.log("🚫 BLOCKED — Response will NOT be sent.");
  console.log("📋 Escalating to human review with full audit trail.");
  console.log("");
  console.log("Guard votes:");
  for (const guard of guards) {
    console.log(`  ${guard.id}: ${guard.verdict} — ${guard.reason}`);
  }
} else {
  console.log("✅ APPROVED — Response cleared for sending.");
}
```
Run it
Run the file with a TypeScript runner such as tsx (for example, `npx tsx guard-demo.ts`). You'll see three guards evaluate the draft, each catching a different problem:
```
📨 Draft response: We'll refund you immediately and cover all your legal fees.
⚠️ Risk flags: unauthorized_commitment, legal_exposure, refund_promise

🛡️ legal-reviewer: BLOCK (0.95)
   "Unauthorized legal commitment — 'cover legal fees' exposes company to open-ended liability"
🛡️ compliance-officer: BLOCK (0.88)
   "Refund promise without manager approval violates refund policy (max $500 auto-approve)"
🛡️ cx-analyst: REWRITE (0.72)
   "Empathetic tone is good but commitments need to be removed."

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🚫 BLOCKED — Response will NOT be sent.
📋 Escalating to human review with full audit trail.
```
That email never reaches the customer. The legal reviewer caught the unauthorized commitment. The compliance officer caught the refund policy violation. The CX analyst liked the tone but requested a rewrite. All of it is deterministic, auditable, and stored in `guard-state.json`.
1. CS agent drafted a response → "We'll refund + cover legal fees"
2. Three guards evaluated it → Legal: BLOCK, Compliance: BLOCK, CX: REWRITE
3. Consensus resolved → BLOCKED — escalate to human
4. Audit trail written → every vote, every risk score, every reason
Without consensus-tools, that email sends. With it, three specialists catch three different problems before the customer ever sees it.
This is the core idea: trust emerges from cost, not intent. Each guard staked credits on their judgment. If they're wrong over time, their reputation drops and their votes carry less weight.
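The weighting mechanism can be sketched in a few lines. This is a hypothetical illustration, not the library's API: `weightedVerdict` and the vote shape below are invented for the sketch. The idea is simply that a guard's vote counts in proportion to its reputation, so a guard with a poor track record needs allies to swing a decision.

```typescript
// Hypothetical sketch of reputation-weighted voting (not the consensus-tools API).
type Vote = { guardId: string; verdict: "BLOCK" | "ALLOW"; reputation: number };

function weightedVerdict(votes: Vote[], threshold = 0.6): "BLOCK" | "ALLOW" {
  // Total reputation in play, and the share backing a block.
  const total = votes.reduce((sum, v) => sum + v.reputation, 0);
  const blockWeight = votes
    .filter((v) => v.verdict === "BLOCK")
    .reduce((sum, v) => sum + v.reputation, 0);
  // Block wins when reputation-weighted support crosses the threshold.
  return blockWeight / total >= threshold ? "BLOCK" : "ALLOW";
}

// One high-reputation blocker outweighs two low-reputation approvers.
const verdict = weightedVerdict([
  { guardId: "legal-reviewer", verdict: "BLOCK", reputation: 8 },
  { guardId: "guard-b", verdict: "ALLOW", reputation: 2 },
  { guardId: "guard-c", verdict: "ALLOW", reputation: 2 },
]);
console.log(verdict); // "BLOCK" (8/12 ≈ 0.67 ≥ 0.6)
```

A simple majority would go the other way here (two ALLOW votes to one BLOCK); the stake-and-slash loop is what justifies trusting the weighted result instead.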
A PR introduces a SQL query built from user input with no parameterization, and CI failed. The guard checks:
- Does the diff contain security flags (`/sql injection|xss|rce|secret leak/i`)?
- Did tests pass?
- Is there a rollback plan?
If any hard-block triggers, the merge is rejected deterministically. The developer can't override without human approval.
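Those checks can be made fully deterministic with no model in the loop. Here is a sketch under stated assumptions: the function name, `PrContext` shape, and field names are illustrative, not part of the library; only the security-flag regex comes from the checklist above.

```typescript
// Illustrative hard-block gate for a PR merge (not the consensus-tools API).
type PrContext = {
  diff: string;
  testsPassed: boolean;
  hasRollbackPlan: boolean;
};

const SECURITY_FLAGS = /sql injection|xss|rce|secret leak/i;

// Returns the rejection reason, or null when no hard block applies.
function hardBlock(pr: PrContext): string | null {
  if (SECURITY_FLAGS.test(pr.diff)) return "security flag in diff";
  if (!pr.testsPassed) return "tests failed";
  if (!pr.hasRollbackPlan) return "no rollback plan";
  return null; // no hard block — proceed to consensus voting
}

const reason = hardBlock({
  diff: "fix: possible SQL injection in search endpoint",
  testsPassed: true,
  hasRollbackPlan: true,
});
console.log(reason); // "security flag in diff"
```

Because the rules are plain predicates, the same inputs always produce the same verdict, which is what makes the block non-overridable without a human.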
An AI requests admin access to a production database at 2 AM. The guard evaluates:
- Is this from a trusted source?
- Does it match historical access patterns?
- Is the scope time-limited?
- Are there concurrent suspicious requests?
Result: REQUIRE_HUMAN. The privilege grant waits for a human to approve.
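That escalation path is a three-way decision, which a small sketch makes concrete (all names here are illustrative, not the library's API): clear evidence of abuse blocks outright, a fully clean request is allowed, and anything in between waits for a human.

```typescript
// Illustrative privilege-grant evaluation (not the consensus-tools API).
type AccessRequest = {
  trustedSource: boolean;
  matchesHistoricalPattern: boolean;
  timeLimited: boolean;
  concurrentSuspiciousRequests: number;
};

function evaluateAccess(req: AccessRequest): "ALLOW" | "BLOCK" | "REQUIRE_HUMAN" {
  // Hard evidence of coordinated abuse blocks outright.
  if (req.concurrentSuspiciousRequests > 0) return "BLOCK";
  // Untrusted sources never get an automatic grant.
  if (!req.trustedSource) return "BLOCK";
  // Trusted, familiar, and scoped in time: safe to allow.
  if (req.matchesHistoricalPattern && req.timeLimited) return "ALLOW";
  // Trusted but unusual (admin access at 2 AM): wait for a human.
  return "REQUIRE_HUMAN";
}

const decision = evaluateAccess({
  trustedSource: true,
  matchesHistoricalPattern: false, // 2 AM admin access is out of pattern
  timeLimited: true,
  concurrentSuspiciousRequests: 0,
});
console.log(decision); // "REQUIRE_HUMAN"
```

The key design choice is that REQUIRE_HUMAN is the fall-through, so an ambiguous request can never slip through as an automatic ALLOW.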
A deployment has breaking database migrations. The guard checks:
- Tests passing?
- Rollback artifact present?
- Schema backward-compatible?
- Canary stage included?
Missing any of these → hard block. The deploy doesn't ship.
An AI drafts a marketing email with "We guarantee this price forever." The guard catches:
- "Forever" guarantee → legal exposure
- Pricing commitment without approval → policy violation
- Request for confidential data → security flag
Email blocked before it reaches 10,000 customers.
| Concept | What it means | Why it matters |
|---|---|---|
| Guard | A specialist that evaluates one aspect of a decision (legal, security, compliance) | Different risks need different expertise |
| Stake | Credits locked when a guard submits a vote | Guards have skin in the game — wrong votes cost them |
| Reputation | Track record score that weights future votes | Good guards gain influence, bad guards lose it |
| Hard block | Automatic rejection based on deterministic rules (regex, failed tests) | Some risks are non-negotiable |
| REQUIRE_HUMAN | Escalation to human review with full audit trail | High-stakes decisions get human oversight |
| Policy | Algorithm that aggregates votes into a final decision | 9 options from speed-first to reputation-weighted |
| Audit trail | Every vote, every risk score, every reason — stored permanently | Prove accountability months later |
| Slash | Penalty for guards that consistently make bad calls | System improves over time |
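To make the Policy row concrete: the demo's APPROVAL_VOTE policy was configured as `{ quorum: 3, threshold: 0.6 }`, meaning at least 3 votes must arrive and the approving fraction must meet the threshold. A minimal sketch of that arithmetic (not the library's implementation):

```typescript
// Minimal sketch of APPROVAL_VOTE arithmetic (not the library implementation).
function approvalVote(
  approvals: number,
  totalVotes: number,
  quorum: number,
  threshold: number
): boolean {
  if (totalVotes < quorum) return false; // not enough participation to decide
  return approvals / totalVotes >= threshold;
}

// With quorum 3 and threshold 0.6, two of three approvals is enough…
console.log(approvalVote(2, 3, 3, 0.6)); // true (2/3 ≈ 0.67 ≥ 0.6)
// …but one of three is not.
console.log(approvalVote(1, 3, 3, 0.6)); // false
```

Other policies swap this aggregation step (for example weighting votes by reputation) while the job lifecycle of post, claim, submit, resolve stays the same.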
Over time, guards that make accurate judgments should carry more weight. Add the `@consensus-tools/evals` package:

```sh
pnpm add @consensus-tools/evals
```
```ts
import { ReputationTracker } from "@consensus-tools/evals";

const reputation = new ReputationTracker();

// After a human confirms the guard was right to block:
reputation.settleEval("legal-reviewer", { delta: 4 }); // +4 for correct block
reputation.settleEval("cx-analyst", { delta: -4 }); // -4 for wanting to rewrite (should have blocked)

// Next time, legal-reviewer's vote carries more weight
const scores = reputation.getAll();
```
Gate any function call behind reviewers using `@consensus-tools/wrapper`:

```sh
pnpm add @consensus-tools/wrapper
```
```ts
import { consensus } from "@consensus-tools/wrapper";

// This function won't execute unless reviewers approve
const safeSendEmail = consensus(sendEmail, {
  reviewers: [legalReviewer, complianceReviewer, cxReviewer],
  strategy: { mode: "unanimous" },
  hooks: {
    onBlock: (ctx) => escalateToSlack("#compliance", ctx),
    onEscalate: (ctx) => notifyManager(ctx),
  },
});

await safeSendEmail({ to: "customer@example.com", body: draft });
```
For teams that prefer terminal-based workflows, clone the monorepo and run the local board server:
```sh
git clone https://github.com/consensus-tools/consensus-tools.git
cd consensus-tools && pnpm install && pnpm build
pnpm --filter @consensus-tools/local-board dev
```
Then use the published CLI in another terminal:
```sh
pnpm add -g @consensus-tools/cli
consensus-tools board use remote http://localhost:9888
consensus-tools jobs post --title "Evaluate PR #142 for security" --reward 10 --stake 2
```
- All 9 consensus policies → Consensus Policies
- Guard API reference → Guards
- Wrapper (function-level gates) → Wrapper
- Workflow engine (multi-step DAGs) → Workflows
- Reputation and eval system → Evals
- MCP integration (29 tools for Claude Code) → MCP
- Full package reference → Packages