Blind Eval Trio
Three cross-lab agents evaluate any decision blind: steelman defends, stress_test attacks, gap_finder finds what's missing. No synthesizer — you integrate.
5 nodes · Free & source-available
Pre-commitment self-evaluation for agent runtimes. Three cross-lab agents (OpenAI, Anthropic, Zhipu) give independent evaluations of any plan or method — without seeing each other's output.
Why this works
Models cannot reliably self-evaluate. Asking the same model to critique its own plan reproduces the original blind spots. The structural fix is cross-lab blind evaluation: three different model labs (different RLHF priors, different training distributions) playing structured adversarial roles, returning three independent perspectives that the calling agent integrates.
Architecture
chatInput
│
├── steelmanAgent (OpenAI gpt-5-nano + harness_reasoning)
├── stresstestAgent (Anthropic claude-4 + harness_anti_deception)
└── gapfinderAgent (Zhipu GLM-4.7 + harness_memory)
│
▼
setFields → { steelman, stress_test, gap_finder, usage_note }
Three agents run in parallel. Each is locked to one role and one Ejentum cognitive harness. No synthesizer agent — the three evaluations are returned raw. The integration tension between voices IS the value.
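The parallel, blind fan-out described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Heym implementation: `call_model` is a hypothetical stub standing in for the three provider calls, and `blind_eval` is an invented name. Only the output keys and the usage note come from the template itself.

```python
import asyncio

# Output keys as they appear in the template's setFields node.
ROLES = ("steelman", "stress_test", "gap_finder")

USAGE_NOTE = (
    "Three independent evaluations, no synthesis. "
    "Integrate into your decision; do not score-and-aggregate."
)


async def call_model(role: str, plan: str) -> str:
    """Hypothetical stub for one agent call; a real version would call a provider API."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"[{role}] evaluation of: {plan}"


async def blind_eval(plan: str) -> dict:
    # Each agent receives only the submitted plan, never another agent's output.
    results = await asyncio.gather(*(call_model(role, plan) for role in ROLES))
    out = dict(zip(ROLES, results))
    out["usage_note"] = USAGE_NOTE  # returned raw, with no synthesis step
    return out


if __name__ == "__main__":
    print(asyncio.run(blind_eval("Migrate the billing service to event sourcing")))
```

Because `asyncio.gather` returns results in the order the calls were passed in, zipping them back onto `ROLES` keeps each evaluation under the right key even when the calls finish in a different order.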
Roles
- steelmanAgent — builds the strongest case FOR the submitted method. Pure advocacy, zero smuggled critique.
- stresstestAgent — finds where the method BREAKS. Failure modes with severity tags, concrete breaking scenarios. Loaded with the Chaos Engineering skill.
- gapfinderAgent — finds what's MISSING: absent steps and unconsidered alternatives. It also names three deeper implicit assumptions.
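One way to picture the role lock is as a fixed piece of configuration per agent. The sketch below is illustrative only: the `Role` dataclass, the `build_prompt` helper, and the one-line briefs are assumptions written for this example, not the prompts shipped in the template. The agent names, labs, and harness names are taken from the architecture diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str     # agent node name from the architecture diagram
    lab: str      # model lab behind the agent
    harness: str  # Ejentum harness named in the diagram
    brief: str    # illustrative instruction, NOT the template's real prompt


ROLES = [
    Role("steelmanAgent", "OpenAI", "harness_reasoning",
         "Build the strongest case FOR the method. No critique."),
    Role("stresstestAgent", "Anthropic", "harness_anti_deception",
         "Find where the method breaks. Tag each failure mode with a severity."),
    Role("gapfinderAgent", "Zhipu", "harness_memory",
         "List missing steps and alternatives, and name three implicit assumptions."),
]


def build_prompt(role: Role, plan: str) -> str:
    """Compose the only text an agent sees: its own brief plus the submitted plan."""
    return f"{role.brief}\n\nPlan under evaluation:\n{plan}"
```

Freezing the dataclass mirrors the idea that a role cannot drift at runtime: each agent always gets the same brief, and `build_prompt` has no parameter through which another agent's output could leak in.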
Setup
- Get an Ejentum API key at ejentum.com and set it in each agent's MCP env field
- Add your OpenAI and Anthropic credentials to the agent nodes
- Submit any plan, method, or decision as the input text
Output
{
  "steelman": "...",
  "stress_test": "...",
  "gap_finder": "...",
  "usage_note": "Three independent evaluations, no synthesis. Integrate into your decision; do not score-and-aggregate."
}
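A calling agent might consume this output along the following lines. This is a hedged sketch: `load_evaluations` and `render_for_review` are hypothetical helper names, and the plain-text layout is just one choice. In keeping with the usage note, the code checks that all three evaluations are present and lays them out one after another without scoring or merging them.

```python
import json

REQUIRED_KEYS = ("steelman", "stress_test", "gap_finder")


def load_evaluations(raw: str) -> dict:
    """Parse the workflow output and check that all three evaluations are present."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError(f"missing evaluations: {missing}")
    return data


def render_for_review(data: dict) -> str:
    """Lay the three evaluations out as labelled sections, unscored and unmerged."""
    return "\n\n".join(f"{key}:\n{data[key]}" for key in REQUIRED_KEYS)
```

Failing loudly on a missing or empty evaluation is deliberate here: a decision made on two of the three voices would quietly drop one adversarial perspective.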
Built by Ejentum · agent-teams repo
How to import this template
1. Click Import → Copy JSON on this page.
2. Open Heym and navigate to a workflow canvas.
3. Press Cmd+V / Ctrl+V; the nodes appear instantly.
4. Add your API keys in the node config panels and click Run.