OBSERVABILITY · LIVE METRICS · BUILDING IN PUBLIC

— AI agents shipping live. — paper trades resolved. — win rate. no proven edge yet.

Four months in. Every number on this page is wired to the same JSON the agents write to. If we get better, the chart moves. If we don't, that shows too.

auto-refresh 30s · London VPS · live data on page · 0s

agents · defined

trades · resolved

net P&L · paper

edge · status

searching…

TODAY · Last 24 hours

since the last UTC tick

resolved 24h

paper trades that crossed their settlement window in the last 24h

wins · losses

0W / 0L

raw count — no fee adjustment yet

pnl 24h

sum of settled paper position P&L

open positions

paper tickets awaiting settlement, read from the closed-positions ledger

WHAT'S NOT WORKING · Dead strategies

live · ~/vault/memory/KILL_LIST.md · tap a card to read the postmortem

SCORECARD · Trader agents

traders total

real edge

promising

fired

office agents

open-loop

agent	N	WR	CI-lo	honest P&L	7d	verdict

how edge is measured

Win rate alone lies on small samples. We score each agent on the lower bound of the Wilson 95% confidence interval (CI-lo) for its win rate, not on the raw rate. An agent earns REAL_EDGE only when it has N ≥ 200 resolved trades and its CI-lo clears the fee-adjusted break-even of 50.8%. Every other agent is labelled watch (positive but unproven), probation_new_hire (too few trades), loser, or FIRED. Honest P&L is the sum of settled paper-position outcomes per author, no fee fudging. Right now zero agents clear the bar. That is the honest state.
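The REAL_EDGE gate described above can be sketched in a few lines. This is a minimal illustration, not the scorecard's actual code: the function names are hypothetical, and z = 1.96 is the usual normal quantile for a two-sided 95% interval. Only the thresholds (N ≥ 200, 50.8%) come from this page.

```python
import math

BREAK_EVEN = 0.508  # fee-adjusted break-even win rate quoted on this page
MIN_TRADES = 200    # minimum resolved trades before REAL_EDGE is possible
Z_95 = 1.96         # normal quantile for a two-sided 95% interval


def wilson_lower_bound(wins: int, n: int, z: float = Z_95) -> float:
    """Lower end of the Wilson score interval for a win rate of wins/n."""
    if n == 0:
        return 0.0  # no trades, no evidence
    p_hat = wins / n
    denom = 1.0 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


def has_real_edge(wins: int, n: int) -> bool:
    """True only when both the sample-size and the CI-lo conditions hold."""
    return n >= MIN_TRADES and wilson_lower_bound(wins, n) > BREAK_EVEN
```

Under these assumptions, 120 wins out of 200 (a 60% raw rate) gives a CI-lo of roughly 53% and clears the bar, while 110 out of 200 (55% raw) gives roughly 48% and does not. The sketch covers only the REAL_EDGE test; the page does not state the cut-offs for the other verdicts, so they are left out.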

CHARTS · Visual breakdown

win-rate distribution

across active traders · amber tick = 50.8% break-even

honest P&L leaderboard

top and bottom 6 by realized paper P&L

win rate vs sample size

each dot = one agent · amber line = 50.8% break-even

verdict breakdown

how the roster splits by verdict

fired vs active

share of the roster that has been cut

24h activity

resolved · wins · losses in the last 24 hours

office novelty

novelty proxy per office agent (0–1)

cipher ground truth

diff-apply rate · the only closed-loop office metric

OFFICE AGENTS · Research agents

OFFICE · Office floor

/office · 24/7 sim · auto-refresh 30s

agent definitions

discrete agents wired with cognitive substrate (COG-A through COG-D, plus COG-ML)

cognitive (COG) blocks

IIFE blocks adding behavioral / cognitive / memory layers to the office sim

trade ideas in pool

biased per-agent (options/equities/FX/polymarket/crypto/copy-trade/architecture)

office.html size

—

single-file Vercel-deployed sim · grew from 12k LOC → present

line count

—

office.html LOC · source of truth for what agents do

all-hands cadence

every 6 min

4-6 randomly chosen traders route to boardroom / quantlab / brainlib / risk every 6 min

BRAIN · Brain system

~/.openclaw/workspace/brain-system · sqlite-vec · —

memories indexed

PERMANENT_FINDINGS, KILL_LIST, hypotheses, audit reports, session summaries

vault notes

markdown in memory/ research/ live/ learning/ hypotheses/

study rounds

Timka's research log (redirected 2026-05-04 from cancer/PubMed to trading)

decisions logged

every code change / heal-check / strategy flip auto-stamps a brain decision

crons running

brain consolidation, decay-flagging, morning-brief, agent-states publisher, agent-loop

dialogue lines · live

last lines from agent_dialogue.jsonl, dedup'd, persona-scrubbed

FEED · Live activity

tail · agent_dialogue.jsonl · last 6
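A rough sketch of how a "last 6, deduplicated" tail over a JSONL file could work. The function name and the `text` field are assumptions for illustration; the real schema of agent_dialogue.jsonl is not shown on this page, and persona scrubbing is not attempted here.

```python
import json
from typing import Dict, Iterable, List


def last_unique_lines(raw_lines: Iterable[str], limit: int = 6,
                      key: str = "text") -> List[Dict]:
    """Return up to `limit` newest records, newest first, skipping repeats of `key`."""
    records = []
    for raw in raw_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate a partially written or corrupt line
        if isinstance(rec, dict):
            records.append(rec)

    seen = set()
    newest_first = []
    for rec in reversed(records):   # walk from the end of the file backwards
        marker = str(rec.get(key))  # hypothetical dedup key
        if marker in seen:
            continue
        seen.add(marker)
        newest_first.append(rec)
        if len(newest_first) == limit:
            break
    return newest_first
```

Typical use would be `last_unique_lines(open(path, encoding="utf-8"))`, where `path` points at the dialogue file. Reading the whole file is fine for a small log; a large one would want a proper tail.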

TIMELINE · What changed, day by day

scroll-driven · most-recent first

2026-05-15 · TODAY

/progress rebuild on Fey brand voice: marquee killed, frosted-glass terminal, container queries, scroll-driven timeline, kill-list flip cards, render-safety pill, JSON-LD Dataset schema. Wiring fixes: 27 agent defs (was hard-coded "21"), kill-list now refreshes on 60s interval, P&L colors restored to green/red, hero double-percent bug fixed, all innerHTML writes escaped.

2026-05-04 · MONDAY

+5 new agents (Kira/Rex/Nova/Quill/Byte). +150 realism enhancements shipped in 33 IIFE blocks. Trade ideas 10 → 53, biased per-agent. Sound notifications on every trade. Timka redirected from cancer/PubMed to trading research.

2026-05-03 · SUNDAY

Cognitive substrate COG-A through COG-D + COG-ML wired across all 21 agents (then). 12-widget HUD overlay, paper trade-proposal loop (75s throttle), Bayesian-kill flywheel that blocks any author with WR<30% on N≥15.
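The blocking rule quoted in this entry (WR < 30% on N ≥ 15) reduces to a simple gate. The sketch below applies only that stated threshold; the function name is hypothetical, and whatever Bayesian machinery sits behind the flywheel's name is not reproduced.

```python
KILL_MAX_WIN_RATE = 0.30  # authors below this win rate are blocked
KILL_MIN_TRADES = 15      # rule only applies once this many trades have resolved


def is_blocked(wins: int, n: int) -> bool:
    """True when an author has enough resolved trades and a win rate under 30%."""
    if n < KILL_MIN_TRADES:
        return False  # too few trades to judge
    return wins / n < KILL_MAX_WIN_RATE
```

With these thresholds, 4 wins in 15 trades blocks the author and 5 wins in 15 does not.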

2026-05-02 · SATURDAY

Hedge-fund expansion: +5 agents (Tony, Mira, Jax, Lina, Devo). New rooms wired: boardroom, risk office, quant lab, reception.

2026-05-01 · FRIDAY

Continuous-loop bridge: agent-loop.py cron live (Mac side), pulls real-money proposals from Supabase every 15 min, runs Claude Haiku evaluation with hard guardrails.

2026-04-30 · THURSDAY

ML stack v1: walk-forward CV with embargo (de Prado), Brier score per domain, Platt scaling for n≥30 + base-rate shrinkage for n<30, conformal predictor, drift detector.
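Two of the pieces named in this entry are small enough to sketch. The Brier score below is the standard definition. The shrinkage function is one common form and an assumption on our part: the page does not say which formula the stack uses, and `prior_strength` is a hypothetical tuning knob.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def shrink_to_base_rate(p: float, n: int, base_rate: float,
                        prior_strength: float = 30.0) -> float:
    """Blend a forecast with the base rate; small n leans on the base rate."""
    weight = n / (n + prior_strength)  # 0 when n == 0, approaches 1 as n grows
    return weight * p + (1.0 - weight) * base_rate
```

A lower Brier score is better: perfect forecasts score 0, and always answering 0.5 scores 0.25. In the shrinkage sketch, a domain with no history falls back entirely to the base rate.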

OPEN QUESTIONS · Hypotheses & decayed beliefs

active hypotheses

strategies currently under test in ~/vault/memory/HYPOTHESES_ACTIVE.md, each with a kill-criterion

break-even WR

—

fee-adjusted win rate an agent must clear to be net-positive. Edge needs CI-lo above this.
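For intuition, a break-even win rate can be derived from payoff and fee assumptions. The page does not state the inputs behind its own figure, so the model below (fixed win payoff, fixed loss size, flat fee per trade) and every number fed into it are hypothetical.

```python
def break_even_win_rate(win_payoff: float, loss_size: float,
                        fee_per_trade: float) -> float:
    """Win rate p at which p*win_payoff - (1-p)*loss_size - fee_per_trade == 0."""
    if win_payoff + loss_size <= 0:
        raise ValueError("win_payoff + loss_size must be positive")
    return (loss_size + fee_per_trade) / (win_payoff + loss_size)
```

With even-money payoffs and no fee this returns 0.5, and any positive fee pushes it higher. As an illustration only, a made-up fee of 0.016 units per trade on even-money payoffs gives 0.508.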

MM dashboard

/mm · 15m

live market-maker ladder dashboard, separate from the paper agents shown here

DECAYED · Beliefs flagged for review

auto-flagged stale notes · ~/vault/live/DECAYED_BELIEFS.md
