NEO-LR-v0.1 | Generated: | Trained: | Source N:
Type:
Last trained:
Age:
Max iters:
CURRENT HONEST VERDICT

Trained on 550 real trades. The strategy wins about 88.7% of the time and is profitable, but the model only scores 89.1% test accuracy, barely above the ~88.7% you'd get by guessing "win" every time.

89.1% test accuracy (vs ~88.7% base rate) · AUC 0.654 (0.5 = coin flip at picking winners from losers) · Not a live signal: the model has never traded a dollar

How to read this dashboard

This is a real machine-learning model trained on 550 real options trades — my own actual theta-selling record, pulled straight from Schwab Gain/Loss. The strategy is genuinely profitable: it wins about 88.7% of the time. But here is the honest catch, and it's the whole point of this page: because the strategy wins so often, a model that blindly guesses "win" every single time already scores ~88.7%. My model's 89.1% test accuracy barely clears that bar.

So read these two numbers side by side (N=…): real data, a real profitable strategy — but a model that hasn't yet proven it can tell winners from losers any better than the base rate. The number that actually matters is AUC (0.654), and it says the model is only modestly good at spotting the rare, large losses. Nothing here is dressed up; where the model can't honestly show something, you'll see a labelled ghost placeholder, not a fabricated chart.

Legend: Bullish weight / profit · Bearish weight / loss / unreliable · Early / experimental · Dead / no signal · Data leakage
How to read these numbers honestly

The trap is the base rate. This strategy wins ~88.7% of the time, so a do-nothing model that just predicts "win" on every trade is already right ~88.7% of the time. My model's 89.1% test accuracy sits almost exactly on top of that line — accuracy here is not evidence of skill. The honest yardstick is AUC = 0.654: a 0.5 would mean it can't tell winning trades from losing ones at all, and 1.0 would be perfect. At 0.654 the model is modestly better than chance at the one job that matters — flagging the rare, large losses before they happen — and its 95% interval still brushes the coin-flip line. Real data, real profitable strategy, model not yet proven. Treat it as a lab notebook, not a signal.
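To make the base-rate trap concrete, here is a minimal sketch in plain Python. The labels are a toy 9-win / 1-loss sample, not my trade data, and the helper names `accuracy` and `auc` are illustrative rather than taken from the actual pipeline. It shows that an always-"win" predictor scores the base rate on accuracy while its AUC sits at exactly 0.5.

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the true labels."""
    return sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)


def auc(labels, scores):
    """Rank-based AUC: the chance that a random winner gets a higher
    score than a random loser, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one winner and one loser")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy sample (illustrative only): 9 winning trades, 1 losing trade.
labels = [1] * 9 + [0]

# A do-nothing model: predicts "win" with the same score on every trade.
always_win_preds = [1] * 10
always_win_scores = [1.0] * 10

baseline_acc = accuracy(labels, always_win_preds)  # equals the base rate
baseline_auc = auc(labels, always_win_scores)      # all ties, so no ranking skill

# A model that ranks the single loser below every winner.
perfect_auc = auc(labels, [0.9] * 9 + [0.1])
```

High accuracy with an AUC near 0.5 is the signature of a model that has only learned the base rate, which is why the dashboard leads with AUC.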

Realised P&L — the real track record

Cumulative realised P&L (every closed trade, ordered by entry)
Per-trade P&L distribution — the fat left tail

Data volume

Total trades
550 / 50
Sample size is not the bottleneck anymore.
Train / Test split
440 train / 110 test — plenty of rows per feature, so train accuracy is meaningful, not memorized.
Is it overfitting?
No. With 440 training rows and only 5 live features, the model has far more data than parameters — the opposite of the tiny-sample regime where logistic regression just memorizes. Train accuracy (88.9%) and test accuracy (89.1%) sit right on top of each other, which is what a non-overfit fit looks like.
Train accuracy
88.9% (N_train = 440), within a point of test accuracy, so no memorization
Test accuracy
Wilson 95% CI (N=110)
Compare to the ~88.7% always-win base rate — the margin is tiny.
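As a rough check on how tiny that margin is, the Wilson interval can be computed by hand. This is a sketch under one assumption: 89.1% of 110 test trades rounds to 98 correct predictions, a count the page does not state directly. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


n_test = 110
correct = round(0.891 * n_test)  # assumption: 89.1% of 110 is about 98 correct
low, high = wilson_interval(correct, n_test)

base_rate = 0.887  # always-win baseline quoted on this page
base_rate_inside = low < base_rate < high
```

Under that assumption the interval runs from roughly 82% to 94%, so the ~88.7% base rate falls well inside it: on 110 test trades, the accuracy edge cannot be distinguished from always guessing "win".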
Training freshness
Timeline: trained → now (today)

Model fit indicators

Test AUC   THE NUMBER THAT MATTERS
0.5 = no skill, 1.0 = perfect. At 0.654 the model is only modestly good at the thing accuracy can't measure: telling winners from losers.
Strategy win rate (real Schwab)
This is the strategy's edge — and the base rate the model must beat.

Feature importance

Signed weights: red = bearish, green = bullish, hatched = leakage
Bias term: . Bars are scaled to the largest-magnitude weight. Logistic weights on standardized features — direction and relative pull, not causation.
Dead features — zero weight OR std≈0 (no signal)
What "dead" means
Dead = |weight| < 1e-6 OR std ≤ 1e-7 (constant across all training rows). These features contributed nothing because the Schwab Gain/Loss export doesn't include market context: VIX, IV-rank, deltas and stock prices are all blank. They'll come alive once I join real market feeds to the fills.
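The dead-feature rule above can be sketched in a few lines. The two thresholds come from the definition on this page. The feature names, values and weights below are hypothetical placeholders, and `dead_features` is an illustrative helper, not the pipeline's actual function.

```python
import statistics

WEIGHT_EPS = 1e-6  # |weight| below this counts as zero
STD_EPS = 1e-7     # std at or below this counts as constant


def dead_features(columns, weights):
    """Return {feature: reason} for features that carry no signal.

    columns: dict of feature name -> list of training values
    weights: dict of feature name -> fitted logistic weight
    """
    dead = {}
    for name, values in columns.items():
        std = statistics.pstdev(values)
        weight = weights.get(name, 0.0)
        if std <= STD_EPS:
            dead[name] = "constant"
        elif abs(weight) < WEIGHT_EPS:
            dead[name] = "zero weight"
    return dead


# Hypothetical example: market-context columns are blank (all zeros).
columns = {
    "dte": [7.0, 14.0, 30.0, 45.0],
    "vix": [0.0, 0.0, 0.0, 0.0],
    "iv_rank": [0.0, 0.0, 0.0, 0.0],
}
weights = {"dte": -0.42, "vix": 0.0, "iv_rank": 0.0}

result = dead_features(columns, weights)
```

Checking the absolute value matters because the weights are signed: a strongly negative weight is a live feature, not a dead one.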

Baselines — what is the model actually beating?

Accuracy vs trivial strategies

Win rate by segment

By option right — call vs put
By days-to-expiry bucket

Calibration plot

Predicted probability vs actual outcome
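A calibration plot of this kind is usually built by bucketing predictions and comparing each bucket's mean predicted probability with its observed win rate. The sketch below assumes equal-width bins and uses toy numbers. The binning scheme and the `calibration_bins` helper are assumptions, not necessarily what this dashboard does.

```python
def calibration_bins(probs, outcomes, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (bin_index, count, mean_predicted, observed_win_rate)
    tuple per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        win_rate = sum(y for _, y in members) / len(members)
        rows.append((i, len(members), mean_p, win_rate))
    return rows


# Toy predictions (illustrative only): 1 = win, 0 = loss.
probs = [0.95, 0.90, 0.85, 0.55, 0.50]
outcomes = [1, 1, 1, 1, 0]

rows = calibration_bins(probs, outcomes)
```

A well-calibrated model has mean predicted probability close to the observed win rate in every bin, which is what "on the diagonal" means. With only a handful of losses per bin, each point carries wide error bars.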

Data quality, pipeline & tests

Data quality
Pipeline
Automated guards

Feature drift & honest backtest

Feature drift (PSI)
Backtest — gating policy vs always-trade
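For the feature-drift panel, the Population Stability Index compares how a feature's values spread across bins at training time with how they spread in recent data: PSI = Σ (a_i − e_i) · ln(a_i / e_i), where e_i and a_i are the training and recent fractions in bin i. The sketch below is a minimal version. The bin edges, the sample values and the small `eps` floor for empty bins are assumptions for illustration.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    edges: sorted cut points; values fall into len(edges) + 1 bins.
    eps:   floor on bin fractions so empty bins do not hit log(0).
    """
    def fractions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            idx = sum(v > e for e in edges)  # number of edges below v
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e = fractions(expected)
    a = fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Hypothetical days-to-expiry samples (not real fills).
train_dte = [7, 7, 14, 14, 30, 30, 45, 45]
recent_dte = [7, 7, 7, 7, 7, 14, 14, 30]
edges = [10, 21, 40]

no_drift = psi(train_dte, train_dte, edges)
drift = psi(train_dte, recent_dte, edges)
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Whether this dashboard uses those cut-offs is not stated here.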

Pre-registered kill-criteria

The bar this model has to clear — set in advance, not after the fact
The real target isn't sample size (already past it) — it's out-of-sample AUC. Progress toward the 0.70 line:
AUC 0.654 / 0.70 target
What would flip this from "not a signal" to a signal — each bar checked live against metrics.json:

Model changelog

Every retrain, appended — date · N · AUC · accuracy
Date (UTC) · N · AUC · Accuracy · Note

The road to a real model

Next milestone strip
From a simple base-rate model to one with real, measurable skill.
Done
550 real trades ingested
Schwab Gain/Loss history loaded and leakage-checked. Past the 50-trade minimum.
Now
Beat the base rate, not just match it
Accuracy ≈ the 88.7% always-win baseline. The real job: push AUC (now 0.654) above 0.70 out-of-sample.
Next
Wake the dead inputs
Join real VIX / IV-rank / delta feeds to the fills so the model has market context to learn from.
Goal
Out-of-sample, live Schwab
Walk-forward validation on future fills, calibration on the diagonal, AUC that holds up.
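Walk-forward validation means always training on earlier trades and testing on later ones, then rolling the boundary forward. Here is a minimal sketch of an expanding-window splitter, assuming trades are already sorted by entry time. The 440-row starting window mirrors the current train size, while the 55-row test block and the `walk_forward_splits` name are arbitrary choices for illustration.

```python
def walk_forward_splits(n_rows, initial_train, test_size):
    """Yield (train_indices, test_indices) pairs for an expanding window.

    Every test block lies strictly after its training rows, so the
    model is never scored on trades older than the ones it learned from.
    """
    start = initial_train
    while start + test_size <= n_rows:
        yield range(0, start), range(start, start + test_size)
        start += test_size


# 550 trades, first 440 as the initial training window, 55-trade test blocks.
splits = list(walk_forward_splits(550, 440, 55))

for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  ->  test {test_idx[0]}..{test_idx[-1]}")
```

With these numbers the generator yields two folds. As future fills arrive, `n_rows` grows and new test blocks appear without reshuffling any past trade into a later training set.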