NEO-LR-v0.1
A model card is the ML industry's standard one-page "nutrition label" for a model: what it is, what it's for, where it breaks, and who it can hurt. This one is honest to a fault: the model trains on real-money trades but is not yet proven to add skill. Do not trade real money on it.
Every number on this page is wired live from metrics.json, the same file the dashboard reads. If the model retrains, these numbers update themselves. Nothing here is rounded up to look better.
What it's for
- A learning exercise: building a real ML pipeline end-to-end (ingest → features → train → evaluate) in public, on real money data.
- An honesty benchmark: it watches my own theta-selling options trades and records where a model would agree or disagree — without risking a cent.
- A baseline to beat: it establishes how hard the real problem is (telling the rare losses from the many wins) before any market-feed features are added.
Out of scope
- Not a trade signal, alert service, or "buy/sell" recommendation.
- Not financial advice. Nothing on these pages is.
- Not validated on a live market — it has never placed a real order.
- Not yet proven to beat its own base rate, so not safe to act on (see §5 and §6).
What's in it
545 real closed options trades exported from a Charles Schwab Realized Gain/Loss report — every fill is real money, not a paper or synthetic trade. The set spans the 2025 full year plus 2026 year-to-date and covers many underlyings (CIEN, SNDK, LITE, MU, QQQ, ASML, CRWD and others), not a single ticker. Each closed trade becomes one labeled row: win or loss by realized P&L.
Across those trades the strategy won 483 and lost 62 for a net realized P&L of +$81,955.09. That is the strategy's real track record — and the source of the modeling challenge: a strategy that wins this often produces a heavily imbalanced dataset (88.6% wins), which makes raw accuracy a misleading score (see §5).
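To make the imbalance concrete, here is a minimal sketch using only the win and loss counts quoted above. It shows that a "model" which predicts win for every trade already scores the base rate on accuracy while catching none of the losses. The variable names are illustrative, not taken from the project's code.

```python
# Counts quoted in the text above.
wins, losses = 483, 62
total = wins + losses

# Labels for every closed trade: 1 = win, 0 = loss.
labels = [1] * wins + [0] * losses

# A trivial baseline that predicts "win" for every trade.
predictions = [1] * total

# Accuracy of the baseline equals the share of wins (the base rate).
accuracy = sum(p == y for p, y in zip(predictions, labels)) / total

# Recall on losses: how many of the losing trades the baseline flags.
losses_flagged = sum(1 for p, y in zip(predictions, labels) if y == 0 and p == 0)
loss_recall = losses_flagged / losses

print(f"trades={total} always-win accuracy={accuracy:.1%} loss recall={loss_recall:.0%}")
```

Any model has to be judged against that baseline rather than against 50%.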
SCHWAB_GL). Real fills, real P&L. The split is time-ordered (436 earliest trades to train, 109 most recent to test), never a random shuffle.
The model takes 14 numeric features per trade. They fall into two buckets:
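A time-ordered split like the one described (436 earliest trades to train, 109 most recent to test) can be sketched as follows. This is an illustration under assumptions, not the project's actual code: the `closed` field, the `trade_id` field, and the `time_ordered_split` helper are hypothetical names, and the rows are made-up stand-ins.

```python
import random
from datetime import date, timedelta

def time_ordered_split(rows, n_test, key):
    """Sort rows oldest-first and hold out the n_test most recent as the test set."""
    if not 0 < n_test < len(rows):
        raise ValueError("n_test must be between 1 and len(rows) - 1")
    ordered = sorted(rows, key=key)
    return ordered[:-n_test], ordered[-n_test:]

# Hypothetical stand-in rows: 545 trades with made-up close dates.
start = date(2025, 1, 2)
trades = [{"trade_id": i, "closed": start + timedelta(days=i)} for i in range(545)]

# Shuffle to mimic an export that is not already in date order.
random.Random(0).shuffle(trades)

train, test = time_ordered_split(trades, n_test=109, key=lambda t: t["closed"])
print(len(train), len(test))
```

Because every training trade closes before every test trade, the test score reflects predicting forward in time rather than filling gaps in the past.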
Live (non-zero weight, real variation)
Only 5 of the 14 features actually carry signal. Weights come straight from metrics.json:
- right: call vs put indicator, w = −0.363
- premium_pct: premium as a percentage, w = −0.180
- contracts: position size, w = +0.062
- dte: days to expiry, w = +0.042
- weekday: day of week, w = +0.005
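For readers new to logistic regression, the sketch below shows how weights like these turn into a win probability: the weighted feature sum plus an intercept is passed through a sigmoid. The weights are the five listed above. The intercept, the feature scaling, and the encoding of `right` are not stated on this page, so the `intercept=0.0` default and the example inputs are placeholders, and the resulting probability is illustrative only.

```python
import math

# The five live weights listed above.
WEIGHTS = {
    "right": -0.363,
    "premium_pct": -0.180,
    "contracts": 0.062,
    "dte": 0.042,
    "weekday": 0.005,
}

def win_probability(features, weights=WEIGHTS, intercept=0.0):
    """Logistic regression: sigmoid of the intercept plus the weighted feature sum."""
    z = intercept + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical, already-scaled inputs. The real scaling and the real intercept
# are assumptions here, so this number is not a prediction from the actual model.
example = {"right": 1.0, "premium_pct": 0.5, "contracts": 0.0, "dte": -1.0, "weekday": 0.0}
p = win_probability(example)
print(f"illustrative win probability: {p:.3f}")
```

A negative weight lowers the predicted win probability as its feature increases, and a weight of zero leaves the prediction unchanged whatever the feature's value.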
Dead (zero weight, no data yet)
9 of the 14 features are dead: the export carries no value for them, so they had no variation and contributed nothing. They are vix, vix_change_1h, iv_rank, delta, hour, stock_return_1d, stock_return_5d, distance_pct, and stock_price. These aren't bad features; there's just no market-feed data wired in yet to fill them. They are exactly the inputs most likely to help spot losses (see §6), which is why the model's skill is still limited.
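A feature that never varies cannot separate wins from losses, and a column that is zero in every row adds nothing to the logistic regression's weighted sum, so its weight gets no push away from zero during fitting. A minimal way to spot such columns is sketched below. The `constant_columns` helper and the sample rows are hypothetical, and the sketch assumes unfilled fields are stored as a single repeated value such as 0.0.

```python
def constant_columns(rows, feature_names):
    """Return the features whose value never changes across rows."""
    dead = []
    for name in feature_names:
        distinct_values = {row[name] for row in rows}
        if len(distinct_values) <= 1:
            dead.append(name)
    return dead

# Hypothetical rows: `dte` varies, while `vix` and `delta` were never filled in.
rows = [
    {"dte": 7, "vix": 0.0, "delta": 0.0},
    {"dte": 14, "vix": 0.0, "delta": 0.0},
    {"dte": 30, "vix": 0.0, "delta": 0.0},
]

dead = constant_columns(rows, ["dte", "vix", "delta"])
print(dead)
```

Running a check like this before training makes it obvious which inputs are waiting on a data feed rather than being uninformative in themselves.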
win, pnl, exit_reason, hold_hours, ml_score, etc.) before fitting. metrics.json reports zero leakage features in the model. Every input above is known at trade-entry time.
The takeaway is consistent across every metric: the accuracy numbers look high only because the strategy itself wins so often. Strip that away and the model has shown modest separating power (AUC 0.688) and nothing more. See the methodology for why the base rate makes accuracy misleading and AUC the real test.
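The difference between accuracy and AUC can be seen in a small sketch. AUC is computed here in its pairwise form: the probability that a randomly chosen win gets a higher score than a randomly chosen loss, with ties counted as half. The data is a made-up ten-trade test set, not the project's, and the `auc` and `accuracy` helpers are illustrative rather than the project's evaluation code.

```python
def auc(labels, scores):
    """Probability that a random win outscores a random loss (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one win and one loss")
    favourable = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return favourable / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Share of trades where the thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical test set: 9 wins, 1 loss.
labels = [1] * 9 + [0]

# A model that gives every trade the same confident score.
constant_scores = [0.9] * 10

# A model that ranks the single loss below every win.
ranking_scores = [0.9] * 9 + [0.6]

print(accuracy(labels, constant_scores), auc(labels, constant_scores))
print(accuracy(labels, ranking_scores), auc(labels, ranking_scores))
```

Both models get the same accuracy at a 0.5 threshold, but only AUC shows that the second one actually ranks the loss below the wins. On that scale, 0.5 means no separating power and 1.0 means perfect ranking.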
Has it traded?
No. The model has never placed a real order — its realized P&L is $0. The +$81,955.09 net P&L on this page is the human's realized result that produced the training labels, not the model's. Nothing here is evidence the model would have done better or worse than the human; it simply hasn't been tested with money.
- Barely beats the base rate. Accuracy is 0.890 against a 0.886 (88.6%) always-win baseline. Skill is unproven.
- 9 of 14 features are dead. No market-feed data (VIX, IV rank, delta, intraday returns) is wired in — exactly the inputs most likely to flag a losing trade. The model effectively sees only 5 dimensions.
- Heavily imbalanced. Only 62 losses out of 545 trades, so the model has few examples of the case that matters most.
- Single account. All trades come from one trader's Schwab history; nothing here is shown to transfer to other traders or styles.
- Logistic regression is simple. It only finds linear correlations — it can't capture interactions a more flexible model might.
- No calibration. Predicted probabilities aren't calibrated yet; the dashboard's calibration plot stays empty until that's assessed.
- Financial harm: options trading can lose more than the premium. A model that looks confident but isn't could induce overconfident bets. That's exactly why it's locked to shadow mode and labeled "NOT A LIVE SIGNAL" everywhere.
- Overclaiming: the whole project's premise is radical transparency. We lead with the embarrassing truth — the model barely beats guessing "win", AUC is only 0.688 — precisely so no one mistakes a profitable strategy for a proven model.
- Privacy: the training data is the model owner's own Schwab trade history. No third-party personal data is used.
- No automated execution: the model has no broker connection and cannot place orders.
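For the missing calibration check noted in the list above, one common approach is a reliability table: group predictions into probability bins and compare the mean predicted probability in each bin with the observed win rate. The sketch below is one possible way to do that, not the project's method. The `reliability_bins` helper and the sample data are hypothetical.

```python
def reliability_bins(labels, probs, n_bins=5):
    """Bin predictions by probability; return (bin, count, mean_pred, win_rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    table = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_pred = sum(p for _, p in members) / len(members)
        win_rate = sum(y for y, _ in members) / len(members)
        table.append((i, len(members), mean_pred, win_rate))
    return table

# Hypothetical predictions and outcomes (1 = win, 0 = loss).
probs = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
labels = [1, 1, 1, 0, 1, 0]

table = reliability_bins(labels, probs)
for bin_index, count, mean_pred, win_rate in table:
    print(f"bin {bin_index}: n={count} mean_pred={mean_pred:.2f} win_rate={win_rate:.2f}")
```

A well-calibrated model has mean predictions close to observed win rates in every bin. With only 62 losses in the dataset, the low-probability bins would be sparse, so any such table should be read with that in mind.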
Version history lives on the changelog. The live model object is always reflected in metrics.json.