Glossary

Every term, defined

One or two honest sentences each — for the options vocabulary and the machine-learning vocabulary that show up on the dashboard. No assumed knowledge.

◆ Options & trading

STO (sell to open) [options]: Opening a position by selling an option you don't already own and collecting the premium upfront. The seller's bet is that the option loses value before it can be used against them.
BTC (buy to close) [options]: Exiting a sold option by buying the same contract back. If you buy it back for less than you sold it for, the difference is profit. (Not to be confused with Bitcoin; here BTC always means "buy to close".)
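The STO/BTC arithmetic can be sketched in a few lines. This is a minimal illustration, not the dashboard's accounting: the function name is hypothetical, the contract multiplier of 100 is an assumption (it is the usual size for US equity options), and fees and commissions are ignored.

```python
def short_option_pnl(sto_premium: float, btc_price: float,
                     contracts: int = 1, multiplier: int = 100) -> float:
    """Profit (positive) or loss (negative) on an option sold to open
    and later bought to close. Prices are quoted per share.

    Assumptions: `multiplier` shares per contract (100 is typical for
    US equity options); no fees or commissions.
    """
    return (sto_premium - btc_price) * multiplier * contracts


# Sold for 1.50, bought back for 0.40: the seller keeps the difference.
print(round(short_option_pnl(sto_premium=1.50, btc_price=0.40), 2))

# Bought back for more than it was sold for: a loss.
print(round(short_option_pnl(sto_premium=1.00, btc_price=2.50, contracts=2), 2))
```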
Theta (time decay) [options]: How much value an option loses each day purely from time passing, with everything else held equal. For an option seller, theta works in your favor: every day the contract is worth a little less, which is the whole point of "theta selling".
Premium [options]: The price of an option, i.e. the cash the buyer pays and the seller collects. In a theta-selling strategy, the premium received at open is the income you're trying to keep.
DTE (days to expiry) [options]: How many days until the option contract expires. Shorter DTE means faster time decay but less room for the trade to recover if it moves against you. Used as a feature (dte) in the model.
Strike [options]: The price at which an option can be exercised. It's the line in the sand that determines whether the option finishes worthless (good for the seller) or in-the-money.
Moneyness [options]: How far the current stock price is from the strike, i.e. whether an option is in-, at-, or out-of-the-money. The model approximates this with distance_pct (distance from strike as a percentage).
IV rank (implied volatility rank) [options]: Where current implied volatility sits relative to its own past year (0–100). High IV rank means options are relatively expensive, which generally favors sellers. Present as the iv_rank feature, but currently dead: no market-feed data is wired in to fill it.
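One common way to compute IV rank is to place the current implied volatility within the range between the past year's low and high. The sketch below assumes that min-max definition; the dashboard's own formula is not shown on this page, and the function name and sample numbers are hypothetical.

```python
def iv_rank(current_iv: float, iv_history: list[float]) -> float:
    """Position of current_iv within the [min, max] range of iv_history,
    scaled to 0-100. Assumes iv_history covers the lookback window
    (e.g. one year of daily implied-volatility readings).
    """
    if not iv_history:
        raise ValueError("iv_history must not be empty")
    lo, hi = min(iv_history), max(iv_history)
    if hi == lo:
        raise ValueError("iv_history has no range; IV rank is undefined")
    rank = (current_iv - lo) / (hi - lo) * 100.0
    # Clamp in case current_iv falls outside the historical range.
    return max(0.0, min(100.0, rank))


# Made-up readings: the low is 0.20, the high is 0.40, the current IV is 0.35.
print(round(iv_rank(0.35, [0.20, 0.25, 0.40, 0.30]), 1))  # roughly 75
```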
Delta [options]: How much an option's price moves for a $1 move in the underlying stock; also a rough proxy for the probability it finishes in-the-money. The delta feature is currently dead: no market-feed data is wired in yet to fill it.

◆ Machine learning & statistics

Wilson CI (Wilson confidence interval) [ml]: A well-behaved way to express the plausible range for a proportion (like an accuracy or win-rate). Our test accuracy of 0.890 has a Wilson 95% CI of [0.817, 0.936], and that interval includes the 88.6% base rate, which is why accuracy alone can't prove the model has skill. See the methodology.
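The interval quoted above can be reproduced with the standard Wilson score formula. The sketch assumes 97 correct predictions out of the 109 test trades, which is inferred from the rounded accuracy of 0.890 and not stated on this page.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion; z = 1.96 gives roughly 95%."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumption: 97 of 109 test trades predicted correctly (97/109 is about 0.890).
low, high = wilson_interval(97, 109)
print(f"[{low:.3f}, {high:.3f}]")  # [0.817, 0.936]

# The 88.6% base rate falls inside the interval.
print(low < 0.886 < high)  # True
```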
AUC (area under the ROC curve) [ml]: A score for how well a model ranks winners above losers: 0.5 is random ranking and 1.0 is perfect (a score below 0.5 means the ranking is worse than random). This is the real test of skill on our data, because the win rate is so high that accuracy is misleading. Ours is 0.688, which is modest skill: better than a coin flip, but far from proven. See the base-rate problem.
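AUC has a direct reading: it is the fraction of (winner, loser) pairs in which the model gave the winner the higher score, with ties counted as half. The sketch below computes it by brute force over all pairs; the labels and scores are made up for illustration and are not taken from the model.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.
    Ties count as half a correct pair. O(n_pos * n_neg), fine for small n.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Made-up example: 4 wins (1) and 2 losses (0).
labels = [1, 1, 1, 0, 1, 0]
scores = [0.90, 0.80, 0.60, 0.70, 0.95, 0.30]
print(roc_auc(labels, scores))  # 0.875 (7 of 8 pairs ranked correctly)

# A model that scores every trade the same has no ranking skill.
print(roc_auc(labels, [0.886] * 6))  # 0.5
```

The second call shows why AUC is the more honest metric here: a model that always predicts "win" with the same score matches the base rate on accuracy but scores only 0.5 on AUC.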
Calibration [ml]: Whether predicted probabilities match reality: if a model says "70%" for a batch of trades, about 70% should actually win. NEO-LR-v0.1 isn't calibrated yet, so the dashboard's calibration plot stays empty until that's assessed.
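A calibration check usually groups predictions into probability bins and compares the average prediction in each bin with the observed win rate. The following is a minimal sketch of that idea with equal-width bins; the function name is hypothetical and the data is made up, since NEO-LR-v0.1 has no calibration results yet.

```python
def calibration_table(probs: list[float], outcomes: list[int],
                      n_bins: int = 5) -> list[tuple[float, float, int]]:
    """Return (mean predicted probability, observed win rate, count)
    for each non-empty equal-width bin over [0, 1].
    """
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        win_rate = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, win_rate, len(members)))
    return rows


# Made-up predictions and outcomes (1 = win, 0 = loss).
probs = [0.72, 0.68, 0.71, 0.69, 0.70, 0.95, 0.92, 0.91, 0.94, 0.93]
outcomes = [1, 1, 0, 1, 0, 1, 1, 1, 1, 1]
for mean_pred, win_rate, count in calibration_table(probs, outcomes):
    print(f"predicted {mean_pred:.2f}  observed {win_rate:.2f}  n={count}")
```

A well-calibrated model shows predicted and observed values close to each other in every bin; with only a handful of trades per bin, the gaps are mostly noise.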
Base rate [ml]: How often the thing you're predicting happens regardless of the model. Here, the strategy wins 88.6% of trades, so a "model" that blindly guesses win every time already scores 88.6% accuracy. Beating that base rate is the bar our model has to clear, and it barely does (accuracy 0.890). This is why we judge it on AUC instead. See the methodology.
Overfitting [issue]: When a model memorizes its training data instead of learning patterns that generalize, so its train accuracy ends up far above its test accuracy. Ours doesn't show this: train accuracy (0.888) and test accuracy (0.890) are nearly identical. The problem here isn't overfitting; it's that both numbers just track the 88.6% base rate.
Leakage (data leakage) [issue]: When a feature carries information that wouldn't exist at prediction time, letting the model peek at the answer. A Schwab export is full of these (exit_reason, pnl, hold_hours). Our pipeline drops them all before training and then re-checks the feature list, so metrics.json reports zero leakage features in the model. See the methodology.
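The drop-then-re-check pattern described above might look something like the sketch below. It is not the project's pipeline: the function names are hypothetical, the sample row is made up, and only the three column names come from this glossary.

```python
# Columns named in the entry above; they are only known after a trade closes.
LEAKY_COLUMNS = {"exit_reason", "pnl", "hold_hours"}


def drop_leaky(rows: list[dict]) -> list[dict]:
    """Return copies of the rows with every leaky column removed."""
    return [{k: v for k, v in row.items() if k not in LEAKY_COLUMNS}
            for row in rows]


def leaked_features(feature_names) -> list[str]:
    """Re-check step: list any leaky columns still present."""
    return sorted(set(feature_names) & LEAKY_COLUMNS)


# Made-up row in the shape of a broker export.
raw = [{"dte": 30, "distance_pct": 5.0, "pnl": 110.0,
        "exit_reason": "target", "hold_hours": 72}]
clean = drop_leaky(raw)

print(sorted(clean[0]))           # ['distance_pct', 'dte']
print(leaked_features(clean[0]))  # []
```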
Shadow mode [ml]: Running a model alongside a real decision-maker so it records what it would have done, without placing any real orders. NEO-LR-v0.1 has never placed a real order, so its realized P&L is $0. The +$81,955.09 on the dashboard is the human trader's result, which produced the training labels.
Logistic regression [ml]: A simple, interpretable model that weights each input feature and squashes the total into a 0–1 probability. It can only capture linear relationships between the features and the log-odds of a win, but that makes it easy to read, which is why it's the model used here. See the methodology.
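The "weight each feature, then squash" step can be written out directly. In this sketch the feature names dte and distance_pct come from this glossary, but the weights and bias are invented for illustration and are not the trained values of NEO-LR-v0.1.

```python
import math


def predict_win_probability(features: dict[str, float],
                            weights: dict[str, float],
                            bias: float = 0.0) -> float:
    """Logistic regression prediction: sigmoid(bias + sum(weight * feature))."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical weights, chosen only to show the mechanics.
weights = {"dte": -0.02, "distance_pct": 0.30}
trade = {"dte": 30, "distance_pct": 5.0}

# z = 1.0 + (-0.02 * 30) + (0.30 * 5.0) = 1.9
print(round(predict_win_probability(trade, weights, bias=1.0), 3))  # 0.87
```

Because the output depends on the features only through the weighted sum, each weight can be read on its own: a positive weight raises the predicted win probability as the feature grows, and a negative one lowers it.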
Walk-forward CV [ml]: Time-ordered cross-validation: train on the past, test on the future, slide the window forward. The only honest way to evaluate a trading model, because random shuffling would let it cheat with future data.
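A walk-forward split can be generated from nothing more than trade positions, provided the trades are sorted by time. The sketch below uses an expanding training window (each fold trains on everything before its test block), which is one common variant. The fold count and minimum training size are arbitrary choices for illustration, not the project's settings.

```python
def walk_forward_splits(n_samples: int, n_folds: int, min_train: int):
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    Assumes samples are sorted oldest to newest. The training window
    expands: fold k trains on every sample before its test block.
    """
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * test_size
        test_end = train_end + test_size
        yield range(0, train_end), range(train_end, test_end)


# 545 trades as in this glossary; 4 folds and 245 initial trades are assumptions.
for train_idx, test_idx in walk_forward_splits(545, n_folds=4, min_train=245):
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Every test block sits strictly after its training block, so no fold can learn from trades that had not happened yet.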
N (sample size) [ml]: The number of trades available to learn from and test on. Ours is 545 real closed trades (436 train / 109 test). Sample size isn't the bottleneck here; the bottleneck is the high base rate, which makes the model's edge hard to demonstrate.
Feature [ml]: One input number describing a trade (e.g. days-to-expiry, premium percentage, call-vs-put). The model has 14, of which 9 are currently "dead": no market-feed data is wired in to fill them, so they carry no signal and the model effectively sees only 5.
Weight [ml]: The learned multiplier the model applies to each feature. Positive pushes toward "win", negative toward "loss"; bigger magnitude means more influence. The dashboard charts every weight.
