Glossary
Every term, defined
One or two honest sentences each — for the options vocabulary and the machine-learning vocabulary that show up on the dashboard. No assumed knowledge.
◆ Options & trading
- STO — sell to open
- Opening a position by selling an option you don't already own and collecting the premium upfront. The seller's bet is that the option loses value before it can be used against them.
- BTC — buy to close
- Exiting a sold option by buying the same contract back. If you buy it back for less than you sold it, the difference is profit. (Not to be confused with Bitcoin — here BTC is always "buy to close".)
- Theta — time decay
- How much value an option loses each day purely from time passing. For an option seller, theta works in your favor — every day the contract is worth a little less, which is the whole point of "theta selling".
- Premium
- The price of an option — the cash the buyer pays and the seller collects. In a theta-selling strategy, the premium received at open is the income you're trying to keep.
- DTE — days to expiry
- How many days until the option contract expires. Shorter DTE means faster time decay but less room for the trade to recover if it moves against you. Used as a feature (dte) in the model.
- Strike
- The price at which an option can be exercised. It's the line in the sand that determines whether the option finishes worthless (good for the seller) or in-the-money.
- Moneyness
- How far the current stock price is from the strike — i.e. whether an option is in-, at-, or out-of-the-money. The model approximates this with distance_pct (distance from strike as a percentage).
- IV rank — implied volatility rank
- Where current implied volatility sits relative to its own past year (0–100). High IV rank means options are relatively expensive, which generally favors sellers. Present as the iv_rank feature, but currently dead — no market-feed data is wired in to fill it.
- Delta
- How much an option's price moves for a $1 move in the underlying stock — also a rough proxy for the probability it finishes in-the-money. The delta feature is currently dead — no market-feed data is wired in yet to fill it.
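The STO, BTC, and moneyness entries above come down to simple arithmetic. Here is a minimal sketch in Python. It assumes the standard 100-share contract multiplier for US equity options and a signed convention for distance_pct that this glossary does not actually specify; the function names and the example prices are hypothetical, not taken from the pipeline.

```python
CONTRACT_MULTIPLIER = 100  # assumption: standard US equity option, 100 shares per contract


def short_option_pnl(sto_price: float, btc_price: float, contracts: int = 1) -> float:
    """Profit on a sold option: premium collected at open minus the cost to buy it back."""
    return (sto_price - btc_price) * CONTRACT_MULTIPLIER * contracts


def distance_pct(stock_price: float, strike: float) -> float:
    """Distance from strike as a percentage of the stock price.

    Assumption: signed, positive when the stock trades above the strike.
    The dashboard's real definition may use a different sign or denominator.
    """
    return (stock_price - strike) / stock_price * 100.0


if __name__ == "__main__":
    # Hypothetical trade: sell to open at 2.00, buy to close at 0.50.
    print(short_option_pnl(sto_price=2.00, btc_price=0.50))
    # Hypothetical put with a 95 strike while the stock trades at 100.
    print(distance_pct(stock_price=100.0, strike=95.0))
```

If the option expires worthless there is nothing to buy back, which corresponds to a btc_price of 0 and leaves the full premium as profit.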
◆ Machine learning & statistics
- Wilson CI — Wilson confidence interval ml
- A well-behaved way to express the plausible range for a proportion (like an accuracy or win-rate). Our test accuracy of 0.890 has a Wilson 95% CI of [0.817, 0.936] — and that interval includes the 88.6% base rate, which is why accuracy alone can't prove the model has skill. See the methodology.
- AUC — area under the ROC curve ml
- A score for how well a model ranks winners above losers: 0.5 is no better than random and 1.0 is perfect (anything below 0.5 means the ranking is worse than random). This is the real test of skill on our data, because the win rate is so high that accuracy is misleading. Ours is 0.688 — modest skill: better than a coin flip, but far from proven. See the base-rate problem.
- Calibration ml
- Whether predicted probabilities match reality — if a model says "70%" for a batch of trades, about 70% should actually win. NEO-LR-v0.1 isn't calibrated yet, so the dashboard's calibration plot stays empty until that's assessed.
- Base rate ml
- How often the thing you're predicting happens regardless of the model. Here, the strategy wins 88.6% of trades — so a "model" that blindly guesses "win" every time already scores 88.6% accuracy. Beating that base rate is the bar our model has to clear, and it barely does (accuracy 0.890). This is why we judge it on AUC instead. See the methodology.
- Overfitting issue
- When a model memorizes its training data instead of learning patterns that generalize — its train accuracy ends up far above its test accuracy. Ours doesn't show this: train accuracy (0.888) and test accuracy (0.890) are nearly identical. The problem here isn't overfitting; it's that both numbers just track the 88.6% base rate.
- Leakage — data leakage issue
- When a feature carries information that wouldn't exist at prediction time, letting the model peek at the answer. A Schwab export is full of these (exit_reason, pnl, hold_hours) — our pipeline drops them all before training and re-checks, so metrics.json reports zero leakage features in the model. See the methodology.
- Shadow mode ml
- Running a model alongside a real decision-maker so it records what it would have done, without placing any real orders. NEO-LR-v0.1 has never placed a real order — its realized P&L is $0. The +$81,955.09 on the dashboard is the human trader's result, which produced the training labels.
- Logistic regression ml
- A simple, interpretable model that weights each input feature and squashes the total into a 0–1 probability. It can only capture linear relationships between the features and the log-odds of a win, which is also what makes it easy to read — the reason it's the model used here. See the methodology.
- Walk-forward CV ml
- Time-ordered cross-validation: train on the past, test on the future, slide the window forward. The only honest way to evaluate a trading model — random shuffling would let it cheat with future data.
- N — sample size ml
- The number of trades available to learn from and test on. Ours is 545 real closed trades (436 train / 109 test). Sample size isn't the bottleneck here — the bottleneck is the high base rate, which makes the model's edge hard to demonstrate.
- Feature ml
- One input number describing a trade (e.g. days-to-expiry, premium percentage, call-vs-put). The model has 14; 9 are currently "dead" — no market-feed data is wired in to fill them, so they carry no signal and the model effectively sees only 5.
- Weight ml
- The learned multiplier the model applies to each feature. Positive pushes toward "win", negative toward "loss"; bigger magnitude means more influence. The dashboard charts every weight.
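The Wilson CI and base-rate entries above can be checked by hand. Below is a minimal sketch of the standard Wilson score interval in plain Python. It assumes that the 0.890 test accuracy corresponds to 97 correct predictions out of the 109 test trades (97 / 109 ≈ 0.890); the glossary does not state that count directly, and the function name is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumption: 0.890 accuracy on 109 test trades means 97 correct calls.
low, high = wilson_interval(successes=97, n=109)
base_rate = 0.886  # win rate of the strategy, from the Base rate entry

print(f"Wilson 95% CI: [{low:.3f}, {high:.3f}]")
print("base rate inside interval:", low <= base_rate <= high)
```

Under that assumption the interval rounds to the [0.817, 0.936] quoted above, and the 88.6% base rate falls inside it, so accuracy alone cannot separate the model from always guessing "win".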
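The walk-forward CV entry above describes training on the past and testing on the future. One way to generate such splits is sketched below in plain Python. It assumes an expanding training window, where each fold trains on everything before its test block. The glossary does not say whether the real pipeline expands or slides its window, and the function name, fold count, and minimum training size are all hypothetical.

```python
from typing import Iterator


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[tuple[range, range]]:
    """Yield (train_indices, test_indices) over rows sorted oldest to newest.

    Every training index precedes every test index in the same fold,
    so the model never sees trades from its own future.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        test_start = min_train + k * fold_size
        # The final fold absorbs any remainder so no trade is left unused.
        test_end = n_samples if k == n_folds - 1 else test_start + fold_size
        yield range(0, test_start), range(test_start, test_end)


if __name__ == "__main__":
    # Hypothetical settings for 545 time-ordered trades.
    for train_idx, test_idx in walk_forward_splits(n_samples=545, n_folds=4, min_train=145):
        print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

A random shuffle would mix later trades into the training rows, which is exactly the cheating that the entry warns about.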