The science behind RadiantHealth
We don't invent metrics. RadiantHealth uses well-established exercise physiology signals — Banister's training impulse, Coggan's CTL/ATL/TSB, HRV deviation from a personal baseline, sleep stage composition — and layers an agentic LLM on top to translate them into a plan. Here's the methodology, in plain English.
Readiness (0–100)
Readiness is a composite score we compute daily from four components. Each is z-scored against your personal 28-day rolling baseline, then weighted and clamped into [0, 100].
| Component | Weight | Source signal |
|---|---|---|
| Resting HR deviation | 25% | Nightly lowest HR (Garmin / Oura / Whoop / Apple Health) |
| HRV deviation | 35% | Overnight rMSSD, or the device's equivalent |
| Sleep efficiency + duration | 25% | Total sleep / time in bed, with stage breakdown when available |
| Training stress balance | 15% | TSB from Coggan CTL/ATL (see below) |
If fewer than three components are available for today, we return has_sufficient_data: false and the coach will not issue a readiness-derived recommendation. This is the honest version of "we don't know, so we won't make up advice."
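The scoring steps above can be sketched roughly as follows. This is a minimal illustration, not the production scorer: the weights and the three-component rule come from the text, but the z-to-score mapping (z = 0 maps to 50, each SD is worth `points_per_sd` points), the sign flip for resting HR, the renormalisation of weights over available components, and all function and key names are assumptions.

```python
from statistics import mean, stdev

# Weights from the component table above.
WEIGHTS = {"rhr": 0.25, "hrv": 0.35, "sleep": 0.25, "tsb": 0.15}
# Assumption: a higher resting HR is worse, so its z-score is sign-flipped.
HIGHER_IS_BETTER = {"rhr": False, "hrv": True, "sleep": True, "tsb": True}


def z_score(today, baseline):
    """z-score of today's value against a personal baseline window."""
    sd = stdev(baseline)
    return 0.0 if sd == 0 else (today - mean(baseline)) / sd


def readiness(today, baselines, points_per_sd=20.0):
    """Return has_sufficient_data plus a score clamped into [0, 100].

    `today` maps component name -> today's value (missing components omitted).
    `baselines` maps component name -> the previous 28 daily values.
    """
    available = [
        c for c in WEIGHTS
        if c in today and len(baselines.get(c, [])) >= 2
    ]
    if len(available) < 3:
        return {"has_sufficient_data": False, "score": None}

    # Assumption: weights are renormalised over the components we do have.
    total_weight = sum(WEIGHTS[c] for c in available)
    weighted_z = 0.0
    for c in available:
        z = z_score(today[c], baselines[c])
        if not HIGHER_IS_BETTER[c]:
            z = -z
        weighted_z += WEIGHTS[c] / total_weight * z

    # Assumption: z = 0 maps to 50; each SD moves the score by points_per_sd.
    score = 50.0 + points_per_sd * weighted_z
    return {"has_sufficient_data": True, "score": max(0.0, min(100.0, score))}
```

With every component sitting exactly at its baseline mean, this sketch returns a score of 50; with only two components present it returns has_sufficient_data: False and no score.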
Training load — CTL, ATL, TSB
We follow Andrew Coggan's Performance Management Chart model, which is the standard in endurance coaching software:
- TSS (Training Stress Score): a load number for each session, totalled per day. Computed from power (cycling), or estimated from HR-zone minutes when power isn't available.
- CTL (Chronic Training Load): 42-day exponentially weighted average of daily TSS. Your "fitness."
- ATL (Acute Training Load): 7-day exponentially weighted average of daily TSS. Your "fatigue."
- TSB (Training Stress Balance): CTL minus ATL. Your "form." Positive means fresh, negative means fatigued.
These definitions follow Allen & Coggan (Training and Racing with a Power Meter, 3rd ed.); we don't deviate from them. We do add per-user zone heuristics when power data is missing, so running-only users still get a meaningful TSB.
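As a rough sketch of the model above: power-based TSS uses the standard formula (one hour at threshold power scores 100), and CTL and ATL are exponentially weighted averages of daily TSS. The 1/N smoothing step, the zero starting values, and the function names are assumptions for illustration; the HR-zone fallback is not shown.

```python
def tss_from_power(duration_s, normalized_power, ftp):
    """Power-based TSS: one hour at FTP scores 100."""
    intensity_factor = normalized_power / ftp
    return (duration_s * normalized_power * intensity_factor) / (ftp * 3600) * 100


def performance_chart(daily_tss, ctl_days=42, atl_days=7):
    """Return one {ctl, atl, tsb} row per day of `daily_tss`.

    Assumption: each average moves 1/N of the way toward today's TSS,
    and both start from zero on the first day.
    """
    ctl = atl = 0.0
    rows = []
    for tss in daily_tss:
        ctl += (tss - ctl) / ctl_days  # slow-moving "fitness"
        atl += (tss - atl) / atl_days  # fast-moving "fatigue"
        rows.append({"ctl": ctl, "atl": atl, "tsb": ctl - atl})
    return rows


# Sixty days at 100 TSS/day, then a rest week.
rows = performance_chart([100] * 60 + [0] * 7)
print(round(rows[59]["tsb"], 1), round(rows[-1]["tsb"], 1))
```

Because ATL reacts faster than CTL, TSB is negative at the end of the loading block and turns positive during the rest week, which is the "form" behaviour described above.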
TSB labels
| TSB band | Label | What the coach does with it |
|---|---|---|
| ≥ +15 | Very fresh | Will push harder sessions if HRV + sleep agree. |
| +5 to +15 | Fresh | Standard hard-day prescriptions allowed. |
| –5 to +5 | Balanced | Follow the existing plan. |
| –15 to –5 | Functional overload | OK if HRV/sleep are holding; protect the key session. |
| < –15 | Deep fatigue | Will propose shifting intensity out by ≥ 24h. |
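The band lookup can be sketched as a simple threshold chain. The table gives two unambiguous edges (≥ +15 and < –15); treating every band's lower bound as inclusive is an assumption made here so the shared edges at +5 and –5 resolve consistently. The function name is hypothetical.

```python
def tsb_label(tsb):
    """Map a TSB value to the labels in the table above.

    Assumption: lower bounds are inclusive, so +5 is "Fresh"
    and -15 is "Functional overload".
    """
    if tsb >= 15:
        return "Very fresh"
    if tsb >= 5:
        return "Fresh"
    if tsb >= -5:
        return "Balanced"
    if tsb >= -15:
        return "Functional overload"
    return "Deep fatigue"
```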
HRV — why we use deviation, not absolute numbers
Absolute HRV values are not comparable between people: a 90 ms rMSSD in a 28-year-old endurance athlete is unremarkable; in a 55-year-old beginner it'd be extraordinary. We only care about your HRV relative to your baseline.
Specifically, we compute a 7-day rolling average of overnight rMSSD (or the equivalent your device reports: Garmin's 7-day average, Oura's HRV balance, Whoop's baseline) and compare last night to that average. Deviations greater than one standard deviation (again computed from your own data) are flagged for the coach. Two consecutive nights more than one SD below the average are treated as a recovery signal, not noise.
Reference: Plews, D. et al. "Training adaptation and heart rate variability in elite endurance athletes." Sports Med 43, 773–781 (2013).
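The deviation check described above might look like the sketch below. It assumes the 7-night window excludes the night being evaluated and that the standard deviation is taken over that same window; the text does not specify either, and the function and key names are hypothetical.

```python
from statistics import mean, stdev


def hrv_deviation(history, last_night):
    """Last night's rMSSD relative to `history`, in personal SD units."""
    sd = stdev(history)
    return 0.0 if sd == 0 else (last_night - mean(history)) / sd


def hrv_flags(nightly_rmssd, window=7):
    """Evaluate each night that has a full preceding window.

    flagged: more than one SD away from the rolling average, either direction.
    recovery_signal: this night and the previous one were both more than
    one SD below their rolling averages.
    """
    results = []
    previous_low = False
    for i in range(window, len(nightly_rmssd)):
        z = hrv_deviation(nightly_rmssd[i - window:i], nightly_rmssd[i])
        low = z < -1.0
        results.append({
            "z": z,
            "flagged": abs(z) > 1.0,
            "recovery_signal": low and previous_low,
        })
        previous_low = low
    return results


# Seven stable nights, then two clearly suppressed ones.
flags = hrv_flags([60, 62, 58, 61, 59, 60, 62, 40, 41])
print([(f["flagged"], f["recovery_signal"]) for f in flags])
```

In this example both suppressed nights are flagged, but only the second one raises the recovery signal, since a single low night is treated as possible noise.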
Sleep — quality over quantity
We read sleep events (not just duration) from Oura, Garmin, Apple Health, and Android Health Connect where available. The coach cares about three numbers:
- Total sleep time vs your 14-day personal average.
- Sleep efficiency = time asleep / time in bed. < 85% is a flag for most adults.
- Wake-after-sleep-onset (WASO) — minutes awake after first falling asleep. > 30 min is a stronger fatigue signal than short total sleep.
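A minimal sketch of the three numbers above, using the 85% efficiency and 30-minute WASO thresholds from the list. The function name, argument names, and returned keys are hypothetical.

```python
def sleep_flags(asleep_min, in_bed_min, waso_min, avg_14d_asleep_min):
    """Compute the three sleep signals and their flags, all inputs in minutes."""
    efficiency = asleep_min / in_bed_min
    return {
        "efficiency": efficiency,
        # Negative means less sleep than the 14-day personal average.
        "sleep_vs_average_min": asleep_min - avg_14d_asleep_min,
        "low_efficiency": efficiency < 0.85,
        "high_waso": waso_min > 30,
    }


# 6 h 40 min asleep out of 8 h in bed, with 45 min awake overnight.
print(sleep_flags(asleep_min=400, in_bed_min=480, waso_min=45, avg_14d_asleep_min=430))
```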
We don't report REM / deep percentages as coaching signals. Consumer wearables disagree dramatically on sleep-stage classification, and the literature (see Chinoy et al., Sleep 2020) shows low concordance with polysomnography. Sleep stages make a great bedroom stat but a poor coaching input.
Why we don't use a "recovery score" in isolation
Whoop's recovery, Oura's readiness, and Garmin's Body Battery are all summary scores that hide their inputs. RadiantHealth ingests the underlying signals (RHR, HRV, sleep components) and rebuilds a readiness number that also includes training load, something single-source scores cannot do. When we do surface "Whoop recovery 78%" or "Oura readiness 84", we treat it as a second opinion, not the primary signal.
The agent itself
The model layer is an agentic loop around OpenAI GPT-4o-mini (default tier) with tool-calling grounded in your actual data. Key guardrails:
- Tool budget: 5 tool calls max per turn to prevent runaway agent loops.
- Cost cap: Rolling 30-day USD ceiling per user — we stop the turn before a runaway cost event.
- Safety: Centralised guardrails refuse medical diagnosis, body-image harmful content, and extreme calorie/deficit prescriptions.
- Grounding: The reasoning prompt requires the agent to cite which tool returned which datum. Any claim without a citation is scrubbed in a post-flight pass.
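The tool budget and cost cap from the guardrail list above could be enforced along these lines. This is a sketch under stated assumptions, not the actual agent loop: the ledger class, the flat per-call cost, the stop-reason strings, and the ceiling values are all hypothetical, and the model call itself is omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MAX_TOOL_CALLS_PER_TURN = 5  # from the guardrail list above


@dataclass
class CostLedger:
    ceiling_usd: float  # hypothetical value; the text gives no figure
    entries: list = field(default_factory=list)  # (timestamp, usd) pairs

    def spent_last_30_days(self, now):
        cutoff = now - timedelta(days=30)
        return sum(usd for ts, usd in self.entries if ts >= cutoff)

    def can_spend(self, usd, now):
        return self.spent_last_30_days(now) + usd <= self.ceiling_usd

    def record(self, usd, now):
        self.entries.append((now, usd))


def run_turn(requested_tools, ledger, cost_per_call_usd, now):
    """Run tool calls until a guardrail trips; return (executed, stop_reason)."""
    executed = []
    for tool in requested_tools:
        if len(executed) >= MAX_TOOL_CALLS_PER_TURN:
            return executed, "tool_budget"
        if not ledger.can_spend(cost_per_call_usd, now):
            return executed, "cost_cap"  # stop before the spend happens
        ledger.record(cost_per_call_usd, now)
        executed.append(tool)  # a real loop would invoke the tool here
    return executed, "done"


ledger = CostLedger(ceiling_usd=1.0)
print(run_turn(["get_hrv"] * 7, ledger, 0.01, datetime(2024, 1, 31)))
```

Checking the cap before recording the spend is what lets the turn stop ahead of a runaway cost event instead of after it.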
We run a private evaluation set of 120 graded cases (anonymised from pilot users) on every model change. If coaching quality regresses by more than 5% on the graded set, we don't ship.
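One way to express that ship gate is sketched below. It assumes "5%" means a relative drop in the mean graded score compared with the current model; the text does not say whether the threshold is relative or in absolute points, and the function name is hypothetical.

```python
def should_ship(baseline_scores, candidate_scores, max_regression=0.05):
    """Return False when the candidate's mean score drops by more than
    `max_regression`, measured relative to the baseline mean (an assumption)."""
    baseline = sum(baseline_scores) / len(baseline_scores)
    candidate = sum(candidate_scores) / len(candidate_scores)
    if baseline == 0:
        return True  # nothing to regress from
    regression = (baseline - candidate) / baseline
    return regression <= max_regression
```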
References
- Allen, H. & Coggan, A. Training and Racing with a Power Meter, 3rd ed. VeloPress.
- Banister, E.W. (1991). Modeling elite athletic performance.
- Plews, D. et al. (2013). Training adaptation and heart rate variability in elite endurance athletes. Sports Med 43, 773–781.
- Chinoy, E.D. et al. (2020). Performance of seven consumer sleep-tracking devices compared with polysomnography. Sleep 44(5).
- Halson, S. (2014). Monitoring training load to understand fatigue in athletes. Sports Med 44, 139–147.
Coaching should be boring and correct.
RadiantHealth makes the methodology transparent and the recommendations concrete. See what your real numbers say tomorrow.