Evaluations
Four canonical metrics.
Each measures a dimension of AI behavior that matters for trust, coherence, and temporal alignment. All four are built from empirical findings and designed for production use.
The Observatory
A unified evaluation toolkit for all Course Correct Labs studies
It provides standardized metrics, cross-study analysis, visualizations, and a flagship Reasoning Stability Observatory notebook.
