Confidence is a three-level signal (high, moderate, low) returned alongside every trait score. It expresses how reliably the model can distinguish content in the region of the 0–100 axis where the score fell, based on how well the positive and negative training distributions separate there. Confidence is distinct from both the score and the tier: it describes the quality of the measurement, not the quality of the content. It is a property of the training distributions, not of the content being scored.
| Level | Distribution condition at the score's region |
|---|---|
| High | Score falls clearly within one of the two training distributions (positive or negative). |
| Moderate | Score falls between the two training distributions, in the gap where neither distribution is dense. |
| Low | The two training distributions overlap at the score's region — the model cannot discriminate reliably. |
Table 1. The three confidence levels and the distribution condition each indicates.
At training time, the trait's positive and negative samples produce two distributions on the 0–100 axis. At scoring time, confidence is computed from how those distributions behave near the score: cleanly separated, gapped, or overlapping. No external calibration is applied; the signal is entirely determined by the training data and the resulting breaks.
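The mapping from distribution behavior to confidence level can be sketched as follows. This is a minimal illustration, assuming each training distribution is summarized by its observed score range; the function name and the range-based test are illustrative, not the actual implementation.

```python
def confidence(score, pos_scores, neg_scores):
    """Return 'high', 'moderate', or 'low' for a score on the 0-100 axis."""
    pos_lo, pos_hi = min(pos_scores), max(pos_scores)
    neg_lo, neg_hi = min(neg_scores), max(neg_scores)

    in_pos = pos_lo <= score <= pos_hi
    in_neg = neg_lo <= score <= neg_hi

    if in_pos and in_neg:
        return "low"        # the two distributions overlap at the score's region
    if in_pos or in_neg:
        return "high"       # score falls clearly inside one distribution
    return "moderate"       # score falls in the gap between the two
```

Note that no external calibration enters this computation: the answer depends only on where the score sits relative to the two training samples.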
For a score card with multiple traits, the composite's confidence is the minimum per-trait confidence — the weakest trait's confidence becomes the composite's. A composite score inherits the reliability of its least reliable input. See composite.
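The minimum rule above reduces to taking the lowest level under the ordering low < moderate < high. A minimal sketch, with illustrative names:

```python
# Ordering of confidence levels: low < moderate < high.
CONFIDENCE_ORDER = {"low": 0, "moderate": 1, "high": 2}

def composite_confidence(trait_confidences):
    """The weakest per-trait confidence becomes the composite's confidence."""
    return min(trait_confidences, key=CONFIDENCE_ORDER.__getitem__)
```

For example, a score card with traits at high, moderate, and high confidence yields a composite confidence of moderate.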
Tier and confidence both come from the same training distributions and therefore correlate. A trait's tier describes where on the axis the score sits; confidence describes how reliably the model can place a score at that position. The two signals answer different questions but are not independent — scores in different regions of the axis receive different confidence levels by construction:
| Tier | Confidence | Why |
|---|---|---|
| Strong | High | Inside the positive distribution's upper tail. |
| Solid | High | Inside the positive distribution's inter-quartile range. |
| Developing | Moderate | In the gap between the two distributions. |
| Weak | High | Inside the negative distribution. |
| null | Low | Distributions overlap — no tier assigned. |
Table 2. How tier and confidence pair for well-separated training distributions. Weak content carries high confidence for the same reason Strong content does: the model has direct training signal for content in that region.
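The pairing in Table 2 can be sketched as a single pass over the axis. The region boundaries passed in here (a simplified positive range with an upper-tail cutoff at `pos_q3`, and a negative range) are illustrative assumptions, as is the fallback for a score outside every modeled region; the real cutoffs come from the training distributions themselves.

```python
def tier_and_confidence(score, pos_lo, pos_q3, pos_hi, neg_lo, neg_hi):
    """Return (tier, confidence) for well-separated training distributions."""
    # Weak: inside the negative distribution -> high confidence.
    if neg_lo <= score <= neg_hi:
        return ("Weak", "high")
    # Strong: upper tail of the positive distribution -> high confidence.
    if pos_q3 < score <= pos_hi:
        return ("Strong", "high")
    # Solid: inside the positive distribution, below its upper tail.
    if pos_lo <= score <= pos_q3:
        return ("Solid", "high")
    # Developing: the gap between the two distributions -> moderate confidence.
    if neg_hi < score < pos_lo:
        return ("Developing", "moderate")
    # Illustrative fallback: no tier, low confidence.
    return (None, "low")
```

The sketch makes the correlation concrete: Weak and Strong land in regions with direct training signal and so share high confidence, while Developing sits in the gap and carries moderate confidence by construction.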
A low-confidence score does not indicate that the content is borderline or problematic. It indicates that the model's training data does not cleanly separate at the score's region — a property of the model, not of the content. Two pieces of content with similar scores may carry different confidence levels if one score falls inside a dense training region and the other falls near a distributional overlap.
Low confidence implies that the breaks in the score's region cannot be placed reliably. Because the tier label and headroom both depend on break positions, they are returned as null when confidence is low. The raw score is still returned.
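The return shape this implies can be sketched as below: when confidence is low, tier and headroom come back null while the raw score survives. Field and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TraitResult:
    score: int
    confidence: str
    tier: Optional[str] = None      # null when confidence is low
    headroom: Optional[int] = None  # null when confidence is low

def build_result(score, confidence, tier, headroom):
    if confidence == "low":
        # Break positions are unreliable here, so tier and headroom
        # (which both depend on them) are suppressed.
        return TraitResult(score=score, confidence="low")
    return TraitResult(score=score, confidence=confidence,
                       tier=tier, headroom=headroom)
```

A caller always receives a usable raw score; only the break-dependent fields degrade to null.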
A Developing score is, by construction, in the gap between the training distributions. That region produces moderate confidence. The two signals therefore co-occur and carry complementary information: the tier reports the position, the confidence reports the gap.