Trait discovery

Trait discovery is the step that proposes named traits from a training set when the traits are not declared in the model specification. The proposed traits are returned for review; they can be renamed, merged, dropped, or edited before the model is finalized.

§1 Definition

Discovery analyzes the contrast between positive and negative samples in the training set and returns a set of candidate traits that separate the two groups. Each candidate has a name and a polarity label at each end of its axis, the same shape as a declared trait.
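The shape of a candidate can be sketched as a small record. This is a minimal illustration, not the product's schema: the class name `CandidateTrait` and the field names `left_label` and `right_label` are assumptions, and the values are the three candidates from Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTrait:
    """A proposed trait: a name plus one polarity label per end of the axis."""
    name: str
    left_label: str   # hypothetical field name; label order follows Figure 1
    right_label: str  # hypothetical field name

    def describe(self) -> str:
        """Render the trait as 'name: left <-> right'."""
        return f"{self.name}: {self.left_label} <-> {self.right_label}"


# The three candidates shown in Figure 1.
candidates = [
    CandidateTrait("specificity", "General", "Specific"),
    CandidateTrait("concreteness", "Abstract", "Concrete"),
    CandidateTrait("verification", "Asserted", "Earned"),
]

for trait in candidates:
    print(trait.describe())
```

Because a declared trait has the same shape, an approved candidate could be carried into the model specification without conversion.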

§2 Mechanism

Discovery operates on the same training set the model will be calibrated from. It identifies directions in the content representation along which positive and negative samples separate, labels the ends of each direction in natural language, and returns the result. The number of proposed traits depends on how many separable directions the training set supports.

specificity: General ↔ Specific
concreteness: Abstract ↔ Concrete
verification: Asserted ↔ Earned
Figure 1. A discovery output for a model scoring landing-page copy. Three candidate traits, each with polarity labels drawn from the contrast in the training set.
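The idea of a direction along which positive and negative samples separate can be illustrated with a difference of group means. This is a simplified stand-in chosen for clarity, not a description of how discovery is actually implemented; the function names and the toy 2-D vectors are made up for the example.

```python
from statistics import fmean


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [fmean(column) for column in zip(*vectors)]


def contrast_direction(positives, negatives):
    """Unit vector pointing from the negative centroid to the positive centroid."""
    pos_mean = mean_vector(positives)
    neg_mean = mean_vector(negatives)
    diff = [p - n for p, n in zip(pos_mean, neg_mean)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0:
        raise ValueError("positive and negative centroids coincide")
    return [d / norm for d in diff]


def project(vector, direction):
    """Scalar position of a sample along the direction (dot product)."""
    return sum(v * d for v, d in zip(vector, direction))


# Toy content representations (illustrative numbers only).
positives = [[2.0, 1.0], [3.0, 1.0], [2.5, 0.5]]
negatives = [[-2.0, 1.0], [-3.0, 1.0], [-2.5, 0.5]]

direction = contrast_direction(positives, negatives)
pos_scores = [project(v, direction) for v in positives]
neg_scores = [project(v, direction) for v in negatives]
```

In this toy data the groups differ only in the first coordinate, so the direction lies along that coordinate and every positive projects higher than every negative. Labeling the two ends in natural language is a separate step that the sketch does not cover.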

§3 Interpretation

Discovery makes the implicit axes of a training set explicit. When a labeler knows which content is positive and which is negative but cannot articulate why, discovery names the axes that distinguish the two. The reviewer decides whether the proposed names match the intended standard, adjusts them where they do not, and approves or edits the set before it becomes the model's declared traits.
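The review step can be pictured as a small transformation over the proposed set. The sketch below is hypothetical: the `review` function, its parameters, and the rename to "evidence" are invented for illustration and do not correspond to a documented API.

```python
def review(candidates, renames=None, drops=()):
    """Apply reviewer decisions: skip dropped candidates, rename the rest as requested."""
    renames = renames or {}
    approved = []
    for name, left, right in candidates:
        if name in drops:
            continue  # reviewer judged this axis off-standard
        approved.append((renames.get(name, name), left, right))
    return approved


# Candidates as (name, left label, right label), taken from Figure 1.
proposed = [
    ("specificity", "General", "Specific"),
    ("concreteness", "Abstract", "Concrete"),
    ("verification", "Asserted", "Earned"),
]

# Hypothetical reviewer decisions: drop one candidate, rename another.
approved = review(
    proposed,
    renames={"verification": "evidence"},
    drops={"concreteness"},
)
```

Whatever remains after review is what would become the model's declared traits.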

§4 Edge cases

§4.1 Weak contrast

When positive and negative samples do not separate cleanly in the content representation, discovery returns few or no traits, and any traits it does return separate the two groups only weakly. Downstream, this shows up as low confidence in the calibrated model. A training set with clearer contrast, in which positives consistently exemplify the standard and negatives consistently do not, produces more separable and more interpretable traits.
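One way to reason about weak contrast is a separation ratio: the gap between the two group means along a candidate axis, divided by the pooled spread. The metric, the function name, and the cutoff of 1.0 are assumptions made for this sketch, not values the product documents.

```python
from statistics import fmean, pstdev


def separation(pos_scores, neg_scores):
    """Gap between group means divided by the pooled standard deviation."""
    gap = abs(fmean(pos_scores) - fmean(neg_scores))
    pooled_var = (pstdev(pos_scores) ** 2 + pstdev(neg_scores) ** 2) / 2
    spread = pooled_var ** 0.5
    return float("inf") if spread == 0 else gap / spread


WEAK_THRESHOLD = 1.0  # hypothetical cutoff for flagging a weak axis

# Projections of samples onto a candidate axis (illustrative numbers only).
clear = separation([2.0, 3.0, 2.5], [-2.0, -3.0, -2.5])
weak = separation([0.2, -0.4, 0.5], [0.1, -0.3, 0.3])

for label, value in (("clear", clear), ("weak", weak)):
    status = "ok" if value >= WEAK_THRESHOLD else "weak contrast"
    print(f"{label}: {value:.2f} ({status})")
```

In the second case the two groups overlap almost entirely, so the ratio falls well below the cutoff, which is the situation the paragraph above describes.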

§4.2 Unrelated dominant axes

When the most separable direction between positive and negative samples is unrelated to the intended standard (for example, length, topic, or source), discovery surfaces that axis as a candidate trait. This is a signal that the training set's labeling is correlated with an incidental attribute; the reviewer can drop the candidate and the contributor can adjust the training set to reduce the confound.
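A contributor can check for this kind of confound before running discovery by correlating the labels with an incidental attribute such as length. The sketch below uses a hand-written Pearson correlation on invented sample texts; the helper name and the data are illustrative assumptions.

```python
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


# Invented training samples: (text, label) with 1 = positive, 0 = negative.
samples = [
    ("Ship invoices in two clicks, with audit logs for every change.", 1),
    ("Cut onboarding time from five days to one with guided setup.", 1),
    ("Best tool ever.", 0),
    ("We are great.", 0),
]

lengths = [len(text) for text, _ in samples]
labels = [label for _, label in samples]
r = pearson(lengths, labels)
print(f"length/label correlation: {r:.2f}")
```

Here every positive is long and every negative is short, so the correlation is close to 1 and length would likely surface as a dominant axis. Adding short positives and long negatives to the training set would reduce the confound.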

§5 Related concepts

  • Traits — the declared scoring axes; discovery proposes candidates that become declared traits after review.
  • Samples — the training input discovery operates on.
  • Briefs — the exemplars and failure modes in a brief inform the candidates discovery proposes.
  • Calibration — the step that follows discovery, setting breaks on each approved trait axis.
Scores are approximate — not a substitute for human judgment.