Trait discovery is the step that proposes named traits from a training set when the traits are not declared in the model specification. The proposed traits are returned for review; they can be renamed, merged, dropped, or edited before the model is finalized.
Discovery analyzes the contrast between positive and negative samples in the training set and returns a set of candidate traits that separate the two groups. Each candidate has a name and a polarity label for each end, the same shape as a declared trait.
Discovery operates on the same training set the model will be calibrated from. It identifies directions in the content representation along which positive and negative samples separate, labels the two ends of each direction in natural language, and returns the labeled directions as candidate traits. The number of proposed traits depends on how many separable directions the training set supports.
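The idea of a direction along which the two groups separate can be sketched with a toy example. The source does not say how discovery finds its directions, so the snippet below swaps in one simple technique, the difference between class means, purely for illustration. The function names (`mean_vector`, `contrast_direction`, `project`) and the two-dimensional vectors are assumptions, not part of the product.

```python
from math import sqrt

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def contrast_direction(positives, negatives):
    """Unit vector pointing from the negative mean toward the positive mean.

    Illustrative stand-in only: a difference-of-means direction, not
    necessarily what discovery computes internally.
    """
    mu_pos = mean_vector(positives)
    mu_neg = mean_vector(negatives)
    diff = [p - q for p, q in zip(mu_pos, mu_neg)]
    norm = sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("class means coincide; no contrast direction")
    return [d / norm for d in diff]

def project(vector, direction):
    """Scalar position of a sample along the direction (dot product)."""
    return sum(v * d for v, d in zip(vector, direction))

# Toy representation vectors (hypothetical data): the groups differ on the
# first coordinate and overlap on the second.
positives = [[2.0, 0.1], [2.2, -0.1], [1.8, 0.0]]
negatives = [[0.0, 0.1], [0.2, -0.1], [-0.2, 0.0]]

direction = contrast_direction(positives, negatives)
pos_scores = [project(v, direction) for v in positives]
neg_scores = [project(v, direction) for v in negatives]
```

In this toy data every positive sample projects higher than every negative sample, which is the sense in which a direction "separates" the groups. Naming the two ends of the direction in natural language is a separate step that this sketch does not attempt.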
Discovery makes the implicit axes of a training set explicit. When a labeler knows which content is positive and which is negative but cannot articulate why, discovery names the axes that distinguish the two. The reviewer decides whether the proposed names match the intended standard, adjusts them where they do not, and approves or edits the set before it becomes the model's declared traits.
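The review step can be pictured as a few operations on a list of candidates. The sketch below is hypothetical: the `CandidateTrait` type, its field names (`name`, `low_label`, `high_label`), and the `rename`, `drop`, and `merge` helpers are assumptions made for illustration and do not describe an actual API.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CandidateTrait:
    """Hypothetical shape: a name plus a label for each end of the axis."""
    name: str
    low_label: str
    high_label: str

def rename(traits, old_name, new_name):
    """Return a new list with one trait renamed; other traits are unchanged."""
    return [replace(t, name=new_name) if t.name == old_name else t for t in traits]

def drop(traits, name):
    """Return a new list without the named trait."""
    return [t for t in traits if t.name != name]

def merge(traits, names, merged):
    """Replace the named traits with one reviewer-written trait.

    The merged trait takes the position of the first trait it replaces.
    """
    names = set(names)
    result = []
    inserted = False
    for t in traits:
        if t.name in names:
            if not inserted:
                result.append(merged)
                inserted = True
        else:
            result.append(t)
    return result

# Toy proposal (hypothetical names and labels).
proposed = [
    CandidateTrait("axis_1", "vague", "specific"),
    CandidateTrait("axis_2", "short", "long"),
    CandidateTrait("axis_3", "generic", "concrete"),
]

reviewed = rename(proposed, "axis_1", "specificity")
reviewed = drop(reviewed, "axis_2")
reviewed = merge(
    reviewed,
    ["specificity", "axis_3"],
    CandidateTrait("specificity", "vague or generic", "specific and concrete"),
)
```

Each helper returns a new list rather than mutating its input, so the original proposal stays available for comparison while the reviewer works toward the approved set.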
When positive and negative samples do not separate cleanly in the content representation, discovery returns few or no traits, and any returned traits have weak separation. The downstream effect is low confidence on the calibrated model. A training set with clearer contrast, in which positives consistently exemplify the standard and negatives consistently do not, produces more separable and more interpretable traits.
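One way to make "weak separation" concrete is an effect size on the samples' positions along a candidate direction. The sketch below uses a standardized mean difference (Cohen's d with a pooled standard deviation). This is an illustrative choice rather than the measure discovery uses, and the `separation` function, the toy scores, and the cutoff value are all assumptions.

```python
from math import sqrt
from statistics import mean, variance

def separation(pos_scores, neg_scores):
    """Standardized mean difference between two groups of scores.

    Larger magnitude means the groups overlap less along the axis.
    Requires at least two scores per group.
    """
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    pooled_var = (
        (n_pos - 1) * variance(pos_scores) + (n_neg - 1) * variance(neg_scores)
    ) / (n_pos + n_neg - 2)
    gap = mean(pos_scores) - mean(neg_scores)
    if pooled_var == 0:
        # No spread within either group: any nonzero gap separates perfectly.
        return float("inf") if gap != 0 else 0.0
    return gap / sqrt(pooled_var)

# Toy projections onto two candidate directions (hypothetical data).
candidates = {
    "clean_axis": ([0.9, 1.0, 1.1, 1.0], [0.0, 0.1, -0.1, 0.0]),
    "muddled_axis": ([0.4, 0.6, 0.1, 0.9], [0.5, 0.2, 0.7, 0.4]),
}

# Illustrative cutoff only; the source does not specify a threshold.
CUTOFF = 0.8

scores = {name: separation(pos, neg) for name, (pos, neg) in candidates.items()}
kept = [name for name, d in scores.items() if abs(d) >= CUTOFF]
```

With this toy data only the first axis clears the cutoff, which mirrors the behavior described above: when the groups overlap along every direction, few or no candidates survive.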
When the most separable direction between positive and negative samples is unrelated to the intended standard (for example, length, topic, or source), discovery surfaces that axis as a candidate trait. This is a signal that the training set's labeling is correlated with an incidental attribute; the reviewer can drop the candidate and the contributor can adjust the training set to reduce the confound.
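A reviewer who suspects a confound can check whether a candidate's scores track an incidental attribute. The sketch below computes a Pearson correlation between the samples' positions along a candidate direction and their word counts. It is a diagnostic a contributor might run on their own data, not a documented feature, and the function name and toy values are assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Returns 0.0 when either sequence is constant, since the correlation
    is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)

# Toy data (hypothetical): positions of six samples along a candidate axis,
# with the first three labeled positive and the last three negative.
axis_scores = [0.9, 1.1, 1.0, 0.1, -0.1, 0.0]

# Incidental attributes of the same six samples.
word_counts = [820, 910, 870, 120, 90, 140]  # positives happen to be long
source_ids = [3, 1, 2, 2, 3, 1]              # sources are mixed across labels

length_r = pearson(axis_scores, word_counts)
source_r = pearson(axis_scores, source_ids)
```

A correlation near 1 or -1 with length, as in this toy data, suggests the axis is picking up the incidental attribute rather than the intended standard. Balancing that attribute across positive and negative samples is one way to reduce the confound.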