u22a8.answer-relevancy
Measures whether a response actually addresses the question that was asked. This model reimagines the RAGAS answer_relevancy metric as a learned model that captures topic alignment, completeness of address, and absence of tangential content. An answer can be faithful to context yet irrelevant if it talks about the wrong thing; this model catches that failure mode.
Version: v1 · Status: ready
Directly engages with the exact topic and entities in the question ↔ Drifts to adjacent topics or answers a different question than asked
Measures whether the response is about the same subject as the question. High-scoring text directly engages with the topic, entities, and concepts raised in the question. Low-scoring text drifts to adjacent but different topics, addresses a related question instead of the actual one, or provides information that would be relevant to a different query.
Addresses all parts and sub-questions raised in the input ↔ Cherry-picks easy parts, ignores harder aspects or sub-questions
Measures whether the response addresses all parts of the question, especially when the question contains multiple sub-questions or compound asks. High-scoring text covers each aspect raised. Low-scoring text cherry-picks the easiest part of the question while ignoring harder aspects, or provides a partial answer that leaves significant components unaddressed.
Provides a clear, committed answer that resolves the question's intent ↔ Offers related context without committing to an actual answer
Measures whether the response provides a clear answer rather than just related information. High-scoring text gives a definitive response that satisfies the question's intent. Low-scoring text provides tangentially related context, background, or "it depends" hedging without ever committing to an answer that resolves the question.
Everything included serves answering the question, no tangential padding ↔ Padded with background, caveats, or related information not asked for
Measures the proportion of the response that's relevant to the question versus tangential content. High-scoring text stays focused — everything included serves answering the question. Low-scoring text pads with background the user didn't ask for, unsolicited caveats, related-but-unrequested information, or general context that doesn't advance the answer.
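As a rough illustration of the "proportion" idea above, the sketch below computes the share of sentences in a response that touch a set of question keywords. The function names `split_sentences` and `focus_ratio`, the naive sentence splitter, and the keyword-overlap test are all assumptions made for this example. The model learns this signal from data and is not described as working this way.

```python
import re


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def focus_ratio(response: str, keywords: set[str]) -> float:
    # Fraction of sentences sharing at least one word with the question keywords.
    sentences = split_sentences(response)
    if not sentences:
        return 0.0
    relevant = 0
    for sentence in sentences:
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if words & keywords:
            relevant += 1
    return relevant / len(sentences)


# Hypothetical question: "How do I restart nginx?"
response = (
    "Restart the service with systemctl restart nginx. "
    "Nginx was first released in 2004. "
    "It is written in C."
)
ratio = focus_ratio(response, {"restart", "systemctl"})
print(f"focus ratio: {ratio:.2f}")
```

Only the first of the three sentences advances the answer, so the toy ratio is one third; the remaining two are the kind of unrequested background this trait penalizes.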
Response shape matches question intent — steps for how, comparison for which ↔ Wrong response shape — essay for yes/no, definition for procedural question
Measures whether the response matches the question's intent (informational, procedural, comparative, yes/no) with the appropriate response type. High-scoring text recognizes a "how do I" question needs steps, a "which is better" needs comparison, a "what is" needs definition. Low-scoring text gives the wrong shape of answer — e.g., an essay when a yes/no was asked, or a definition when a procedure was needed.
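The pairing of question intent and response shape described above can be sketched as a small lookup. The prefix rules in `guess_intent` and the `EXPECTED_SHAPE` table are hypothetical simplifications for illustration only; they are not how the model decides intent.

```python
# Expected response shape per intent type, taken from the examples in the text.
EXPECTED_SHAPE = {
    "procedural": "steps",
    "comparative": "comparison",
    "informational": "definition",
    "yes/no": "direct yes or no",
}


def guess_intent(question: str) -> str:
    # Crude prefix heuristic (assumption); real questions need far more care.
    q = question.strip().lower()
    if q.startswith(("how do i", "how to", "how can i")):
        return "procedural"
    if q.startswith("which"):
        return "comparative"
    if q.startswith(("what is", "what are")):
        return "informational"
    if q.startswith(("is ", "are ", "does ", "do ", "can ", "should ")):
        return "yes/no"
    return "informational"


for question in [
    "How do I rotate an API key?",
    "Which is better, A or B?",
    "Is TLS enabled by default?",
]:
    intent = guess_intent(question)
    print(f"{question} -> {intent} -> {EXPECTED_SHAPE[intent]}")
```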
The RAGAS answer_relevancy metric uses synthetic question generation and cosine similarity — clever but indirect. This model learns the signal directly: does the response align with the question's topic, cover all its parts, provide a committed answer, stay focused, and match the question's intent shape? An answer that's faithful to context but talks about the wrong thing still fails the user. This model catches that.
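For contrast, the indirect approach mentioned above can be sketched as: generate questions from the answer, embed them, and average their cosine similarity to the original question's embedding. The vectors below are hand-written toy values standing in for real embeddings, and `mean_question_similarity` is a hypothetical name. This is a minimal sketch of the idea, not RAGAS's actual code and not this model's method.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_question_similarity(
    question_vec: list[float], generated_vecs: list[list[float]]
) -> float:
    # Average similarity between the original question and each question
    # that was synthetically generated from the answer.
    if not generated_vecs:
        return 0.0
    total = sum(cosine(question_vec, g) for g in generated_vecs)
    return total / len(generated_vecs)


question = [1.0, 0.0]                    # toy embedding of the original question
on_topic = [[1.0, 0.0], [2.0, 0.0]]      # generated questions pointing the same way
off_topic = [[0.0, 1.0], [0.0, 3.0]]     # generated questions orthogonal to it

on_score = mean_question_similarity(question, on_topic)
off_score = mean_question_similarity(question, off_topic)
print(on_score, off_score)
```

The score depends on how well the generated questions and the embedding space reflect relevance, which is the indirection the learned model is meant to avoid.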
Five traits capture what "relevant" means in practice: topic alignment (same subject?), completeness of address (all parts covered?), directness (actual answer vs. related context?), focus (proportion of relevant content?), and intent match (right response shape for the question type?).
This model doesn't assess factual correctness: a relevant but wrong answer scores well here. Questions with ambiguous intent may be legitimately addressable in multiple ways, so a low score on such questions is not always a defect. Exploratory or clarifying responses ("did you mean X or Y?") may score lower on directness despite being appropriate. Best used alongside faithfulness and correctness checks.
u22a8.faithfulness: relevancy + faithfulness covers the "right topic, right facts" requirement
u22a8.conciseness: focus and conciseness share the "no tangential padding" signal
u22a8.rag-anchored: together they cover the full RAG quality triad
$ curl -s -d "your content here" \
    https://u22a8.ai/m/u22a8.answer-relevancy
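The same request can be prepared from Python with only the standard library. This sketch builds the POST without sending it; `build_request` is a hypothetical helper, and the Content-Type header mirrors what `curl -d` sends by default. The response format is not documented in this page, so no parsing is shown.

```python
import urllib.request

MODEL_URL = "https://u22a8.ai/m/u22a8.answer-relevancy"


def build_request(content: str) -> urllib.request.Request:
    # Build (but do not send) a POST whose body is the raw text,
    # matching the curl command's -d payload.
    return urllib.request.Request(
        MODEL_URL,
        data=content.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_request("your content here")
print(req.get_method(), req.full_url)

# To actually send it (requires network access):
#     with urllib.request.urlopen(req) as resp:
#         body = resp.read().decode("utf-8")
```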