Grader

The Grader is a dedicated component that scores each retrieved chunk for relevance before the chunks reach the Quality Judge. It follows the CRAG (Corrective RAG) pattern: irrelevant material is filtered out so the Judge evaluates only context that carries some relevance to the question.

Default Model

The Grader runs on qwen3-next:80b (80B parameters, Alibaba) by default. The model supports both thinking and tool use, which suits the reasoning that nuanced relevance scoring requires. In the AgentLens Live tab, Grader events appear with yellow headers.

Scoring Scale

Each chunk receives an integer relevance score from 1 to 5:

5 – directly addresses the question
4 – contains relevant information
3 – contains some relevant details
2 – vaguely related
1 – no useful information

Chunks with a score of 1 (irrelevant) are removed from the context. All remaining chunks pass to the Quality Judge with their scores attached.
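
The scale and cutoff above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: GradedChunk and filter_graded are hypothetical names, and the range check is an added assumption. Only the 1 to 5 scale and the rule that score 1 is removed come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GradedChunk:
    """Hypothetical container for a chunk and its Grader score."""
    text: str
    score: int  # 1 (no useful information) .. 5 (directly addresses the question)


def filter_graded(
    chunks: List[GradedChunk],
) -> Tuple[List[GradedChunk], List[GradedChunk]]:
    """Split chunks into (kept, removed); only score 1 is removed."""
    kept, removed = [], []
    for chunk in chunks:
        # Assumption: out-of-range scores are treated as an error.
        if not 1 <= chunk.score <= 5:
            raise ValueError(f"score out of range: {chunk.score}")
        if chunk.score == 1:
            removed.append(chunk)
        else:
            kept.append(chunk)
    return kept, removed


graded = [
    GradedChunk("answers the question", 5),
    GradedChunk("vaguely related", 2),
    GradedChunk("off topic", 1),
]
kept, removed = filter_graded(graded)
print([c.score for c in kept])     # scores that pass to the Quality Judge
print([c.score for c in removed])  # scores that are dropped
```

Note that a chunk scored 2 (vaguely related) still passes; the kept chunks carry their scores so the Quality Judge can weigh them.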

Filtering and Rejection Set

Removed chunks are hashed (MD5 of the text content) and the hashes are added to a rejection set. On retry rounds, the orchestrator uses this set to exclude previously rejected chunks before re-grading, so discarded material that the Retrieval Agent re-surfaces does not re-enter the context.
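
A rough sketch of the rejection set, using Python's standard hashlib. The function names are hypothetical, and encoding the text as UTF-8 before hashing is an assumption; the description above only states that the hash is an MD5 of the text content.

```python
import hashlib
from typing import Iterable, List, Set


def chunk_hash(text: str) -> str:
    """MD5 hex digest of the chunk text (UTF-8 encoding is an assumption)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def add_rejected(rejection_set: Set[str], removed_texts: Iterable[str]) -> None:
    """Record the hashes of chunks the Grader removed."""
    for text in removed_texts:
        rejection_set.add(chunk_hash(text))


def exclude_rejected(texts: Iterable[str], rejection_set: Set[str]) -> List[str]:
    """Drop chunks whose hash was rejected in an earlier round."""
    return [t for t in texts if chunk_hash(t) not in rejection_set]


rejection_set: Set[str] = set()
add_rejected(rejection_set, ["off topic"])

# On a retry round, the same text is excluded before re-grading.
retry_chunks = exclude_rejected(["off topic", "new candidate"], rejection_set)
print(retry_chunks)
```

MD5 serves here as a compact deduplication key rather than a security measure. Because the hash covers the exact text, a chunk that differs by even one character would not match the rejection set.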

If all chunks are scored 1 (entirely irrelevant), the Grader emits a synthetic ACCEPT with confidence 0.1 so the pipeline can still produce a best-effort answer.
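
The fallback can be illustrated with a small check over the scores of one grading round. The function name, the dictionary shape, and the "synthetic" flag are hypothetical; only the ACCEPT verdict and the 0.1 confidence come from the description above. How an empty retrieval result is handled is not stated, so this sketch assumes it does not trigger the fallback.

```python
from typing import Any, Dict, List, Optional


def synthetic_verdict(scores: List[int]) -> Optional[Dict[str, Any]]:
    """Return a low-confidence ACCEPT when every chunk scored 1, else None."""
    # Assumption: an empty score list does not count as "all irrelevant".
    if scores and all(score == 1 for score in scores):
        return {"verdict": "ACCEPT", "confidence": 0.1, "synthetic": True}
    return None


print(synthetic_verdict([1, 1, 1]))  # fallback verdict
print(synthetic_verdict([1, 3, 1]))  # None: one chunk survives filtering
```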

Pipeline Position

The Grader runs after each Retrieval Agent round, before the Quality Judge. In a typical 2–3 call pipeline (one retrieval round + grader + judge accept), the Grader accounts for one LLM call. On retry rounds, it runs again on the new set of retrieved chunks.
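
The ordering described above can be sketched as a loop in which each round runs retrieval, then grading, then judging. Everything here is illustrative: run_pipeline, the injected callables, the "RETRY" verdict, and the max_rounds limit are assumptions, not the orchestrator's real interface.

```python
from typing import Callable, List


def run_pipeline(
    retrieve: Callable[[int], List[str]],
    grade: Callable[[List[str]], List[str]],
    judge: Callable[[List[str]], str],
    max_rounds: int = 3,  # assumed retry limit
) -> List[str]:
    """Run retrieve -> grade -> judge per round; return the stage trace."""
    trace: List[str] = []
    for round_no in range(1, max_rounds + 1):
        chunks = retrieve(round_no)
        trace.append("retrieve")
        graded = grade(chunks)   # the Grader runs once per retrieval round
        trace.append("grade")
        verdict = judge(graded)
        trace.append("judge")
        if verdict == "ACCEPT":
            break
    return trace


# Stub stages: the judge asks for one retry, then accepts.
verdicts = iter(["RETRY", "ACCEPT"])
trace = run_pipeline(
    retrieve=lambda round_no: [f"chunk from round {round_no}"],
    grade=lambda chunks: chunks,
    judge=lambda graded: next(verdicts),
)
print(trace)
```

With one retry, the trace contains two "grade" entries, matching the statement that the Grader runs again on each new set of retrieved chunks.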