
Prompt Assembly

Every agent in the pipeline receives a purpose-built prompt. The prompt is the specification layer: it defines what the agent knows, how it reasons, and what format it outputs. AgentLens defines four distinct prompts, one per agent role.

Per-Agent Prompt Design

ReAct: 5-section system prompt (role, tools, format, rules, thresholds)
Grader: 1-5 relevance scale, chunk-level scoring
Judge: ACCEPT/RETRY structured output, forced accept on the final round
Fallback: 4-layer token-budgeted assembly (System, Documents, History, Question)

ReAct Agent Prompt

The ReAct agent receives a 5-section system prompt built by build_agent_system_prompt():

Role + Tools: Defines the agent as a research assistant with access to vector_search, keyword_search, and document_lookup
Output Format: Enforces the Thought / Action / Final Answer structure so the ReAct loop can parse each step
Rules: Maximum of 5 tool calls per query, must cite sources, stop when confident
How Observations Work: Tool results return 150-character previews, so the agent knows to use document_lookup for full content
When to Stop: Embeds BM25 and vector quality thresholds so the agent can self-evaluate retrieval results
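A minimal sketch of how build_agent_system_prompt() might concatenate these five sections. The section wording below is illustrative placeholder text, not the actual prompt:

```python
# Illustrative sketch of build_agent_system_prompt(); the section
# bodies are placeholder wording, not the real prompt text.
SECTIONS = [
    ("Role + Tools",
     "You are a research assistant with access to vector_search, "
     "keyword_search, and document_lookup."),
    ("Output Format",
     "Respond in Thought / Action / Final Answer steps."),
    ("Rules",
     "Use at most 5 tool calls per query, cite sources, and stop "
     "when confident."),
    ("How Observations Work",
     "Tool results are 150-character previews; call document_lookup "
     "for full content."),
    ("When to Stop",
     "Stop searching once BM25 and vector scores pass the quality "
     "thresholds."),
]

def build_agent_system_prompt() -> str:
    """Join the five sections into a single system prompt."""
    return "\n\n".join(f"## {title}\n{body}" for title, body in SECTIONS)
```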

The prompt uses a compact 3-message format (system, user context, user query) optimized for 3B-class models. On RETRY rounds, the Judge's feedback is injected into the user message so the agent can adjust its search strategy.

Grader Prompt

The Grader scores each retrieved chunk on a 1-5 relevance scale:

1: Completely irrelevant; no connection to the query
2: Marginally related; mentions the topic but contains no useful information
3: Somewhat relevant; contains partial information
4: Relevant; contains information that helps answer the query
5: Highly relevant; directly answers the query

The key design decision: the Grader scores whether a chunk contains relevant information, not whether it is about the topic. This distinction matters for chunks that mention a concept in passing versus chunks that explain it. Output format is bare "N: S" (chunk number: score), one per line, for reliable parsing.
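A parser for the bare "N: S" format might look like this sketch. The function name and the default score of 1 for missing or malformed lines are assumptions:

```python
import re

def parse_grader_output(text: str, num_chunks: int) -> list[int]:
    """Parse bare 'chunk number: score' lines into a score list.

    Chunks the Grader skipped or garbled default to 1 (assumption:
    treat unscored chunks as irrelevant rather than failing).
    """
    scores: dict[int, int] = {}
    for line in text.strip().splitlines():
        m = re.match(r"\s*(\d+)\s*:\s*([1-5])\s*$", line)
        if m:
            scores[int(m.group(1))] = int(m.group(2))
    return [scores.get(i, 1) for i in range(1, num_chunks + 1)]
```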

Judge Prompt

The Quality Judge outputs a structured format with 5 fields:

VERDICT: ACCEPT or RETRY
CONFIDENCE: 0.0 to 1.0 score
ANSWER: The final answer text (may be empty on RETRY)
ASSESSMENT: Reasoning about answer quality
FEEDBACK: Instructions for the ReAct agent on what to search next (on RETRY)

On the final round, the prompt includes a forced-accept instruction: the Judge must return ACCEPT regardless of quality, so the pipeline always produces an answer. If the Judge's output is unparseable, the system fails open, defaulting to ACCEPT with confidence 0.5.
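The fail-open and forced-accept behavior can be sketched as below. This is a simplified single-line-per-field parser (it would not handle a multi-line ANSWER), and the function name and return shape are assumptions:

```python
def parse_judge_output(text: str, final_round: bool = False) -> dict:
    """Parse the Judge's 5-field output, failing open on garbage."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().isupper():
            fields[key.strip()] = value.strip()

    verdict = fields.get("VERDICT", "").upper()
    if verdict not in ("ACCEPT", "RETRY"):
        # Fail open: unparseable output defaults to ACCEPT at 0.5.
        return {"verdict": "ACCEPT", "confidence": 0.5,
                "answer": fields.get("ANSWER", ""),
                "assessment": "", "feedback": ""}

    if final_round:
        verdict = "ACCEPT"  # forced accept: always produce an answer

    try:
        confidence = min(1.0, max(0.0, float(fields.get("CONFIDENCE", "0.5"))))
    except ValueError:
        confidence = 0.5

    return {"verdict": verdict, "confidence": confidence,
            "answer": fields.get("ANSWER", ""),
            "assessment": fields.get("ASSESSMENT", ""),
            "feedback": fields.get("FEEDBACK", "")}
```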

4-Layer Token-Budgeted Assembly

The Fallback and Direct LLM paths use the PromptAssembler, which builds prompts from four ordered layers:

System instructions: Base system prompt, either the with-docs or no-docs variant depending on whether chunks are available
Retrieved documents: Chunked context from the Retrieval Agent, ordered by relevance score
Conversation history: Prior turns, added newest to oldest until the token budget is exhausted
Current question: The user's query, always included in full

Token budgeting (3000 tokens via tiktoken cl100k_base) ensures the assembled prompt fits within the model's context window. If retrieved documents exceed the budget, lower-scored chunks are dropped first. The assembler returns metadata: per-layer token counts, included/dropped chunks, and history turns included.
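The budgeting logic can be sketched as follows. The function names, separator labels, and metadata keys are assumptions; the sketch falls back to a rough chars/4 estimate if tiktoken is not installed:

```python
def count_tokens(text: str) -> int:
    """cl100k_base token count via tiktoken, with a crude fallback."""
    try:
        import tiktoken
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        return max(1, len(text) // 4)  # rough ~4 chars/token estimate

def assemble_prompt(system: str, chunks: list[tuple[float, str]],
                    history: list[str], question: str,
                    budget: int = 3000) -> tuple[str, dict]:
    """Assemble four ordered layers under a token budget.

    chunks: (relevance_score, text) pairs; lowest-scored dropped first.
    history: prior turns, newest first.
    """
    used = count_tokens(system) + count_tokens(question)
    included, dropped = [], []
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:
            included.append(text)
            used += cost
        else:
            dropped.append(text)

    turns = []
    for turn in history:  # newest to oldest until the budget is exhausted
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        turns.append(turn)
        used += cost

    prompt = "\n\n".join(
        ["[System]\n" + system]
        + (["[Context]\n" + "\n".join(included)] if included else [])
        + (["[History]\n" + "\n".join(reversed(turns))] if turns else [])
        + ["[Question]\n" + question]
    )
    meta = {"tokens_used": used, "chunks_included": len(included),
            "chunks_dropped": len(dropped), "history_turns": len(turns)}
    return prompt, meta
```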

Trace Tab Visibility

Every agent's assembled prompt is visible in the Trace tab's Prompt layer. For ReAct, Grader, and Judge stages, this shows the system prompt and injected context. For Fallback stages, the [System] / [Instructions] / [Context] separators show exactly how the 4-layer assembly was constructed and what each layer contributed.