Thinking Layer

The Thinking Layer exposes the LLM's chain-of-thought reasoning during pipeline execution. When enabled, you can see the raw internal reasoning that precedes each action or judgment.

How It Works

AgentLens sends think: true in the Ollama API payload. Models that support extended thinking return reasoning tokens in a separate thinking field alongside the main content response. AgentLens captures both and displays them as distinct layers in the Trace tab.
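As a minimal sketch of this flow, the snippet below builds an Ollama /api/chat request body with think: true and separates the thinking field from the main content of a (mocked) response. The helper names are illustrative, not AgentLens internals:

```python
def build_chat_payload(model, prompt, think=True):
    """Assemble an Ollama /api/chat request body; think=True asks
    thinking-capable models to return reasoning tokens separately."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
        "stream": False,
    }

def split_layers(response):
    """Pull the thinking and content fields out of a chat response,
    mirroring how the two are captured as distinct layers (sketch)."""
    message = response.get("message", {})
    return message.get("thinking", ""), message.get("content", "")

# Mocked non-streaming response body for illustration:
sample = {"message": {"role": "assistant",
                      "thinking": "The user asked for 2+2, so...",
                      "content": "4"}}
thinking, content = split_layers(sample)
print(thinking)  # → The user asked for 2+2, so...
print(content)   # → 4
```

In a real run the payload would be POSTed to the Ollama server and the two fields read from its JSON reply; the split itself is the same.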

Think Toggle

The settings bar provides a global think toggle and per-agent overrides:

  • LLM_THINK – global default (true by default)
  • LLM_THINK_AGENT – override for ReAct Agent
  • LLM_THINK_GRADER – override for Grader
  • LLM_THINK_JUDGE – override for Quality Judge
  • LLM_THINK_FALLBACK – override for Fallback

Per-agent toggles let you enable thinking for just the Judge (to understand its ACCEPT/RETRY reasoning) while keeping ReAct thinking off (to reduce latency). UI checkbox overrides take priority over environment variable settings.
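The precedence order described above (UI checkbox, then per-agent env var, then global default) can be sketched as a small resolver. The function name and the ui_overrides dict are assumptions for illustration, not AgentLens internals:

```python
import os

def effective_think(agent, ui_overrides=None):
    """Resolve the think flag for one agent: a UI checkbox override beats
    the per-agent env var (e.g. LLM_THINK_JUDGE), which beats the global
    LLM_THINK default of true. Illustrative sketch, not the real resolver."""
    ui_overrides = ui_overrides or {}
    if agent in ui_overrides:              # UI checkbox wins outright
        return ui_overrides[agent]
    per_agent = os.environ.get(f"LLM_THINK_{agent}")
    if per_agent is not None:              # per-agent env var override
        return per_agent.lower() == "true"
    return os.environ.get("LLM_THINK", "true").lower() == "true"

# Judge keeps thinking on, ReAct agent turns it off for latency:
os.environ["LLM_THINK_AGENT"] = "false"
print(effective_think("JUDGE"))   # global default applies
print(effective_think("AGENT"))   # per-agent env var applies
print(effective_think("AGENT", ui_overrides={"AGENT": True}))  # UI wins
```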

Trace Tab Integration

When thinking is enabled, the Trace tab's 6-layer observation model includes a dedicated Thinking layer between Prompt and Raw Response. The layer header shows the character count (e.g. "Thinking (1234 chars)") and the full reasoning text is displayed in a collapsible section.
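A rough sketch of that layer ordering and header format, assuming a hypothetical trace_layers helper (only the layers relevant here are shown):

```python
def trace_layers(prompt, thinking, raw_response):
    """Order the observation layers as the Trace tab does, inserting a
    Thinking layer with its character count between Prompt and Raw
    Response only when thinking text is present (illustrative sketch)."""
    layers = [("Prompt", prompt)]
    if thinking:
        layers.append((f"Thinking ({len(thinking)} chars)", thinking))
    layers.append(("Raw Response", raw_response))
    return layers

for header, _ in trace_layers("Summarize the doc", "x" * 1234, "{...}"):
    print(header)
# → Prompt
# → Thinking (1234 chars)
# → Raw Response
```

When thinking is disabled (or the model returns none), the layer is simply omitted rather than shown empty.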

Performance Impact

Enabling thinking increases token usage and latency, since the model generates additional reasoning tokens before its answer. For debugging and prompt engineering, that visibility is usually worth the overhead; for latency-sensitive or production queries, disabling thinking reduces cost and response time.