Agentic Smart Router

This NVIDIA NeMo Agent Toolkit (NAT) application introduces the first integration of the NVIDIA LLM Router within a multi-framework, agent-oriented architecture. The supervisory agent and routing control plane are implemented using LangChain, while the retrieval-augmented generation (RAG) subsystem is built on LlamaIndex. Together, these components form an end-to-end agent workflow: it accepts a user prompt, retrieves relevant context, and uses the router to select and invoke the most appropriate model for the request.

Core components

Retrieve Tool: This component is backed by a comprehensive knowledge base specific to the workload. It enriches the agent’s reasoning by retrieving relevant contextual information to support accurate and grounded responses.
LLM Router tool (NVIDIA LLM Blueprint): The routing layer follows NVIDIA’s LLM Blueprint design and includes a Router Server and a Router Controller. The router maps each request to the most suitable model. At its core is a classifier model that assigns the incoming prompt to one of two categories:
- General conversational (chit-chat) queries (model used: llama-3.3-70b-instruct).
- More complex tasks that require deeper reasoning, brainstorming, or code generation (model used: nvidia/llama-3.3-nemotron-super-49b-v1).
Observability and Monitoring: The observability layer is implemented using the open-source tool Arize-Phoenix. This component provides detailed visibility into the agent’s execution path, including action traces, decision flows, and end-to-end latency metrics, enabling effective debugging, performance analysis, and optimization.
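
The routing decision described for the LLM Router tool can be illustrated with a minimal Python sketch. It is not the blueprint's implementation: the real router uses a trained classifier model behind the Router Server and Router Controller, whereas the keyword heuristic, the function names (classify_prompt, route), the RoutingDecision type, and the exact model identifier strings below are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Model identifiers based on the names listed for the router; the exact
# strings expected by a given deployment are an assumption.
CHAT_MODEL = "llama-3.3-70b-instruct"
REASONING_MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1"

# Hypothetical keyword list standing in for the router's classifier model.
REASONING_HINTS = ("code", "implement", "debug", "brainstorm", "step by step")


@dataclass(frozen=True)
class RoutingDecision:
    category: str  # "chit_chat" or "complex"
    model: str     # model identifier the request would be sent to


def classify_prompt(prompt: str) -> str:
    """Toy stand-in for the classifier: flag prompts that contain a reasoning hint."""
    text = prompt.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return "complex"
    return "chit_chat"


def route(prompt: str) -> RoutingDecision:
    """Map a prompt to the model assigned to its category."""
    category = classify_prompt(prompt)
    model = REASONING_MODEL if category == "complex" else CHAT_MODEL
    return RoutingDecision(category=category, model=model)


if __name__ == "__main__":
    for prompt in ("Hi, how is your day going?",
                   "Write code to merge two sorted lists."):
        print(prompt, "->", route(prompt))
```

In the actual application the supervisory agent would call the router as a tool and forward the prompt, together with any context returned by the Retrieve Tool, to the selected model; the sketch only shows the category-to-model mapping.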

Features

Core components