{"id":2579,"date":"2026-06-16T01:59:20","date_gmt":"2026-06-16T01:59:20","guid":{"rendered":"https:\/\/ai-box.eu\/?p=2579"},"modified":"2026-06-16T02:06:12","modified_gmt":"2026-06-16T02:06:12","slug":"nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally","status":"publish","type":"post","link":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/","title":{"rendered":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally"},"content":{"rendered":"<p>The two halves of a voice agent are in place: with Parakeet (<a href=\"https:\/\/ai-box.eu\/en\/news\/nvidia-nim-locally-running-german-speech-recognition-as-a-microservice\/2556\/\" target=\"_blank\" rel=\"noopener\">Part 2<\/a>) and Canary (<a href=\"https:\/\/ai-box.eu\/en\/news\/nvidia-canary-locally-multilingual-speech-recognition-and-translation-as-a-nim\/2562\/\" target=\"_blank\" rel=\"noopener\">Part 3<\/a>) the agent listens, with Magpie (<a href=\"https:\/\/ai-box.eu\/en\/news\/nvidia-magpie-tts-locally-german-speech-output-as-a-microservice\/2570\/\" target=\"_blank\" rel=\"noopener\">Part 4<\/a>) it answers. What&#8217;s still missing is the <strong>brain<\/strong>: the layer that turns recognized text into a decision and triggers the matching answer or action. That&#8217;s exactly what I take on in this part with the <strong>NVIDIA NeMo Agent Toolkit (NAT)<\/strong>. 
With NAT I build the orchestrator that later sits between ASR and TTS in the voice loop.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"What_is_the_NeMo_Agent_Toolkit_%E2%80%93_and_is_it_part_of_NIM\"><\/span>What is the NeMo Agent Toolkit \u2013 and is it part of NIM?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>A quick note up front, because I had exactly this question: <strong>NAT is not a NIM.<\/strong> The two things live on different levels.<\/p>\n<p>A <strong>NIM<\/strong> is a delivery format for NVIDIA <em>models<\/em> \u2013 a Docker container that serves a model (ASR, TTS or an LLM) behind a standardized API. The <strong>NeMo Agent Toolkit<\/strong>, by contrast, is a <em>framework<\/em>, i.e. a Python library that you install separately (<code>pip install nvidia-nat<\/code>) and drive via the CLI <code>nat<\/code> with YAML workflows. NAT itself needs no GPU; it only <strong>calls<\/strong> models.<\/p>\n<p>So the relationship is: NAT is the conductor, the NIMs (and your Ollama LLM) are the orchestra. Which LLM NAT uses for reasoning you enter in the YAML via, for example, the base URL and the interface type. The LLM can be an LLM NIM or, as in my case, my existing <strong>Ollama server<\/strong>. 
So NAT doesn&#8217;t sit &#8220;inside the NIM&#8221; but one level above it.<\/p>\n<p><strong>Note:<\/strong> I&#8217;ve already covered the NAT fundamentals in dedicated posts. There&#8217;s one on the setup with Ollama in <a href=\"https:\/\/ai-box.eu\/en\/large-language-models-en\/nemo-agent-toolkit-auf-der-rtx-a6000-ada-vom-inferenz-layer-zum-orchestrator-layer\/2277\/\" target=\"_blank\" rel=\"noopener\">&#8220;From inference layer to orchestrator layer&#8221;<\/a>, one on the <a href=\"https:\/\/ai-box.eu\/en\/large-language-models-en\/nemo-agent-toolkit-genai-agent-orchestration-run-locally\/2288\/\" target=\"_blank\" rel=\"noopener\">ReAct loop and a custom Python tool<\/a>, and one on the <a href=\"https:\/\/ai-box.eu\/en\/news\/nemo-agent-toolkit-multi-agent-supervisor-pattern-local\/2309\/\" target=\"_blank\" rel=\"noopener\">multi-agent supervisor pattern<\/a>. In this post I deliberately do <em>not<\/em> repeat all of that, but build the lean part I need for the voice loop.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The_goal_of_this_post\"><\/span>The goal of this post<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We build a minimal, local agent workflow following the pattern <strong>text in -&gt; decision\/LLM -&gt; text out<\/strong>. This exact interface is later fed by the ASR in Part 6 (it supplies the text) and read out by the TTS (it speaks the answer). At the end we expose the workflow as a small <strong>API service<\/strong> so that Pipecat can talk to it in the next part. As the reasoning backend I use my existing Ollama server.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Requirements\"><\/span>Requirements<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li>A running local LLM backend. 
For me that&#8217;s my Ollama server (on a separate machine); any OpenAI-compatible backend works.<\/li>\n<li>Python 3.11 and a clean venv, just like in the other parts.<\/li>\n<li>Network access from the NAT machine to the LLM backend (IP\/port of the Ollama server).<\/li>\n<li>The speech NIMs from Parts 2\u20134 don&#8217;t need to be running for <em>this<\/em> part \u2013 we only wire them in in Part 6.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Step_1_Install_NAT\"><\/span>Step 1: Install NAT<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>NAT goes into its own venv, separate from the <code>riva-client<\/code> environment. Create and activate it:<\/p>\n<p><strong>Command:<\/strong> <code>python3.11 -m venv ~\/venvs\/nat<\/code><\/p>\n<p><strong>Command:<\/strong> <code>source ~\/venvs\/nat\/bin\/activate<\/code><\/p>\n<p><strong>Command:<\/strong> <code>pip install --upgrade pip setuptools wheel<\/code><\/p>\n<p>Then install the toolkit. The meta package <code>nvidia-nat<\/code> brings the core; framework integrations come as extras. For the ReAct agent we need the LangChain integration:<\/p>\n<p><strong>Command:<\/strong> <code>pip install \"nvidia-nat[langchain]\"<\/code><\/p>\n<p>Then check that the CLI is available. The following command prints the installed version:<\/p>\n<p><strong>Command:<\/strong> <code>nat --version<\/code><\/p>\n<p><strong>Placeholder:<\/strong> The exact version number (currently 1.7.x), and whether an extra besides <code>[langchain]<\/code> is needed, will be filled in after installation.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Step_2_Register_the_local_LLM_backend\"><\/span>Step 2: Register the local LLM backend<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>NAT workflows are YAML files. NAT doesn&#8217;t prescribe a fixed storage location, since the commands <code>nat run<\/code>, <code>nat serve<\/code> and <code>nat validate<\/code> accept any path via <code>--config_file<\/code>. 
It&#8217;s therefore worth creating your own <strong>project folder<\/strong> in which you collect all workflow YAMLs. For pure configuration workflows (only YAML + built-in tools) a simple folder is enough; a <code>configs\/<\/code> substructure is tidy and matches the convention NAT also uses in its examples:<\/p>\n<p><strong>Command:<\/strong> <code>mkdir -p ~\/nat-voice-agent\/configs<\/code><\/p>\n<p>The <code>llms:<\/code> block in the workflow file tells NAT which model it uses for its work, i.e. for inference. For an OpenAI-compatible backend like Ollama the type is <code>openai<\/code>; you simply point <code>base_url<\/code> at the Ollama server. Now create the file <code>voice_agent.yml<\/code> with the following command:<\/p>\n<p><strong>Command:<\/strong> <code>nano ~\/nat-voice-agent\/configs\/voice_agent.yml<\/code><\/p>\n<p>And insert the following content:<\/p>\n<pre><code class=\"language-yaml\">llms:\r\n  local_llm:\r\n    _type: openai\r\n    api_key: \"ollama\"        # ignored by Ollama, but a required field\r\n    base_url: \"http:\/\/&lt;OLLAMA-SERVER-IP&gt;:11434\/v1\"\r\n    model_name: \"&lt;your-ollama-model&gt;\"   # e.g. 
qwen3:8b or llama3.1:8b\r\n\r\nfunctions:\r\n  current_datetime:\r\n    _type: current_datetime\r\n\r\nworkflow:\r\n  _type: react_agent\r\n  tool_names: [current_datetime]\r\n  llm_name: local_llm\r\n  verbose: true<\/code><\/pre>\n<p><strong>Placeholders \u2013 fill in as you go:<\/strong><\/p>\n<ul>\n<li><code>base_url<\/code> \u2013 the IP\/port of your Ollama server (for local Ollama <code>http:\/\/localhost:11434\/v1<\/code>).<\/li>\n<li><code>model_name<\/code> \u2013 the exact name of the model you have loaded in Ollama (check with <code>ollama list<\/code>).<\/li>\n<\/ul>\n<p>Save the file with Ctrl + O and Enter, then exit nano with Ctrl + X.<\/p>\n<p>Validate the fresh configuration before you start it, as this catches YAML and schema errors early:<\/p>\n<p><strong>Command:<\/strong> <code>nat validate --config_file ~\/nat-voice-agent\/configs\/voice_agent.yml<\/code><\/p>\n<p>After running the command, the terminal showed the following:<\/p>\n<blockquote><p><code>Validating configuration file: \/home\/ingmar\/nat-voice-agent\/configs\/voice_agent.yml<\/code><br \/>\n<code>\u2713 Configuration file is valid!<\/code><br \/>\n<code>(nat) ingmar@A6000Ada:~$<\/code><\/p><\/blockquote>\n<p><strong>Note:<\/strong> As long as we only use built-in tools, this folder is enough. As soon as you write your own Python tools in Step 4, a full NAT project is worth it. You create it with <code>nat workflow create --workflow-dir ~\/nat-voice-agent &lt;name&gt;<\/code>. This generates the complete structure including <code>configs\/config.yml<\/code> and <code>pyproject.toml<\/code>, and your tool is installed as a Python package right away.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Step_3_The_first_run_%E2%80%93_text_in_text_out\"><\/span>Step 3: The first run \u2013 text in, text out<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Now we send a single input through the workflow. 
<code>nat run<\/code> is the simplest way and ideal for debugging:<\/p>\n<p><strong>Command:<\/strong> <code>nat run --config_file ~\/nat-voice-agent\/configs\/voice_agent.yml --input \"What time is it right now?\"<\/code><\/p>\n<p>The agent runs through its ReAct loop, recognizes that it needs the <code>current_datetime<\/code> tool, calls it and formulates an answer in natural language. This &#8220;text in -&gt; text out&#8221; contract is exactly the interface that later connects ASR and TTS.<\/p>\n<p>The following image shows the text output produced via the <code class=\"language-yaml\">[current_datetime]<\/code> tool.<\/p>\n<div id=\"attachment_2575\" style=\"width: 1034px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-1024x600.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-2575\" class=\"size-large wp-image-2575\" src=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-1024x600.jpg\" alt=\"NVIDIA nim date time output\" width=\"1024\" height=\"600\" srcset=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-1024x600.jpg 1024w, https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-300x176.jpg 300w, https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-768x450.jpg 768w, https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08-1080x633.jpg 1080w, https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg 1477w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><p id=\"caption-attachment-2575\" class=\"wp-caption-text\">NVIDIA nim date time output<\/p><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Step_4_Dock_your_own_tools\"><\/span>Step 4: Dock your own tools<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>An agent only gets interesting once it can trigger 
<em>its own<\/em> actions that actually help you with something. The time announcement is nice, but it&#8217;s mostly a demonstration of how NAT and ReAct templates work. I showed step by step how to write your own Python tool, register it as a package and reference it in the workflow in the <a href=\"https:\/\/ai-box.eu\/en\/large-language-models-en\/nemo-agent-toolkit-genai-agent-orchestration-run-locally\/2288\/\" target=\"_blank\" rel=\"noopener\">orchestration post<\/a> (there using a GPU-status tool as the example).<\/p>\n<p>For the voice agent the principle is enough for now: every registered tool appears under <code>functions:<\/code> and is enabled in <code>workflow.tool_names<\/code>. That&#8217;s how the agent grows from &#8220;just talking&#8221; to &#8220;doing something&#8221;.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Step_5_Expose_the_agent_as_a_service\"><\/span>Step 5: Expose the agent as a service<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>This is the crucial bridging step to Part 6. Instead of sending each input individually via <code>nat run<\/code>, we start the workflow as a <strong>web server<\/strong>. Pipecat will later send the ASR text via HTTP POST and get the agent&#8217;s answer back, which goes straight to the TTS.<\/p>\n<p><strong>Command:<\/strong> <code>nat serve --config_file ~\/nat-voice-agent\/configs\/voice_agent.yml<\/code><\/p>\n<p>NAT starts a local HTTP server with a Swagger\/OpenAPI interface, through which you can inspect the exact routes and the request\/response schema and test the endpoint directly in the browser.<\/p>\n<p><strong>Placeholder \u2013 verify after start:<\/strong> the exact host\/port from the startup output, the POST route and the URL of the Swagger docs. 
We need these values in Part 6 for the Pipecat integration.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Step_6_Routing_across_multiple_tools_a_short_outlook\"><\/span>Step 6: Routing across multiple tools (a short outlook)<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>As soon as several tools are registered, the orchestrator decides based on the intent which tool it calls. That&#8217;s the classic router\/supervisor pattern. I showed how this looks with several agents acting as &#8220;tools&#8221; of a higher-level supervisor in the <a href=\"https:\/\/ai-box.eu\/en\/news\/nemo-agent-toolkit-multi-agent-supervisor-pattern-local\/2309\/\" target=\"_blank\" rel=\"noopener\">supervisor-pattern post<\/a>. For the voice loop the simple ReAct agent from Steps 2\u20133 is enough for now. The routing can be expanded as much as you like later.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Tips_guardrails_and_deny-by-default\"><\/span>Tips: guardrails and deny-by-default<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li><strong>Validated tool calls:<\/strong> Prefer clearly bounded, schema-checked tools over one tool that can do &#8220;everything&#8221;.<\/li>\n<li><strong>Deny-by-default:<\/strong> List only explicitly enabled tools in the <code>tool_names<\/code> block. When in doubt, have the agent ask rather than guess. For an agent that triggers actions, that&#8217;s mandatory.<\/li>\n<li><strong>Observability:<\/strong> With <code>verbose: true<\/code> or NAT&#8217;s tracing\/observability features you can trace which decision the agent made and why. 
That&#8217;s important for debugging the later pipeline and also for documenting the steps that were run.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>With that, the <strong>brain<\/strong> is in place: a local NAT workflow that takes in text, thinks with my Ollama LLM, calls tools when needed and is available as an API service. Together with ASR (Parakeet, Canary) and TTS (Magpie), all the building blocks are now available locally.<\/p>\n<p>In the next and final part I connect them with <strong>NVIDIA Pipecat<\/strong> into a continuous, <strong>interruptible<\/strong> voice loop:<\/p>\n<p>speak -&gt; understand -&gt; decide -&gt; answer -&gt; read aloud.<\/p>\n<p>After that I&#8217;ll optionally put a <strong>wake word<\/strong> in front as a &#8220;doorman&#8221;, the way we know it from everyday voice assistants.<\/p>\n<p>If you already use NAT: drop me a comment about which local model gives you the most reliable tool calls.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The two halves of a voice agent are in place: with Parakeet (Part 2) and Canary (Part 3) the agent listens, with Magpie (Part 4) it answers. What&#8217;s still missing is the brain: the layer that turns recognized text into a decision and triggers the matching answer or action. 
That&#8217;s exactly what I take on [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2576,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[162,8,50],"tags":[1676,1679,1030,1031,789,1418,1224,1680,1681,1220,1675,1678,1617,306,1231,1677,1222,315,1032,1045,1027,1624,1682],"class_list":["post-2579","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-large-language-models-en","category-news","category-top-story-en","tag-agent-orchestrator","tag-agent-framework","tag-ai-agent","tag-local-ai","tag-local-llm","tag-lokale-ki","tag-nat","tag-nat-run","tag-nat-serve","tag-nemo-agent-toolkit","tag-nemo-agent-toolkit-locally","tag-nvidia-nemo-agent-toolkit","tag-nvidia-nim","tag-ollama-en","tag-openai-kompatibel","tag-openai-compatible","tag-react-agent","tag-rtx-a6000-en","tag-sovereign-ai","tag-tool-calling","tag-voice-agent","tag-yaml-workflow","et-has-post-format-content","et_post_format-et-post-format-standard"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally - Exploring the Future: Inside the AI Box<\/title>\n<meta name=\"description\" content=\"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech output.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" 
content=\"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally - Exploring the Future: Inside the AI Box\" \/>\n<meta property=\"og:description\" content=\"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech output.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/\" \/>\n<meta property=\"og:site_name\" content=\"Exploring the Future: Inside the AI Box\" \/>\n<meta property=\"article:published_time\" content=\"2026-06-16T01:59:20+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-06-16T02:06:12+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1477\" \/>\n\t<meta property=\"og:image:height\" content=\"866\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Maker\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Ingmar_Stapel\" \/>\n<meta name=\"twitter:site\" content=\"@Ingmar_Stapel\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Maker\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/\"},\"author\":{\"name\":\"Maker\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\"},\"headline\":\"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally\",\"datePublished\":\"2026-06-16T01:59:20+00:00\",\"dateModified\":\"2026-06-16T02:06:12+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/\"},\"wordCount\":1392,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/NVIDIA_nim_date_time_output_08.jpg\",\"keywords\":[\"agent orchestrator\",\"Agent-Framework\",\"AI agent\",\"local AI\",\"Local LLM\",\"lokale KI\",\"NAT\",\"nat run\",\"nat serve\",\"NeMo Agent Toolkit\",\"NeMo Agent Toolkit locally\",\"NVIDIA NeMo Agent Toolkit\",\"NVIDIA NIM\",\"Ollama\",\"OpenAI kompatibel\",\"OpenAI-compatible\",\"ReAct Agent\",\"RTX A6000\",\"sovereign AI\",\"Tool-Calling\",\"Tool-Calling\",\"Voice Agent\",\"YAML-Workflow\"],\"articleSection\":[\"Large Language Models\",\"News\",\"Top 
story\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/\",\"name\":\"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally - Exploring the Future: Inside the AI Box\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/NVIDIA_nim_date_time_output_08.jpg\",\"datePublished\":\"2026-06-16T01:59:20+00:00\",\"dateModified\":\"2026-06-16T02:06:12+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\"},\"description\":\"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech 
output.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#primaryimage\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/NVIDIA_nim_date_time_output_08.jpg\",\"contentUrl\":\"https:\\\/\\\/ai-box.eu\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/NVIDIA_nim_date_time_output_08.jpg\",\"width\":1477,\"height\":866,\"caption\":\"NVIDIA nim date time output\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/news\\\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\\\/2579\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Start\",\"item\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#website\",\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/\",\"name\":\"Exploring the Future: Inside the AI Box\",\"description\":\"Inside the AI Box, we share our experiences and discoveries in the world of artificial 
intelligence.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/#\\\/schema\\\/person\\\/cc91d08618b3feeef6926591b465eab1\",\"name\":\"Maker\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g\",\"caption\":\"Maker\"},\"description\":\"I live in Bavaria near Munich. In my head I always have many topics and try out especially in the field of Internet new media much in my spare time. I write on the blog because it makes me fun to report about the things that inspire me. I am happy about every comment, about suggestion and very about questions.\",\"sameAs\":[\"https:\\\/\\\/ai-box.eu\"],\"url\":\"https:\\\/\\\/ai-box.eu\\\/en\\\/author\\\/ingmars\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally - Exploring the Future: Inside the AI Box","description":"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech output.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/","og_locale":"en_US","og_type":"article","og_title":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally - Exploring the Future: Inside the AI Box","og_description":"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech output.","og_url":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/","og_site_name":"Exploring the Future: Inside the AI Box","article_published_time":"2026-06-16T01:59:20+00:00","article_modified_time":"2026-06-16T02:06:12+00:00","og_image":[{"width":1477,"height":866,"url":"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg","type":"image\/jpeg"}],"author":"Maker","twitter_card":"summary_large_image","twitter_creator":"@Ingmar_Stapel","twitter_site":"@Ingmar_Stapel","twitter_misc":{"Written by":"Maker","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#article","isPartOf":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/"},"author":{"name":"Maker","@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1"},"headline":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator locally","datePublished":"2026-06-16T01:59:20+00:00","dateModified":"2026-06-16T02:06:12+00:00","mainEntityOfPage":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/"},"wordCount":1392,"commentCount":0,"image":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#primaryimage"},"thumbnailUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg","keywords":["agent orchestrator","Agent-Framework","AI agent","local AI","Local LLM","lokale KI","NAT","nat run","nat serve","NeMo Agent Toolkit","NeMo Agent Toolkit locally","NVIDIA NeMo Agent Toolkit","NVIDIA NIM","Ollama","OpenAI kompatibel","OpenAI-compatible","ReAct Agent","RTX A6000","sovereign AI","Tool-Calling","Tool-Calling","Voice Agent","YAML-Workflow"],"articleSection":["Large Language Models","News","Top story"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/","url":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/","name":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent 
orchestrator locally - Exploring the Future: Inside the AI Box","isPartOf":{"@id":"https:\/\/ai-box.eu\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#primaryimage"},"image":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#primaryimage"},"thumbnailUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg","datePublished":"2026-06-16T01:59:20+00:00","dateModified":"2026-06-16T02:06:12+00:00","author":{"@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1"},"description":"Set up the NVIDIA NeMo Agent Toolkit locally: an agent orchestrator with a local LLM (Ollama) that decides between speech recognition and speech output.","breadcrumb":{"@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#primaryimage","url":"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg","contentUrl":"https:\/\/ai-box.eu\/wp-content\/uploads\/2026\/06\/NVIDIA_nim_date_time_output_08.jpg","width":1477,"height":866,"caption":"NVIDIA nim date time output"},{"@type":"BreadcrumbList","@id":"https:\/\/ai-box.eu\/en\/news\/nvidia-nemo-agent-toolkit-nat-set-up-the-agent-orchestrator-locally\/2579\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Start","item":"https:\/\/ai-box.eu\/en\/"},{"@type":"ListItem","position":2,"name":"NVIDIA NeMo Agent Toolkit (NAT): set up the agent orchestrator 
locally"}]},{"@type":"WebSite","@id":"https:\/\/ai-box.eu\/en\/#website","url":"https:\/\/ai-box.eu\/en\/","name":"Exploring the Future: Inside the AI Box","description":"Inside the AI Box, we share our experiences and discoveries in the world of artificial intelligence.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ai-box.eu\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/ai-box.eu\/en\/#\/schema\/person\/cc91d08618b3feeef6926591b465eab1","name":"Maker","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e96b93fc3c7e50c1f21c5c6b1f146dc4867936141360830b328947b32cacf93a?s=96&d=mm&r=g","caption":"Maker"},"description":"I live in Bavaria near Munich. In my head I always have many topics and try out especially in the field of Internet new media much in my spare time. I write on the blog because it makes me fun to report about the things that inspire me. 
I am happy about every comment, about suggestion and very about questions.","sameAs":["https:\/\/ai-box.eu"],"url":"https:\/\/ai-box.eu\/en\/author\/ingmars\/"}]}},"_links":{"self":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/2579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/comments?post=2579"}],"version-history":[{"count":1,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/2579\/revisions"}],"predecessor-version":[{"id":2580,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/posts\/2579\/revisions\/2580"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/media\/2576"}],"wp:attachment":[{"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/media?parent=2579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/categories?post=2579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ai-box.eu\/en\/wp-json\/wp\/v2\/tags?post=2579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}