NVIDIA Releases Nemotron 3 Ultra Open Model

AI-Agents Research Architecture

TL;DR: NVIDIA has launched Nemotron 3 Ultra, a 550B-parameter open model with a hybrid Mamba-Transformer MoE architecture that NVIDIA says runs agentic reasoning workflows up to 5x faster.

Summary: NVIDIA has released Nemotron 3 Ultra, a 550B-parameter open model designed for long-running autonomous agents. Its hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture is intended to fit more reasoning cycles into the same time budget. According to NVIDIA, the model delivers up to 5x faster inference and 30% lower cost on complex tasks such as coding, deep research, and enterprise orchestration.
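
To make the "more reasoning cycles within the same time budget" claim concrete, here is a minimal back-of-the-envelope sketch. It is not from NVIDIA's announcement: the function names, the 60-second budget, the 2-second baseline step latency, and the $10 baseline cost are hypothetical values chosen only for illustration, and the 5x and 30% figures are treated as best-case upper bounds.

```python
def cycles_in_budget(budget_s: float, step_latency_s: float, speedup: float = 1.0) -> int:
    """Whole reasoning cycles that fit in a time budget.

    step_latency_s is the (hypothetical) baseline time per agent step;
    speedup scales throughput, so 5.0 models the claimed "up to 5x" case.
    """
    if budget_s < 0 or step_latency_s <= 0 or speedup <= 0:
        raise ValueError("budget must be >= 0; latency and speedup must be > 0")
    return int(budget_s * speedup // step_latency_s)


def discounted_cost(baseline_cost: float, reduction: float = 0.30) -> float:
    """Cost after a fractional reduction (0.30 models the claimed 30%)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_cost * (1.0 - reduction)


if __name__ == "__main__":
    budget_s = 60.0        # hypothetical time budget for one agent task
    step_latency_s = 2.0   # hypothetical baseline latency per reasoning step

    baseline = cycles_in_budget(budget_s, step_latency_s)
    best_case = cycles_in_budget(budget_s, step_latency_s, speedup=5.0)
    print(f"baseline cycles: {baseline}, best-case cycles at 5x: {best_case}")
    print(f"cost of a $10.00 workload at 30% lower cost: ${discounted_cost(10.0):.2f}")
```

Because both figures are stated as "up to", real gains depend on the workload and on the baseline being compared against, so measured numbers may be lower.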

Why it matters: This open-weights model lowers the barrier to deploying capable, low-latency orchestrators, whether local or hosted, for complex agentic workflows. Developers can evaluate Nemotron 3 Ultra on Fireworks AI to test its planning and failure-recovery performance.

Source: @NVIDIAAI