Nvidia launched a new family of artificial intelligence (AI) models on Tuesday at its GPU Technology Conference (GTC) 2025. Dubbed Llama Nemotron, these are the company's latest reasoning-focused large language models (LLMs), designed to provide a foundation for agentic AI workflows. The Santa Clara-based tech giant said the models are aimed at developers and enterprises, enabling them to build advanced AI agents that can work either independently or as connected teams to perform complex tasks. The Llama Nemotron models are currently available via Nvidia's platform and Hugging Face.
Nvidia Introduces New Reasoning-Focused AI Models
In a newsroom post, the tech giant detailed the new AI models. The Llama Nemotron reasoning models are based on Meta's Llama 3 series models, with post-training enhancements added by Nvidia. The company highlighted that the family of AI models displays improved capabilities in multistep math, coding, reasoning, and complex decision-making.
The company highlighted that this process improved the accuracy of the models by up to 20 percent compared to the base models. Inference speed is also said to have improved by five times compared to similar-sized open-source reasoning models. Nvidia claimed that "the models can handle more complex reasoning tasks, enhance decision-making capabilities, and reduce operational costs for enterprises." With these improvements, the LLMs can be used to build and power AI agents.
Llama Nemotron reasoning models are available in three parameter sizes: Nano, Super, and Ultra. The Nano model is best suited for on-device and edge-based tasks that require high accuracy. The Super variant is positioned in the middle, offering high accuracy and throughput on a single GPU. Finally, the Ultra model is meant to be run on multi-GPU servers and offers maximum agentic accuracy.
Post-training of the reasoning models was done on Nvidia DGX Cloud using curated synthetic data generated with the Nemotron platform, as well as other open models. The tech giant is also making the tools, datasets, and post-training optimisation techniques used to develop the Llama Nemotron models available to the open-source community.
Nvidia is also working with enterprise partners to bring the models to developers and businesses. The reasoning models and NIM microservices can be accessed through Microsoft's Azure AI Foundry, as well as an option via the Azure AI Agent Service. SAP is also using the models for its Business AI solutions and its AI copilot, dubbed Joule, the company said. Other enterprises using Llama Nemotron models include ServiceNow, Accenture, and Deloitte.
The Llama Nemotron Nano and Super models and NIM microservices are available to businesses and developers as an application programming interface (API) via Nvidia's platform as well as its Hugging Face listing. They are distributed under the permissive Nvidia Open Model License Agreement, which allows both research and commercial usage.
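To illustrate what API access might look like, here is a minimal sketch that assembles a request for Nvidia's hosted endpoint, which follows an OpenAI-compatible chat-completions schema. The endpoint URL, the model identifier, and the "detailed thinking on/off" system-prompt convention for toggling reasoning are assumptions and should be verified against Nvidia's own model catalog; the snippet only builds and prints the payload rather than sending it.

```python
import json

# Assumed values: check build.nvidia.com for the actual endpoint and model id.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "nvidia/llama-3.1-nemotron-nano-8b-v1"  # hypothetical listing name

def build_request(prompt: str, reasoning: bool = True) -> dict:
    """Assemble a chat-completions payload for a Llama Nemotron model.

    The reasoning models are reported to toggle step-by-step 'thinking'
    through a system prompt; that convention is an assumption here.
    """
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.6,
        "max_tokens": 1024,
    }

payload = build_request("Plan the steps to reconcile two invoices.")
print(json.dumps(payload, indent=2))
# A real call would POST this payload with an API-key bearer token attached.
```

Because the schema is OpenAI-compatible, the same payload shape would work against the Azure AI Foundry deployment mentioned above, with only the base URL and credentials swapped out.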