Nvidia launches a set of microservices for optimized inferencing

At its GTC conference, Nvidia today announced Nvidia NIM, a new software platform designed to streamline the deployment of custom and pre-trained AI models into production environments. NIM takes the software work Nvidia has done around inferencing and optimizing models and makes it easily accessible by combining a given model with an optimized inferencing engine, packing the result into a container and exposing it as a microservice.
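The pattern described here, a model paired with an inference engine and exposed behind a network endpoint, can be illustrated with a toy sketch. Nothing in it is NIM's actual API: the `StubModel` class, the `/infer` route and the JSON field names are hypothetical, and a real deployment would swap in an optimized engine running inside a container.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubModel:
    """Hypothetical stand-in for a model plus its inference engine."""

    def predict(self, text: str) -> str:
        # A real engine would run the model here; this just reverses the input.
        return text[::-1]


MODEL = StubModel()


def handle_payload(raw: bytes) -> bytes:
    """Decode a JSON request, run the stub model, and encode a JSON reply."""
    request = json.loads(raw.decode("utf-8"))
    output = MODEL.predict(request["input"])
    return json.dumps({"output": output}).encode("utf-8")


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/infer":  # hypothetical route
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = handle_payload(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve the stub model on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), InferenceHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally; a client would then POST a JSON body such as `{"input": "hello"}` to `/infer`. The point of the sketch is only the shape of the arrangement: the application talks to a small HTTP interface and never touches the model or engine directly.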

Typically, it would take developers weeks, if not months, to ship similar containers, Nvidia argues, and that is if the company even has any in-house AI talent. With NIM, Nvidia clearly aims to create an ecosystem of AI-ready containers that use its hardware as the foundational layer, with these curated microservices as the core software layer for companies that want to speed up their AI roadmap.

NIM currently includes support for models from Nvidia, AI21, Adept, Cohere, Getty Images and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI and Stability AI. Nvidia is already working with Amazon, Google and Microsoft to make these NIM microservices available on SageMaker, Kubernetes Engine and Azure AI, respectively. They’ll also be integrated into frameworks like Deepset, LangChain and LlamaIndex.


“We believe that the Nvidia GPU is the best place to run inference of these models on […], and we believe that NVIDIA NIM is the best software package, the best runtime, for developers to build on top of so that they can focus on the enterprise applications, and just let Nvidia do the work to produce these models for them in the most efficient, enterprise-grade manner, so that they can just do the rest of their work,” said Manuvir Das, the head of enterprise computing at Nvidia, during a press conference ahead of today’s announcements.

As for the inference engine, Nvidia will use the Triton Inference Server, TensorRT and TensorRT-LLM. Some of the Nvidia microservices available through NIM will include Riva for customizing speech and translation models, cuOpt for routing optimizations and the Earth-2 model for weather and climate simulations.

The company plans to add capabilities over time, including, for example, making the Nvidia RAG LLM operator available as a NIM, which promises to make it a lot easier to build generative AI chatbots that can pull in custom data.

This wouldn’t be a developer conference without a few customer and partner announcements. Among NIM’s current users are the likes of Box, Cloudera, Cohesity, Datastax, Dropbox and NetApp.

“Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” said Jensen Huang, founder and CEO of NVIDIA. “Created with our partner ecosystem, these containerized AI microservices are the building blocks for enterprises in every industry to become AI companies.”
