NIM – Nvidia Inference Microservices – A software library of containerized services created by GPU hardware and software supplier Nvidia to deploy generative AI large language models (LLMs) and other AI models for inference. These prebuilt containers support a broad spectrum of AI models, from open-source community models to Nvidia AI Foundation models to custom AI models. NIM microservices are deployed with a single command and integrated into enterprise-grade AI applications using standard APIs and just a few lines of code. They are built on inference engines including Triton Inference Server, TensorRT, TensorRT-LLM, and PyTorch, and NIM is engineered to facilitate AI inferencing at scale, so applications can be deployed on-premises or in the cloud.
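Because a running NIM container exposes an OpenAI-compatible HTTP endpoint, an application can call it with a standard client library. The following is a minimal sketch, assuming a Llama 3 8B Instruct NIM is already running locally and serving on the default port 8000; the model name and endpoint are illustrative and should match whichever NIM is actually deployed.

    # Minimal sketch of calling a locally deployed NIM microservice.
    # Assumes a NIM container (e.g., meta/llama3-8b-instruct) is already
    # running and serving its OpenAI-compatible API on localhost:8000.
    from openai import OpenAI

    # For a local deployment the API key is not validated, so any
    # placeholder string works.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    response = client.chat.completions.create(
        model="meta/llama3-8b-instruct",  # assumed model name; match your container
        messages=[{"role": "user", "content": "Summarize what NIM does in one sentence."}],
        max_tokens=64,
    )
    print(response.choices[0].message.content)

The same few lines work against any NIM, since every microservice presents the same standard API regardless of the model inside the container.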
Nvidia NIM Agent Blueprints is a catalog of pretrained, customizable AI workflows that equip enterprise developers with a suite of software for building and deploying generative AI applications for canonical use cases, such as customer service avatars, retrieval-augmented generation, and virtual screening for drug discovery. Each blueprint includes sample applications built with NeMo, NIM, and partner microservices; reference code; customization documentation; and a Helm chart for deployment.