NVIDIA and the French company Mistral AI have announced Mistral NeMo 12B, a large language model (LLM) designed for enterprise tasks such as chatbots, data summarization, and code generation.
Model Specifications and Training
The Mistral NeMo 12B has 12 billion parameters and a context window of 128,000 tokens. For inference, it uses the FP8 data format, which is said to reduce memory requirements and speed up deployment without compromising response accuracy. The model was trained with the Megatron-LM library, a component of the NVIDIA NeMo platform, on 3,072 NVIDIA H100 accelerators in DGX Cloud. This setup reportedly allows Mistral NeMo 12B to excel at multi-turn dialogue, mathematical problems, programming, and more, demonstrating “common sense” and “world knowledge.” The model delivers accurate and reliable performance across a wide range of applications, notes NIX Solutions.
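A back-of-envelope calculation illustrates why FP8 inference matters for memory: at one byte per parameter, the weights of a 12-billion-parameter model occupy roughly half the memory they would in FP16. This sketch counts weights only; the KV cache and activations, which grow with the long context window, are not included.

```python
# Rough weight-memory estimate for a 12B-parameter model (weights only;
# excludes KV cache and activations, which also consume GPU memory).
PARAMS = 12e9

fp16_gb = PARAMS * 2 / 1024**3   # 2 bytes per parameter in FP16
fp8_gb = PARAMS * 1 / 1024**3    # 1 byte per parameter in FP8

print(f"FP16 weights: ~{fp16_gb:.1f} GiB, FP8 weights: ~{fp8_gb:.1f} GiB")
```

The roughly 11 GiB FP8 weight footprint helps explain how the model can fit on a single 24 GB consumer GPU such as the GeForce RTX 4090, with headroom left for the runtime and cache.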
Deployment and Advantages
Released under the Apache 2.0 license, Mistral NeMo 12B is offered as a NIM container. Its creators claim that deploying the LLM takes minutes rather than days. The model runs efficiently on a single NVIDIA L40S, GeForce RTX 4090, or RTX 4500 accelerator. Key advantages of deployment via NIM include high efficiency, low computing costs, and enhanced security and privacy. We’ll keep you updated on any further developments.
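NIM containers expose an OpenAI-compatible HTTP API, so an application can talk to a locally deployed model with plain JSON requests. The sketch below builds such a chat request using only the Python standard library; the endpoint URL (default local NIM port) and the model identifier are assumptions for illustration, not confirmed by the announcement.

```python
import json
import urllib.request

# Assumed default local NIM endpoint and a hypothetical model identifier.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "mistralai/mistral-nemo-12b-instruct"

def build_chat_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for a local NIM endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request requires a running NIM container, e.g.:
# with urllib.request.urlopen(build_chat_request("Summarize this report.")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the request format matches the OpenAI chat-completions schema, existing client libraries can usually be pointed at the NIM endpoint by changing only the base URL.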
Overall, the Mistral NeMo 12B represents a significant advancement in AI technology, providing businesses with a powerful tool to streamline operations and enhance productivity.