In a significant development for AI technology, Mistral AI, in collaboration with NVIDIA, has launched Mistral NeMo, a 12-billion-parameter model. The new model offers a 128,000-token context window and strong reasoning, coding accuracy, and world knowledge. The partnership aims to deliver a formidable tool for multilingual applications that slots easily into existing systems, marking a notable milestone in AI advancements.
Enhanced Multilingual Capabilities
Earlier reports on Mistral AI and NVIDIA’s initiatives highlighted their focus on versatile AI tooling. NeMo builds on those efforts with a significantly larger context window and stronger performance. Previous models, such as Mistral 7B, were constrained by shorter context windows and less efficient multilingual handling, limitations NeMo now surpasses. This advancement is a step towards more adaptable and powerful AI models for diverse applications.
NeMo is further distinguished by Tekken, a new tokeniser based on Tiktoken and trained on over 100 languages. Where previous models relied on the SentencePiece tokeniser, Tekken compresses text more efficiently, with especially large gains for Korean and Arabic. This improvement is pivotal: roughly 30% better compression for source code and most major languages could let NeMo outperform competitors like Llama 3 in multilingual scenarios.
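To make the compression claim concrete, the sketch below shows how tokeniser efficiency is typically compared: the same text is encoded with two tokenisers, and the one producing fewer tokens compresses better, which saves cost and context-window budget. The two toy tokenisers here are illustrative stand-ins, not Tekken or SentencePiece; a real comparison would encode the same corpus with both actual tokenisers.

```python
# Hypothetical sketch of how tokeniser "compression" is measured.
# Fewer tokens for the same text = better compression. The toy
# tokenisers below stand in for real ones such as Tekken (Tiktoken-
# based) or SentencePiece.

def char_tokenise(text: str) -> list[str]:
    """Baseline: one token per character (worst-case compression)."""
    return list(text)

def word_tokenise(text: str) -> list[str]:
    """Coarser tokeniser: one token per whitespace-separated word."""
    return text.split()

def compression_ratio(text: str, tokenise) -> float:
    """Characters represented per token: higher means better compression."""
    tokens = tokenise(text)
    return len(text) / max(len(tokens), 1)

sample = "def add(a, b):\n    return a + b"
print(f"char tokeniser: {compression_ratio(sample, char_tokenise):.2f} chars/token")
print(f"word tokeniser: {compression_ratio(sample, word_tokenise):.2f} chars/token")
```

The same chars-per-token metric, computed over multilingual text and source code, is what underlies claims like "30% better compression".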
Open-Source Adoption
Mistral AI has released both pre-trained base and instruction-tuned checkpoints under the Apache 2.0 license to encourage broader adoption and research. This open-source strategy may attract researchers and enterprises, accelerating NeMo’s integration into a range of applications. Because the model was trained with quantisation awareness, organisations can deploy it efficiently without sacrificing performance.
Because Mistral NeMo uses the same standard architecture as Mistral 7B, it works as a drop-in replacement for systems already built around that model. Its availability on HuggingFace, and its packaging as an NVIDIA NIM inference microservice, make it accessible to developers and organisations invested in NVIDIA’s AI ecosystem, easing deployment and experimentation and broadening its reach.
NeMo’s design underscores the emphasis on global, multilingual applications: it supports function calling and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. This positions NeMo as a versatile tool for a wide range of AI-driven applications, furthering Mistral AI and NVIDIA’s goal of bringing advanced AI models to a global audience.
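The function-calling support mentioned above follows a general pattern worth sketching: the host application advertises tool schemas, the model replies with a tool name and JSON arguments, and the host dispatches the call. The schema layout, tool name, and dispatcher below are illustrative assumptions, not Mistral’s exact API.

```python
import json

# Hypothetical sketch of the function-calling loop a model like NeMo
# participates in. TOOLS is the schema shown to the model; the model's
# reply is a JSON tool call that the host parses and dispatches.
# All names here (get_weather, the schema shape) are illustrative.

TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "parameters": {"city": "string"},
    },
}

def get_weather(city: str) -> str:
    # Stub standing in for a real weather service call.
    return f"Sunny in {city}"

DISPATCH = {"get_weather": get_weather}

def handle_tool_call(model_output: str) -> str:
    """Parse the model's JSON tool call and run the matching function."""
    call = json.loads(model_output)
    func = DISPATCH[call["name"]]
    return func(**call["arguments"])

# A reply the model might emit after seeing the TOOLS schema:
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(handle_tool_call(model_output))  # prints "Sunny in Paris"
```

Because the arguments arrive as plain JSON, the same dispatch logic works regardless of which language the user's request was phrased in, which is where multilingual strength and function calling complement each other.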