Mistral AI Unveils Mistral Large 2, Beats Llama 3.1 on Code and Math

A day after Meta released Llama 3.1, Mistral AI has announced Mistral Large 2, the latest generation of its flagship model, offering substantial improvements in code generation, mathematics, and multilingual support. The model introduces advanced function-calling capabilities and is available on la Plateforme.

With a 128k-token context window and support for dozens of languages, including French, German, Spanish, and Chinese, Mistral Large 2 aims to cater to diverse linguistic needs. It also supports more than 80 programming languages, including Python, Java, and C++. At 123 billion parameters, the model is designed for single-node inference and long-context applications.

Mistral Large 2 is released under the Mistral Research License, which permits research and non-commercial use. It achieves 84.0% accuracy on the MMLU benchmark, setting a new standard for performance and cost efficiency among open models. In code generation and reasoning, it competes with leading models such as GPT-4o and Llama 3.

Training focused on reducing hallucinations and ensuring accurate outputs, which significantly enhanced the model’s reasoning and problem-solving skills. Mistral Large 2 is also trained to acknowledge when it cannot provide a solution, reflecting Mistral AI’s emphasis on accuracy.

Improvements in instruction-following and conversational capabilities are evident, with the model excelling in benchmarks such as MT-Bench, Wild Bench, and Arena Hard. Mistral AI emphasizes concise responses, vital for business applications.

Mistral Large 2’s multilingual proficiency includes languages like Russian, Japanese, and Arabic, performing strongly on the multilingual MMLU benchmark. It also features enhanced function calling skills, making it suitable for complex business applications.
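To illustrate what function calling looks like in practice, here is a minimal sketch against Mistral’s public chat completions endpoint, which accepts OpenAI-style tool definitions. The get_exchange_rate tool and the prompt are hypothetical, invented for this example; only the endpoint URL and the mistral-large-2407 model name come from the release.

```python
import os
import requests

# A hypothetical tool the model can choose to call, described in the
# JSON-schema format that Mistral's chat API accepts for function calling.
tools = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "Look up the current exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "base": {"type": "string", "description": "ISO code, e.g. EUR"},
                "quote": {"type": "string", "description": "ISO code, e.g. USD"},
            },
            "required": ["base", "quote"],
        },
    },
}]

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",
        "messages": [{"role": "user", "content": "How many US dollars is 50 euros?"}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call the tool
    },
    timeout=60,
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]

# If the model opted to call the tool, each call carries the function
# name and its arguments serialized as a JSON string.
for call in message.get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
```

In the usual flow, the application then executes the requested function and sends the result back in a follow-up message so the model can compose its final answer.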

Users can access Mistral Large 2 via la Plateforme under the name mistral-large-2407. Mistral AI is consolidating its offerings around the general-purpose models Mistral NeMo and Mistral Large and the specialist models Codestral and Embed. Fine-tuning capabilities have also been extended to these models.
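As a rough sketch of what that access looks like, the request below calls the model by that name over Mistral’s public chat completions endpoint, assuming an API key is set in the MISTRAL_API_KEY environment variable; the prompt is only illustrative.

```python
import os
import requests

# Minimal chat completion against la Plateforme using the new model name.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",
        "messages": [
            {"role": "user",
             "content": "Write a Python function that checks whether a number is prime."},
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```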

The model is available through partnerships with Google Cloud Platform, Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. This expansion aims to bring Mistral AI’s advanced models to a global audience, enhancing accessibility and application development.

Mistral Large 2 is the fourth model from the company in the past week, following the release of MathΣtral, a specialized 7B model designed for advanced mathematical reasoning and scientific exploration. 

The company also released Codestral Mamba 7B, based on the advanced Mamba 2 architecture, which is trained with a context length of 256k tokens and built for code generation tasks for developers worldwide. Additionally, Mistral AI introduced Mistral NeMo, a 12-billion parameter model with a 128k token context length, developed in partnership with NVIDIA.
