Researchers from EleutherAI have introduced Llemma, an open language model designed for mathematics, along with the Proof-Pile-2 dataset. The project, built through continued pretraining of Code Llama, has garnered significant attention in the academic and research community.
Check out the GitHub repository here.
Llemma stands out by offering both 7-billion and 34-billion parameter models that surpass all other known open base models, and even Google’s unreleased Minerva when compared at similar model scales. The achievement is particularly noteworthy because the 34-billion parameter Llemma approaches the performance of the 62-billion parameter Minerva with roughly half as many parameters.
This new development from EleutherAI not only parallels Minerva, a closed model specially designed for mathematics by Google Research, but also exceeds Minerva’s problem-solving capabilities on an equi-parameter basis. Notably, Llemma’s capabilities extend to a broader spectrum of tasks, including tool use and formal mathematics, which further distinguishes it in the realm of mathematical language modeling.
Zhangir Azerbayev, the lead author of the paper, explains that the journey toward creating Llemma began with assembling a vast dataset of mathematical text: the ArXiv subset of RedPajama, the recent OpenWebMath dataset, and the newly introduced AlgebraicStack, a code dataset tailored specifically to mathematics. Together, these sources provided roughly 55 billion unique tokens for training.
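For readers who want to inspect the corpus, the sketch below shows one way to stream it with the Hugging Face datasets library. The dataset identifier "EleutherAI/proof-pile-2", the "algebraic-stack" subset name, and the "text" field are assumptions about how the release is organized, not details confirmed by the article.

```python
from datasets import load_dataset

# Stream the corpus rather than downloading ~55B tokens up front.
# Dataset path and subset name below are assumed, not verified here.
dataset = load_dataset(
    "EleutherAI/proof-pile-2",
    "algebraic-stack",
    split="train",
    streaming=True,
)

# Peek at the first document (assumes a "text" field per record).
first_doc = next(iter(dataset))
print(first_doc["text"][:500])
```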
Llemma’s models were initialized with Code Llama weights and subsequently trained across a network of 256 A100 GPUs on StabilityAI’s Ezra cluster. The 7-billion model underwent extensive training, spanning 200 billion tokens and 23,000 A100 hours, while the 34-billion model received 50 billion tokens of training over 47,000 A100 hours.
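Because the weights are released openly and the architecture follows Code Llama, the models can in principle be loaded like any other causal language model with the Hugging Face transformers library. The snippet below is a minimal sketch: the checkpoint name "EleutherAI/llemma_7b" and the example prompt are assumptions, not details taken from the article.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face checkpoint name for the 7B release.
model_id = "EleutherAI/llemma_7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single large GPU
    device_map="auto",          # requires the accelerate package
)

# A toy math prompt; the actual evaluation prompts are not shown in the article.
prompt = "Problem: Compute the derivative of x^3 + 2x.\nSolution:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```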
In addition to its strong performance on chain-of-thought tasks when compared with Minerva on an equal-parameter basis, Llemma gains a further boost from majority voting, in which several sampled solutions are generated and the most common final answer is kept.
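The idea behind majority voting (often called self-consistency) is simple enough to sketch in a few lines. In the example below, `model_generate` and the "Final answer:" format are hypothetical stand-ins for illustration, not part of Llemma’s actual evaluation code.

```python
import re
from collections import Counter

def majority_vote(model_generate, prompt, k=16):
    """Sample k chain-of-thought solutions and return the most common final answer.

    `model_generate` is a hypothetical callable returning one sampled solution
    string per call; answers are assumed to end with a line like "Final answer: 42".
    """
    answers = []
    for _ in range(k):
        solution = model_generate(prompt)  # one sampled reasoning chain
        match = re.search(r"Final answer:\s*(.+)", solution)
        if match:
            answers.append(match.group(1).strip())
    if not answers:
        return None
    # Majority voting: keep the answer that appears most often across samples.
    return Counter(answers).most_common(1)[0][0]
```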
The collaborative effort of institutions such as Princeton University, EleutherAI, University of Toronto, Vector Institute, University of Cambridge, Carnegie Mellon University, and University of Washington has culminated in the creation of Llemma.