OLMoE Achieves State-Of-The-Art Performance using Fewer Resources and MoE


A team of researchers from the Allen Institute for AI, Contextual AI, and the University of Washington has released OLMoE (Open Mixture-of-Experts Language Models), a new open-source LLM that achieves state-of-the-art performance while using significantly fewer computational resources than comparable models.

OLMoE uses a Mixture-of-Experts (MoE) architecture: it has 7 billion total parameters but activates only 1.3 billion for each input token. This lets OLMoE match or exceed the performance of much larger models such as Llama2-13B while using far less compute during inference.
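
To make that idea concrete, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. The hidden sizes, expert count, and k value below are placeholder assumptions, not OLMoE's actual configuration; the point is only to show how a router sends each token to a small subset of experts, so most parameters sit idle on any given forward pass.

```python
# Minimal sketch of top-k Mixture-of-Experts routing (PyTorch).
# Sizes, expert count, and k are illustrative assumptions, not OLMoE's exact config.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores each token against every expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                    # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)   # pick the k best experts per token
        weights = F.softmax(weights, dim=-1)                 # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.k):                           # only k of n_experts run per token
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 512)
print(TopKMoE()(tokens).shape)   # torch.Size([4, 512])
```

Because each token touches only k experts, the parameters of the remaining experts contribute nothing to that token's forward pass, which is where the compute savings come from.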

Thanks to its Mixture-of-Experts design, along with better data and hyperparameters, OLMoE is far more efficient than OLMo 7B: it uses roughly 4x fewer training FLOPs and activates about 5x fewer parameters per forward pass, making both training and inference cheaper.
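
As a rough sanity check on the active-parameter ratio, here is a quick back-of-the-envelope calculation. The counts are approximate assumptions based on the figures cited in this article, with OLMo 7B treated as a dense model of roughly 6.9 billion parameters.

```python
# Back-of-the-envelope check of the "about 5x fewer active parameters" claim.
# Parameter counts are approximate; exact figures are in the OLMoE paper.
dense_active = 6.9e9   # OLMo 7B: every parameter participates in each forward pass
moe_active = 1.3e9     # OLMoE: only the routed experts' parameters participate
print(f"active-parameter ratio ~ {dense_active / moe_active:.1f}x")   # ~ 5.3x
```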

Importantly, the researchers have open-sourced not just the model weights but also the training data, code, and logs. This level of transparency is rare for high-performing language models and will allow other researchers to build upon and improve OLMoE.
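
Because the weights are published openly, trying the model takes only a few lines with the Hugging Face transformers library. The sketch below assumes the base checkpoint is available under a repository id such as allenai/OLMoE-1B-7B-0924; verify the exact identifier on Ai2's release page before running it.

```python
# Minimal sketch of loading the released checkpoint with Hugging Face transformers.
# The repository id below is an assumption; check the official release for the exact name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMoE-1B-7B-0924"   # assumed Hub id for the base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Mixture-of-Experts models work by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```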

For example, on the MMLU benchmark, OLMoE-1B-7B scores 54.1%, roughly matching OLMo-7B (54.9%) and clearly beating Llama2-7B (46.2%) despite activating significantly fewer parameters. After instruction tuning, OLMoE-1B-7B-Instruct even outperforms larger models such as Llama2-13B-Chat on benchmarks like AlpacaEval.

OLMoE compared to other models

This demonstrates the effectiveness of OLMoE's Mixture-of-Experts architecture in achieving high performance with lower computational requirements.

Additionally, OLMoE-1B-7B stands out for its fully open release, including model weights, training data, code, and logs, making it a valuable resource for researchers and developers looking to build upon and improve state-of-the-art language models.

MoE is a preferred choice when you don't have the resources to train a large dense model from scratch: rather than one monolithic network, it combines many smaller expert networks and routes each input to only a few of them, yielding a single model with broad capability at a fraction of the training and inference cost.

The post OLMoE Achieves State-Of-The-Art Performance using Fewer Resources and MoE appeared first on AIM.



