Microsoft’s 1.3 Billion Model Outperforms Llama 2


Microsoft Research has done it once again. After outperforming Meta's Llama with phi-1 in July, the researchers have now introduced phi-1.5, a 1.3-billion-parameter language model that outperforms Llama 2's 7-billion-parameter model on several benchmarks. Microsoft has open-sourced the model.

The phi-1.5 model, at 1.3 billion parameters, is designed to perform well across multiple domains. It is particularly strong on queries in the question-answering (QA) format, as well as in chat interactions and code-related tasks.

The open-source model is available on Hugging Face.

How far does one billion parameters take you? As it turns out, pretty far!!!

Today we’re releasing phi-1.5, a 1.3B parameter LLM exhibiting emergent behaviors surprisingly close to much larger LLMs.

For warm-up, see an example completion w. comparison to Falcon 7B & Llama2-7B pic.twitter.com/x5qZGPjoSZ

— Sebastien Bubeck (@SebastienBubeck) September 12, 2023

While phi-1 was trained on high-quality textbook data, phi-1.5 is trained only on synthetic data. What sets phi-1.5 apart is its comprehensive training regimen, which draws on diverse data sources, including Python code snippets from StackOverflow, code from competitive programming contests, synthetic Python textbooks, and exercises generated by gpt-3.5-turbo-0301.

The accompanying paper: Textbooks Are All You Need II: phi-1.5 technical report

Key Details of phi-1.5 Model:

Architecture: Transformer-based model trained with a next-word prediction objective.

Dataset size: 30 billion tokens.

Training tokens: 150 billion tokens.

Precision: fp16.

GPUs: 32 A100-40G GPUs.

Training time: 8 days.
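One implication of the figures above is worth spelling out: the reported training-token count is five times the corpus size, which suggests roughly five passes (epochs) over the data. A quick sanity check, using only the numbers from the list (the pass count is our inference, not something the article states):

```python
# Back-of-envelope check from the reported "Key Details" figures.
dataset_tokens = 30e9    # reported corpus size: 30 billion tokens
training_tokens = 150e9  # reported total training tokens: 150 billion

# Total tokens seen divided by corpus size gives the approximate
# number of passes over the data.
passes = training_tokens / dataset_tokens
print(f"approx. passes over the corpus: {passes:.0f}")  # approx. passes over the corpus: 5
```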

The Microsoft Research team behind phi-1.5 reports that the model achieves near-state-of-the-art performance among models with fewer than 10 billion parameters. Benchmark tests evaluating common sense, language comprehension, and logical reasoning position phi-1.5 as a formidable contender.

Notably, phi-1.5 outperforms Meta's Llama 2 7B on the AGIEval score and approaches parity with Llama 2 7B on the GPT4All benchmark suite, as measured by the LM-Eval Harness.

The post Microsoft’s 1.3 Billion Model Outperforms Llama 2 appeared first on Analytics India Magazine.

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It's possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
