xAI’s Grok-2 Ranks Second on the Chatbot Arena Leaderboard, Competing with Gemini 1.5 and GPT-4o

xAI's Grok-2 and Grok-2-Mini have officially secured positions on the LMSys Chatbot Arena leaderboard. Backed by more than 6,000 community votes, Grok-2 has taken the #2 spot, surpassing GPT-4o (May) and tying with the latest Gemini model.

Meanwhile, Grok-2-Mini has earned the #5 position.

Grok-2 has done particularly well on mathematical tasks, ranking #1 in that category, and has placed #2 in several other categories, including hard prompts, coding, and instruction-following.

Additionally, Grok-2-Mini has become significantly faster and now runs twice as fast as before. The speed-up came after xAI's inference team completely rewrote the inference stack using SGLang, enabling more efficient multi-host inference and improved accuracy.

The team also introduced new algorithms for computation and communication kernels, alongside better batch scheduling and quantisation, further enhancing the models’ performance.

Some people remain sceptical of the rankings. They point out that OpenAI's GPT-4o, which holds the top spot, does not perform as well as Claude 3.5, which sits in 5th place. Even so, people who have started experimenting with Grok-2 claim that the model is brilliant at coding and maths-related tasks.

Released in beta this month, the Grok-2 family of models is also available for testing on X. It also lets users generate images using the FLUX.1 image generation model.





Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It's possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
