Claude 3.5 Sonnet Outshines GPT-4o in Data Visualisation



Since the release of Anthropic’s Claude 3.5 model family, social media platforms, particularly X, have been abuzz about Claude 3.5 Sonnet. Of all its new features, Artifacts is one of the most talked about, and it is what sets the model apart from OpenAI’s GPT-4o.

Artifacts enhances user interaction by providing a dedicated window alongside the conversation. It also significantly improves data interpretation and visualisation, making it easier for users to interact with and understand the generated content.

Claude 3.5 Sonnet Rules

Recently, Swami Sivasubramanian, head of AI services and data at AWS, also spoke about the feature, highlighting Claude 3.5 Sonnet’s strengths for data science and analysis, alongside vision capabilities. 

He said that, when given access to a coding environment, the model produces high-quality statistical visualisations and actionable predictions, ranging from business strategies to real-time product trends.

He added that Claude 3.5 Sonnet also performs well when processing images, particularly when interpreting charts and graphs that require visual understanding.

“It can accurately transcribe text from imperfect images—a core capability for industries such as retail, logistics, healthcare, and financial services, where AI may be able to garner more insights from an image, graphic or illustration than from text alone, for use cases like trend analysis, patient triage, and research summaries,” added Sivasubramanian. 

AI researcher Razia Aliani also recently used the model to turn research papers into actionable insights by identifying key concepts, visualising relationships, and extracting relevant data. “I made it possible with this AI agent (Claude 3.5 Sonnet). It turns information overload into actionable insights,” she said.

Examples abound:

https://x.com/TheAIAdvantage/status/1809236767951708204 

What About GPT-4o?

Users on X have praised GPT-4o’s data visualisation capabilities. For instance, Aadit Sheth posted that it took him less than 30 seconds to create high-quality graphs.

In a Reddit post, users shared how much they loved GPT-4o’s data visualisation capabilities. One user also mentioned that the feature works within the same chat session and provides relevant prompt suggestions after each reply.

Although users praised GPT-4o’s visualisation capabilities, they also pointed out limitations.

A user noted that while the feature is available, it’s not always reliable for complex data analysis and visualisation tasks, especially when compared to specialised tools like R or other plotting software. 

Both Struggle 

YouTube presenter Jordan Wilson compared how Claude 3 and GPT-4 fared at analysing YouTube channel statistics.

https://www.youtube.com/watch?v=i0qzy5I9Lf0

The analysis used a data set containing 16,000 cells of information from Wilson’s YouTube channel, including metrics for about 500 videos.

Both AI models were tasked with analysing various aspects of the channel’s performance, such as optimal publish times, content types, and top-performing videos.
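To give a sense of what such a task involves, here is a minimal Python sketch of two of those questions: which publish hour performs best and which videos lead on views. The rows, field names (`title`, `published`, `views`) and helper functions are hypothetical, since Wilson's actual data set and prompts are not reproduced in this article.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical sample rows; the real channel export is not available here.
videos = [
    {"title": "A", "published": "2024-05-06T09:00", "views": 1200},
    {"title": "B", "published": "2024-05-07T09:30", "views": 1800},
    {"title": "C", "published": "2024-05-08T17:00", "views": 700},
    {"title": "D", "published": "2024-05-09T17:45", "views": 900},
]


def mean_views_by_hour(rows):
    """Average views grouped by the hour of day each video was published."""
    buckets = defaultdict(list)
    for row in rows:
        hour = datetime.fromisoformat(row["published"]).hour
        buckets[hour].append(row["views"])
    return {hour: mean(views) for hour, views in buckets.items()}


def top_videos(rows, n=3):
    """Titles of the n most-viewed videos, highest first."""
    ranked = sorted(rows, key=lambda r: r["views"], reverse=True)
    return [r["title"] for r in ranked[:n]]


by_hour = mean_views_by_hour(videos)
# Publish hour with the highest average views in the sample.
best_hour = max(by_hour, key=by_hour.get)
```

With only a handful of rows the "best hour" is illustrative at most; on a real export of around 500 videos, one would also want to check how many videos fall into each hour before drawing conclusions.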

Both Claude and GPT-4 proved capable at data analysis, each with its own strengths and weaknesses. Claude provided more creative and analytical insights for future video strategies, whereas GPT-4 offered interactive charts and more detailed explanations of its analysis process.

However, both models encountered a few errors or limitations, particularly with complex visualisation requests. 

For example, Claude initially had issues with its Artifacts feature and needed a second attempt. GPT-4, meanwhile, could not produce certain types of interactive charts, stating “interactive charts of this type are not supported” for some requests.

Furthermore, a research paper by Generative AI Research Lab showed that GPT-4o slightly outperforms Claude 3.5 Sonnet on visual reasoning tasks overall, though the difference is minimal.


Benchmarking for Data Interpretation And Visualisation 

When it comes to data visualisation, there is no widely established benchmark for evaluation. However, some research papers, such as VisEval, have proposed benchmarks specifically for evaluating data visualisation by LLMs.

Some findings from the paper indicate that LLMs struggle with complex visualisations requiring multiple visual channels and that performance decreases with increasing query complexity.

It is also possible that, owing to the lack of dedicated benchmarks for data visualisation, it was not taken into account in the Claude 3.5 evaluation research paper.






Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
