Nexusflow.ai has recently launched NexusRaven-V2, a 13-billion-parameter LLM that outperforms GPT-4 in zero-shot function calling. The open-source model translates natural language instructions into executable code, making it easier for copilots and agents to use software tools.
NexusRaven-V2 demonstrates superiority over GPT-4 by achieving up to a 7% higher success rate in function calling in human-generated use cases involving nested and composite functions. Notably, NexusRaven-V2 accomplishes this without prior training on the specific functions used in the evaluation.
The model is available on GitHub and on Hugging Face.
Nexusflow.ai has also introduced the Nexus-Function-Calling benchmark, together with a Hugging Face leaderboard. The benchmark is a diverse collection of real-life, human-curated function-calling examples, and eight of its nine component benchmarks are open-sourced.
NexusRaven-V2 is built on CodeLlama-13B-instruct, which is itself based on Llama 2, and is instruction-tuned on curated data from Nexusflow’s pipeline. The model is commercially permissive, encouraging both community developers and enterprises to explore its capabilities.
Nexusflow.ai provides open-source utility artefacts, enabling users to seamlessly replace mainstream proprietary function calling APIs with NexusRaven-V2 in their software workflows. Online demos and Colab notebooks are also available for onboarding and integration demonstrations.
NexusRaven-V2 showcases a 4% higher success rate in function calling on average compared to the latest GPT-4 model, as observed in a human-curated benchmark. In tasks involving nested and composite function calls, NexusRaven-V2 exhibits a significant 7% advantage over GPT-4, highlighting its robustness in handling variations in developers’ descriptions of functions.
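To make "nested and composite function calls" concrete, here is a minimal sketch of how an application might execute such a call once a model has produced it as text. Everything in it is an assumption for illustration: the tools `get_coordinates` and `get_weather` are hypothetical stand-ins for real APIs, the call string is hand-written rather than actual model output, and the dispatcher is not part of any Nexusflow release.

```python
import ast

def get_coordinates(city):
    # Hypothetical lookup table standing in for a real geocoding API.
    return {"Seattle": (47.61, -122.33)}[city]

def get_weather(coords):
    # Hypothetical stand-in for a real weather API.
    lat, lon = coords
    return f"forecast for ({lat}, {lon})"

# Only functions listed here may be invoked by generated code.
TOOLS = {"get_coordinates": get_coordinates, "get_weather": get_weather}

def run_call(node):
    """Recursively evaluate a call expression using only whitelisted tools."""
    if isinstance(node, ast.Call):
        if not isinstance(node.func, ast.Name) or node.func.id not in TOOLS:
            raise ValueError("unknown or unsupported function")
        if any(kw.arg is None for kw in node.keywords):
            raise ValueError("**kwargs unpacking is not supported")
        # Inner calls are evaluated first, so nesting works naturally.
        args = [run_call(a) for a in node.args]
        kwargs = {kw.arg: run_call(kw.value) for kw in node.keywords}
        return TOOLS[node.func.id](*args, **kwargs)
    # Anything that is not a call must be a plain Python literal.
    return ast.literal_eval(node)

def execute(call_text):
    """Parse a single call expression and run it against TOOLS."""
    tree = ast.parse(call_text.strip(), mode="eval")
    return run_call(tree.body)

# A nested call: the output of one tool feeds the input of another.
result = execute("get_weather(coords=get_coordinates(city='Seattle'))")
print(result)
```

Walking the syntax tree instead of calling `eval` means that only whitelisted tools and plain literals are ever executed, which matters when the call text comes from a model.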
To ensure reproducibility and standardisation, Nexusflow.ai releases the benchmark and associated leaderboard along with model weights. The evaluation benchmark prioritises human-generated samples with meticulous checks on executability and encompasses a diverse representation of function calling use cases and difficulties.
Nexusflow.ai is also providing a Python package, “nexusraven,” facilitating easy integration with copilots or agents. Developers can quickly ingest API function descriptions and send natural language queries to the model with a single line of code. The nexusraven package also supports converting function calling code to JSON format for seamless integration with downstream software.
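The conversion from function-calling code to JSON can be sketched with the Python standard library alone. This is not the nexusraven package's implementation or API, which the article does not show; the helper names `call_to_dict` and `call_to_json` and the output schema (`name`, `args`, `kwargs`) are assumptions chosen for illustration.

```python
import ast
import json

def call_to_dict(node):
    """Turn a (possibly nested) call expression into plain dicts and literals."""
    if isinstance(node, ast.Call):
        if not isinstance(node.func, ast.Name):
            raise ValueError("only simple function names are supported")
        if any(kw.arg is None for kw in node.keywords):
            raise ValueError("**kwargs unpacking is not supported")
        return {
            "name": node.func.id,
            "args": [call_to_dict(a) for a in node.args],
            "kwargs": {kw.arg: call_to_dict(kw.value) for kw in node.keywords},
        }
    # Non-call arguments must be plain literals (numbers, strings, lists, ...).
    return ast.literal_eval(node)

def call_to_json(call_text):
    """Parse one call expression and serialise it as a JSON string."""
    tree = ast.parse(call_text.strip(), mode="eval")
    return json.dumps(call_to_dict(tree.body))

# Hypothetical generated call; the function names are illustrative only.
print(call_to_json("get_weather(coords=get_coordinates(city='Seattle'), days=3)"))
```

A nested call becomes a nested JSON object, so downstream software can inspect or validate each step before anything is executed.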
The post NexusRaven Outperforms GPT-4 for Zero-shot Function Calling appeared first on Analytics India Magazine.