G42 said that NANDA is a 13 Bn parameter model trained on approximately 2.13 Tn tokens of language datasets, including Hindi
With this, the UAE company bids to be another name in the Hindi LLM space besides homegrown AI startup Sarvam AI’s Hindi LLM ‘OpenHathi-Hi-v0.1’
G42 launched its open-source Arabic LLM ‘JAIS’ in August 2023 and claims that it transformed Arabic Natural Language Processing (NLP)
Abu Dhabi-based artificial intelligence (AI) and cloud computing company G42 has unveiled a Hindi large language model (LLM), NANDA. The LLM will be launched soon.
The Microsoft-backed company introduced NANDA in New Delhi in the presence of Abu Dhabi’s Crown Prince Zayed Al Nahyan during his state visit to India.
In a statement, G42 said that NANDA is a 13 Bn parameter model trained on approximately 2.13 Tn tokens of language datasets, including Hindi. It said that the release of the model will allow Hindi speakers to harness the potential of generative AI and empower India’s scientific, academic, and developer communities.
The model is being trained on Condor Galaxy, an AI supercomputer developed by the US-based Cerebras Systems and G42. Its development saw G42’s subsidiary Inception collaborate with Mohamed bin Zayed University of Artificial Intelligence and Cerebras Systems.
“With NANDA, we are heralding a new era of AI inclusivity, ensuring that the rich heritage and depth of Hindi language is represented in the digital and AI landscape,” Inception’s acting CEO Andrew Jackson said.
G42 launched its open-source Arabic LLM ‘JAIS’ in August 2023.
The company claims that JAIS transformed Arabic Natural Language Processing (NLP) and gave more than 400 Mn Arabic speakers globally access to native-language GenAI capabilities.
“With models ranging from 590 Mn to 70 Bn parameters, JAIS set a new standard for linguistic AI which G42 now seeks to replicate for other regions whose languages are still underrepresented,” the company said in a statement.
With this, the UAE company bids to be another name in the Hindi LLM space. In December 2023, homegrown AI startup Sarvam AI launched OpenHathi-Hi-v0.1, its first Hindi LLM. The LLM, built on Meta AI’s Llama2-7B architecture, delivers performance on par with GPT-3.5 for Indic languages, Sarvam claimed at the time of its launch.
At the heart of the Indianisation push is the vast opportunity that the Indian market presents. According to an Inc42 analysis, India’s GenAI market is projected to surpass the $17 Bn mark by 2030, up from $1.1 Bn in 2023, clocking a CAGR of 48%. India is currently home to more than 100 AI startups.
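As a rough sanity check on the growth figure, the compound annual growth rate implied by the two market sizes can be recomputed. The sketch below is illustrative only: the `cagr` helper is a hypothetical name, and it assumes seven annual compounding periods between 2023 and 2030, which the Inc42 analysis may or may not have used.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted in the article, in $ Bn.
# Assumption: 2030 - 2023 = 7 compounding periods.
rate = cagr(start_value=1.1, end_value=17.0, years=2030 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out at roughly 48%, in line with the quoted figure.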
However, only a handful are in the business of developing and deploying LLMs. Besides Sarvam’s OpenHathi, the list of Indian LLMs includes names like Krutrim’s Krutrim Pro, Kissan AI’s Dhenu, and CoRover.ai’s BharatGPT. In addition, conglomerates Reliance and Tata have partnered with global chip major NVIDIA to build their respective Hindi LLMs.