Stability AI Japan has released two Japanese language models: Japanese Stable LM 3B-4E1T and Japanese Stable LM Gamma 7B. The former has approximately 3 billion parameters, while the latter is a 7-billion-parameter model. Both have been made available under the Apache 2.0 license, permitting commercial use.
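Since the models are openly licensed, they can be pulled directly from the Hugging Face Hub. Below is a minimal sketch using the Transformers library; the repository ID follows Stability AI's usual naming convention and is an assumption here, so it should be verified against the official model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository ID; verify against Stability AI's model card.
model_id = "stabilityai/japanese-stablelm-base-gamma-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = "日本の首都は"  # "The capital of Japan is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```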
The new models are built upon previously released English language models: Stable LM 3B-4E1T, published by Stability AI in August 2023, and Mistral-7B-v0.1, published by Mistral AI in September 2023. Because those base models were trained on predominantly English data, they are highly proficient in English but had limited Japanese capabilities owing to the scarcity of Japanese text in their training corpora.
To strengthen their Japanese abilities, the models underwent continued pretraining on approximately 100 billion tokens of Japanese and English data drawn from sources such as Wikipedia, mC4, CC-100, OSCAR, and SlimPajama (excluding Books3).
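Continued pretraining of this kind typically interleaves web-scale text with a smaller share of higher-quality sources. The sketch below illustrates the general idea with the Hugging Face datasets library; the dataset IDs and the 90/10 mixing ratio are illustrative assumptions, not Stability AI's published recipe.

```python
from datasets import load_dataset, interleave_datasets

# Illustrative sources only; the actual mixture also drew on
# CC-100, OSCAR, and SlimPajama (excluding Books3).
ja_web = load_dataset("allenai/c4", "ja", split="train", streaming=True)  # mC4 Japanese
ja_wiki = load_dataset("wikimedia/wikipedia", "20231101.ja", split="train", streaming=True)

# Sample web text and Wikipedia at an assumed 90/10 ratio.
mixture = interleave_datasets([ja_web, ja_wiki], probabilities=[0.9, 0.1], seed=0)

for example in mixture.take(3):
    print(example["text"][:80])
```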
Performance was evaluated with the same methodology used for Japanese Stable LM Alpha, released in August 2023: eight tasks from the Japanese language understanding benchmark JGLUE, spanning sentence classification, sentence pair classification, question answering, and text summarisation.
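Benchmarks of this kind are commonly scored by comparing the log-likelihood a model assigns to each candidate answer rather than by free-form generation. The following sketch shows that technique on a made-up multiple-choice item in the style of JCommonsenseQA; it is a simplified stand-in for the actual evaluation harness, and the model ID is again an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository ID for the 3B model; verify against the model card.
model_id = "stabilityai/japanese-stablelm-3b-4e1t-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # the checkpoint may ship custom model code
)
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Total log-probability the model assigns to `continuation` given `prompt`.

    Simplification: prompt and continuation are tokenised together, which can
    merge tokens at the boundary; real harnesses handle this more carefully.
    """
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    cont_len = full_ids.shape[1] - prompt_len
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    targets = full_ids[0, 1:]
    scores = logprobs[-cont_len:].gather(1, targets[-cont_len:].unsqueeze(1))
    return scores.sum().item()

# Made-up multiple-choice item: "What do you need to ride a train?"
question = "質問: 電車に乗るために必要なものは何ですか? 答え: "
choices = ["切符", "パスポート", "傘"]  # ticket / passport / umbrella
best = max(choices, key=lambda c: continuation_logprob(question, c))
print(best)  # expected: 切符 (ticket)
```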
Despite having only 3 billion parameters, Japanese Stable LM 3B-4E1T outperformed the earlier Japanese Stable LM Base Alpha 7B on these benchmarks, while Japanese Stable LM Gamma 7B achieved even higher scores, underscoring the rapid progress these models represent for Japanese natural language processing.