A user on X, who goes by the name Joey (e/λ), shared a video where he ran ‘llama.c’ on a Samsung Galaxy Watch 4. Baby Llama was created by OpenAI’s Andrej Karpathy as a weekend project with the intention of running Llama 2 on edge devices.
Karpathy said that this approach was heavily inspired by Georgi Gerganov’s project – llama.cpp, which was almost the same project of using the first version of LLaMA on a MacBook using C and C++.
Karpathy’s approach involves training the Llama 2 LLM architecture from scratch using PyTorch. After training, he saves the model weights in a raw binary file. The interesting part comes next: he writes a 500-line C file, named ‘run.c‘, which loads the saved model and performs inferences using single-precision floating-point (fp32) calculations. This minimalistic approach ensures a low-memory footprint and requires no external libraries, allowing efficient execution on a single M1 laptop without the need for GPUs.
Karpathy also explores several techniques to improve the performance of the C code, including different compilation flags like -O3, -Ofast, -march=native, and more. These flags optimise the code by enabling vectorization, loop unrolling, and other hardware-specific tuning. By experimenting with these flags, users can achieve even faster inferences on their specific systems.
To try out the baby Llama 2 model on your own device, you can download the pre-trained model checkpoint from Karpathy’s repository. The provided code will enable you to compile and run the C code on your system, offering a glimpse into the magic of running a deep learning model in a minimalistic environment.
It’s crucial to note that Karpathy’s project is a weekend experiment and not intended for production-grade deployment, which he acknowledges. The primary focus of this endeavour was to demonstrate the feasibility of running Llama 2 models on low-powered devices using pure C code, a language that for a long time has been not regarded useful for machine learning as it does not involve GPUs.
The post Baby Llama Runs on Samsung Galaxy Watch 4 appeared first on Analytics India Magazine.