Baby Llama Runs on Samsung Galaxy Watch 4


A user on X who goes by the name Joey (e/λ) shared a video of ‘llama2.c’ running on a Samsung Galaxy Watch 4. The project, nicknamed Baby Llama, was created by OpenAI’s Andrej Karpathy as a weekend project with the intention of running Llama 2 on edge devices.

@karpathy llama2.c running on galaxy watch 4 pic.twitter.com/sMPCZM3WE4

— Joey (e/λ) (@shxf0072) December 18, 2023

Karpathy said that this approach was heavily inspired by Georgi Gerganov’s project llama.cpp, which had a very similar goal: running the first version of LLaMA on a MacBook using C and C++.

Karpathy’s approach involves training the Llama 2 LLM architecture from scratch using PyTorch. After training, he saves the model weights in a raw binary file. The interesting part comes next: he writes a 500-line C file, named ‘run.c’, which loads the saved model and performs inference using single-precision floating-point (fp32) calculations. This minimalistic approach keeps the memory footprint low and requires no external libraries, allowing efficient execution on a single M1 laptop without the need for GPUs.
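The idea of storing weights as raw fp32 bytes and reading them straight back into C can be sketched as follows. This is a minimal illustration, not Karpathy’s actual code: the function names `save_weights` and `load_weights` are hypothetical, and the file layout assumed here (a bare array of floats with no header) is a simplification. The `matmul` routine shows the kind of plain-loop matrix-vector product that dominates fp32 inference.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write n fp32 values to `path` as raw bytes (assumed layout: no header).
 * Returns 0 on success, -1 on failure. */
int save_weights(const char *path, const float *w, size_t n) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(w, sizeof(float), n, f);
    int close_failed = fclose(f);
    return (written == n && close_failed == 0) ? 0 : -1;
}

/* Read n fp32 values from `path` into a heap buffer.
 * The caller frees the result; returns NULL on failure. */
float *load_weights(const char *path, size_t n) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    float *w = malloc(n * sizeof(float));
    if (w && fread(w, sizeof(float), n, f) != n) {
        free(w);
        w = NULL;
    }
    fclose(f);
    return w;
}

/* out = W x, where W has d rows and n columns (row-major) and x has n entries. */
void matmul(float *out, const float *x, const float *w, int n, int d) {
    assert(n > 0 && d > 0);
    for (int i = 0; i < d; i++) {
        float acc = 0.0f;
        for (int j = 0; j < n; j++) {
            acc += w[i * n + j] * x[j];
        }
        out[i] = acc;
    }
}
```

Because only the C standard library is used, a file like this builds with any C compiler and needs nothing beyond the weights file at run time, which is what makes the approach attractive on small devices.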

Karpathy also explores several techniques to improve the performance of the C code, including different compilation flags like -O3, -Ofast, -march=native, and more. These flags optimise the code by enabling vectorisation, loop unrolling, and other hardware-specific tuning. By experimenting with these flags, users can achieve even faster inference on their specific systems.
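To see what these flags act on, consider a small reduction loop of the kind that appears inside a matrix multiply. The sketch below is illustrative only: `dot` and `time_dot` are hypothetical helper names, and any speed-up depends on the compiler and CPU. The build commands in the comment use real GCC flags; note that -Ofast relaxes strict floating-point rules, so results may differ slightly from an -O3 build.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Example builds to compare (file name is an assumption):
 *   gcc -O3 -o bench bench.c
 *   gcc -Ofast -march=native -o bench bench.c
 */

/* Dot product: a floating-point reduction. With strict fp semantics the
 * additions must stay in order; -Ofast permits reordering, which typically
 * lets the compiler vectorise this loop. */
float dot(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Time `reps` calls to dot() and return elapsed CPU seconds.
 * The running sum is written to *sink so the calls are not optimised away. */
double time_dot(const float *a, const float *b, size_t n, int reps, float *sink) {
    assert(reps > 0);
    clock_t start = clock();
    float total = 0.0f;
    for (int r = 0; r < reps; r++) {
        total += dot(a, b, n);
    }
    *sink = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_dot` on large arrays under each set of flags gives a rough, machine-specific comparison rather than a rigorous benchmark.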

To try out the baby Llama 2 model on your own device, you can download the pre-trained model checkpoint from Karpathy’s repository. The provided code will enable you to compile and run the C code on your system, offering a glimpse into the magic of running a deep learning model in a minimalistic environment.

It’s crucial to note that Karpathy’s project is a weekend experiment and, as he acknowledges, is not intended for production-grade deployment. The primary focus of this endeavour was to demonstrate the feasibility of running Llama 2 models on low-powered devices using pure C code, a language that has long been regarded as of little use for machine learning because it is not typically paired with GPUs.

The post Baby Llama Runs on Samsung Galaxy Watch 4 appeared first on Analytics India Magazine.

