this post was submitted on 02 Feb 2025
35 points (94.9% liked)

LocalLLaMA

3125 readers
34 users here now

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Lets explore cutting edge open source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members. I.E no namecalling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.E no comparing the usefulness of models to that of NFTs, no comparing the resource usage required to train a model is anything close to maintaining a blockchain/ mining for crypto, no implying its just a fad/bubble that will leave people with nothing of value when it burst.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.E statements such as "llms are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>.

Rule 4 - No implying that models are devoid of purpose or potential for enriching peoples lives.

founded 2 years ago
MODERATORS
 

Hello, everyone! I wanted to share my experience of successfully running LLaMA on an Android device. The model that performed the best for me was llama3.2:1b on a mid-range phone with around 8 GB of RAM. I was also able to get it up and running on a lower-end phone with 4 GB RAM. However, I also tested several other models that worked quite well, including qwen2.5:0.5b , qwen2.5:1.5b , qwen2.5:3b , smallthinker , tinyllama , deepseek-r1:1.5b , and gemma2:2b. I hope this helps anyone looking to experiment with these models on mobile devices!


Step 1: Install Termux

  1. Download and install Termux from the Google Play Store or F-Droid

Step 2: Set Up proot-distro and Install Debian

  1. Open Termux and update the package list:

    pkg update && pkg upgrade
    
  2. Install proot-distro

    pkg install proot-distro
    
  3. Install Debian using proot-distro:

    proot-distro install debian
    
  4. Log in to the Debian environment:

    proot-distro login debian
    

    You will need to log-in every time you want to run Ollama. You will need to repeat this step and all the steps below every time you want to run a model (excluding step 3 and the first half of step 4).


Step 3: Install Dependencies

  1. Update the package list in Debian:

    apt update && apt upgrade
    
  2. Install curl:

    apt install curl
    

Step 4: Install Ollama

  1. Run the following command to download and install Ollama:

    curl -fsSL https://ollama.com/install.sh | sh
    
  2. Start the Ollama server:

    ollama serve &
    

    After you run this command, do ctrl + c and the server will continue to run in the background.


Step 5: Download and run the Llama3.2:1B Model

  1. Use the following command to download the Llama3.2:1B model:
    ollama run llama3.2:1b
    
    This step fetches and runs the lightweight 1-billion-parameter version of the Llama 3.2 model .

Running LLaMA and other similar models on Android devices is definitely achievable, even with mid-range hardware. The performance varies depending on the model size and your device's specifications, but with some experimentation, you can find a setup that works well for your needs. I’ll make sure to keep this post updated if there are any new developments or additional tips that could help improve the experience. If you have any questions or suggestions, feel free to share them below!

– llama

you are viewing a single comment's thread
view the rest of the comments