🦙 Ollama Locals — Best Models for 16GB Macs
Researched by Loomy Bot. Ranked best to worst. Run locally or launch with Hermes.
🥇 Qwen3.5 9B (5.3 GB)
Best overall. Hermes official pick. Beats models 3x its size.
Pull: ollama pull qwen3.5:9b
Run: ollama run qwen3.5:9b
Hermes: hermes chat -m qwen3.5:9b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3.5:9b
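The Hermes line above works because Ollama exposes an OpenAI-compatible API at http://localhost:11434/v1, so any OpenAI-style client can talk to the same local model. A minimal sketch using only the standard library (the model tag is taken from this entry; no extra packages assumed):

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload for the local Ollama server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete reply instead of a token stream
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat turn to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the model pulled and `ollama serve` running, `chat("qwen3.5:9b", "Say hello")` returns the model's reply as a plain string.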
🥈 Qwen3 30B-A3B MoE (6 GB)
30B quality at 3B speed. The secret weapon for small Macs.
Pull: ollama pull qwen3:30b-a3b
Run: ollama run qwen3:30b-a3b
Hermes: hermes chat -m qwen3:30b-a3b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3:30b-a3b
🥉 Qwen3-Coder 30B-A3B MoE (6 GB)
Same MoE trick, optimized for code + agent tasks.
Pull: ollama pull qwen3-coder:30b-a3b
Run: ollama run qwen3-coder:30b-a3b
Hermes: hermes chat -m qwen3-coder:30b-a3b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3-coder:30b-a3b
4. GLM-4.7-Flash (5 GB)
Most obedient tool caller. More precise than Qwen.
Pull: ollama pull glm-4.7-flash
Run: ollama run glm-4.7-flash
Hermes: hermes chat -m glm-4.7-flash --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/glm-4.7-flash
5. Qwen3 8B (5 GB)
Reliable all-rounder with /think mode.
Pull: ollama pull qwen3:8b
Run: ollama run qwen3:8b
Hermes: hermes chat -m qwen3:8b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3:8b
6. DeepSeek R1 14B (8.7 GB)
Reasoning champion. Shows its thinking chain. Tight fit on 16GB.
Pull: ollama pull deepseek-r1:14b
Run: ollama run deepseek-r1:14b
Hermes: hermes chat -m deepseek-r1:14b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/deepseek-r1:14b
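The "thinking chain" that R1 shows arrives inline, wrapped in `<think>...</think>` tags before the final answer. If you script against it, you usually want to separate the two; a small sketch (assumes the tags reach your client verbatim, which can vary by Ollama version and settings):

```python
import re

# DeepSeek R1 emits its chain of thought inside <think>...</think>
# before the final answer; split the two for display or logging.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(reply: str) -> tuple[str, str]:
    """Return (thinking, answer) from a raw R1 reply."""
    m = THINK_RE.search(reply)
    if not m:
        # No tags: treat the whole reply as the answer.
        return "", reply.strip()
    thinking = m.group(1).strip()
    answer = THINK_RE.sub("", reply, count=1).strip()
    return thinking, answer
```

Useful when you want to log the reasoning but show users only the final answer.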
7. Qwen3 14B (9 GB)
Big quality jump from 8B. Tight on 16GB.
Pull: ollama pull qwen3:14b
Run: ollama run qwen3:14b
Hermes: hermes chat -m qwen3:14b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3:14b
8. Qwen3-Coder 14B (9 GB)
Code-specialized 14B. Same tight fit.
Pull: ollama pull qwen3-coder:14b
Run: ollama run qwen3-coder:14b
Hermes: hermes chat -m qwen3-coder:14b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen3-coder:14b
9. Phi-4 14B (9 GB)
STEM specialist. Fast and precise.
Pull: ollama pull phi4:14b
Run: ollama run phi4:14b
Hermes: hermes chat -m phi4:14b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/phi4:14b
10. Phi-4 Mini 3.8B (2.5 GB)
Tiny but smart. Good for quick Q&A, NOT for agent tasks.
Pull: ollama pull phi4-mini
Run: ollama run phi4-mini
Hermes: hermes chat -m phi4-mini --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/phi4-mini
11. Llama 3.1 8B (4.7 GB)
Classic pick, good for RAG/docs.
Pull: ollama pull llama3.1:8b
Run: ollama run llama3.1:8b
Hermes: hermes chat -m llama3.1:8b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/llama3.1:8b
12. Qwen2.5 Coder 7B (4.4 GB)
Best small dedicated coder.
Pull: ollama pull qwen2.5-coder:7b
Run: ollama run qwen2.5-coder:7b
Hermes: hermes chat -m qwen2.5-coder:7b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/qwen2.5-coder:7b
13. DeepSeek R1 7B (5 GB)
Basic reasoning on a budget.
Pull: ollama pull deepseek-r1:7b
Run: ollama run deepseek-r1:7b
Hermes: hermes chat -m deepseek-r1:7b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/deepseek-r1:7b
14. Llama 3.2 Vision 11B (8 GB)
Image analysis/OCR. Not for agent work.
Pull: ollama pull llama3.2-vision:11b
Run: ollama run llama3.2-vision:11b
Hermes: hermes chat -m llama3.2-vision:11b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/llama3.2-vision:11b
15. Gemma 3 4B (3 GB)
Lightweight tasks only.
Pull: ollama pull gemma3:4b
Run: ollama run gemma3:4b
Hermes: hermes chat -m gemma3:4b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/gemma3:4b
16. Llama 3.2 3B (2 GB)
Starter model for 8GB Macs.
Pull: ollama pull llama3.2
Run: ollama run llama3.2
Hermes: hermes chat -m llama3.2 --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/llama3.2
17. Mistral 7B (4.1 GB)
Outperformed by Qwen in 2026. Skip.
Pull: ollama pull mistral:7b
Run: ollama run mistral:7b
Hermes: hermes chat -m mistral:7b --provider custom --base-url http://localhost:11434/v1
Switch: /model ollama/mistral:7b
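The "tight fit" warnings in this list come down to simple arithmetic: the model's download size must share 16 GB of unified memory with macOS and the KV cache. A rough sketch of that check; the 5 GB system reserve and 1.5 GB runtime overhead are assumptions for illustration, not measurements:

```python
# Rough fit check for a 16 GB unified-memory Mac.
# ASSUMPTIONS (not measurements): macOS + apps hold ~5 GB,
# and KV cache / runtime overhead adds ~1.5 GB at modest context.
SYSTEM_RESERVE_GB = 5.0
RUNTIME_OVERHEAD_GB = 1.5

def fit_verdict(model_size_gb: float, ram_gb: float = 16.0) -> str:
    """Classify how comfortably a model of this size fits in RAM."""
    free = ram_gb - SYSTEM_RESERVE_GB - RUNTIME_OVERHEAD_GB
    if model_size_gb <= free - 1.0:  # headroom to spare
        return "comfortable"
    if model_size_gb <= free:
        return "tight"
    return "swapping likely"
```

Under these assumptions the 5-6 GB picks at the top of the list come out "comfortable", while the 8.7-9 GB models land in "tight", matching the notes above. Checking `ollama ps` after loading a model shows the real figure for your machine.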