You’ve heard the pitch: run AI privately, offline, on your own hardware — no API keys, no usage limits, no data leaving your machine. You open Hugging Face, find a model called Qwen3-30B-A3B-GGUF, download 20GB, try to run it, and your laptop grinds to a halt or produces nothing at all. The problem isn’t that local […]

Sharing local LLM models between Ollama and llama.cpp seems like a niche concern until you’ve burned through tens of GB of disk space on duplicate copies of the same model. The two tools use completely different storage formats by default, but you can configure them to share one file. […]

Google DeepMind released Gemma 4 on April 2, 2026 under Apache 2.0. It’s their fourth-generation open model family, and it runs locally with surprisingly little friction. Here are three ways to get it going, depending on what hardware you have in front of you. […]

If you’ve tried running a local model through Ollama with Claude Code and been greeted by this message:

“There’s an issue with the selected model (qwen3-coder:30b). It may not exist or you may not have access to it. Run /model to pick a different model.”

…even though the model is clearly installed and runs fine […]