A plain-English reference guide covering the jargon that shows up every time a new language model drops, from parameter counts to quantization methods.

Contents
01 · Architecture & Model Design: Transformer · Dense Model · Mixture of Experts · Active Parameters · Feed-Forward Network · Layers · Hidden Dimension · Attention Heads
02 · Attention Mechanisms: Multi-Head Attention · Multi-Query Attention · Grouped-Query Attention · KV Cache · Sliding Window Attention · RoPE · RoPE Theta
03 · Sizing, Scale & Counting: Parameters · Embedding Parameters · Non-Embedding […]


A format designed for bloggers in 2004 now sits at the center of how AI systems read, write, and think.

How we got here

If you work with LLMs at all, you’ve probably noticed something: Markdown is everywhere. Ask Claude a question, you get Markdown back. Ask GPT-4, same thing. Feed a […]


DeepSeek, a powerful open-source LLM, is easy to run locally on a desktop or laptop using Ollama; I’m using an M1 MacBook Pro with 32GB of memory. Ollama simplifies running large language models by handling dependencies and providing a consistent interface. This guide walks you through installing DeepSeek via Ollama, making it accessible with just a […]

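
For a rough idea of what the "consistent interface" looks like once a model is installed, here is a minimal sketch that talks to a local Ollama server over its HTTP API. It assumes Ollama is running on its default port (11434) and that a DeepSeek model has already been pulled; the tag `deepseek-r1` is an assumption, since the excerpt does not say which DeepSeek variant the guide uses, and the helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1"):
    """Build a non-streaming generate request.

    The model tag is an assumption; use whichever DeepSeek tag you pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="deepseek-r1"):
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        # With "stream": False the server answers with a single JSON object
        # whose "response" field holds the generated text.
        return json.loads(resp.read())["response"]
```

With the server running, `ask("Explain what a KV cache is in one sentence.")` should return the model's answer as a string. Setting `"stream": False` keeps the sketch short; a streaming client would instead read one JSON object per line.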