Question 1

What is this local AI field guide?

Accepted Answer

Cairn tells you which open-weight LLMs your machine can actually run. It detects your GPU through browser APIs, then estimates VRAM usage and tokens per second for 50+ models.

Question 2

How the VRAM readout works

Accepted Answer

Each model shows the VRAM it needs at Q4_K_M quantization, both in GB and as a percentage of your GPU's VRAM. Anything over 100% won't fit. Tokens per second is estimated as your GPU's memory bandwidth divided by model size. It's a rough figure, usually within 20% of real-world numbers.
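
To make the arithmetic concrete, here is a minimal Python sketch of the two calculations described above. It is not Cairn's actual code: the function names are made up for illustration, and the example figures (a 4 GB model, an 8 GB GPU with 256 GB/s of memory bandwidth) are hypothetical, not taken from any real model or card.

```python
def vram_readout(model_gb: float, gpu_vram_gb: float) -> tuple[float, bool]:
    """Return (percent of GPU VRAM the model needs, whether it fits).

    Hypothetical helper: assumes model_gb is the model's VRAM need
    at Q4_K_M and that anything over 100% does not fit.
    """
    percent = model_gb / gpu_vram_gb * 100
    return percent, percent <= 100


def estimate_tokens_per_second(bandwidth_gb_s: float, model_gb: float) -> float:
    """Rough estimate: memory bandwidth divided by model size."""
    return bandwidth_gb_s / model_gb


# Hypothetical numbers, for illustration only.
percent, fits = vram_readout(model_gb=4.0, gpu_vram_gb=8.0)
tps = estimate_tokens_per_second(bandwidth_gb_s=256.0, model_gb=4.0)
print(f"{percent:.0f}% of VRAM, fits: {fits}, ~{tps:.0f} tokens/s")
```

With these inputs the model takes half the card's VRAM and the estimate comes out to about 64 tokens per second. Treat that as a ballpark figure, not a benchmark.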

Question 3

Privacy — everything runs in your browser

Accepted Answer

No data leaves your browser. Detection, scoring, and ranking all run client-side.

Question 4

Data sources for local LLM specs

Accepted Answer

Model specs come from the llama.cpp, Ollama, and LM Studio compatibility lists.