| Knowledge freshness | Reindex on demand. Same-day updates trivial. | Stale at training time. New facts require a new run. | Fresh facts via retrieval, learned style baked in. |
| Cost per query | Higher. Retrieval + longer prompts. | Lower. Smaller prompts after training. | Between the two; depends on retrieval depth. |
| Cost per training run | Zero. No training step. | Real. From a few dollars (LoRA on 7B) to four figures on large models. | Real. Same as fine-tuning plus retrieval setup. |
| Citation support | Native. Each chunk has a source. | Not honest. Cannot trace weights to a doc. | Citations come from the retrieval half. |
| Latency | +80-300ms for retrieval depending on index. | Lower. No retrieval step. Smaller models possible. | Same as RAG, plus the inference path. |
| Best-fit corpus size | Tens to millions of docs. | Knowledge-agnostic. Tunes behaviour, not facts. | Any corpus size. |
| Domain adaptation | Limited. Model still reasons in its base style. | Strong. Adapts tone, schema, persona, refusal policy. | Best of both. Most-used pattern in production. |
| Time to ship | Days for a baseline; weeks to harden eval. | Weeks. Includes data collection + label QA. | Weeks. Both pipelines must land. |
| When NOT to use it | Fixed style requirements, sub-200ms budgets. | Daily-changing facts, citation requirements. | Tiny corpora or single-purpose prompts. |