AI fundamentals: a learner's glossary from LLMs to quantization

Notes from learning modern AI — LLMs, tokenization, vectors, RAG, MCP, reasoning models, and how models get smaller and faster without starting from scratch.

May 27, 2026

AI fundamentals: a learner's glossary

I have been learning how modern AI systems actually work: not just how to call an API, but what happens under the hood when text goes in and text comes out. These notes cover one concept per section, written in plain language while I am still connecting the dots myself.

LLM (Large Language Model)

A large language model is a neural network trained to predict the next token in a sequence, given all the tokens that come before it.

At each step the model does not commit to a word directly. It assigns a score to every token in its vocabulary and converts those scores into probabilities. Then either the most likely candidate or one sampled from the distribution is appended to the sequence. That process repeats (predict, pick, append, predict again) until the model signals that the sequence is complete.
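To make the "score, then pick" step concrete, here is a minimal sketch in plain Python. The scores and the three candidate tokens are made up for illustration, and the function names (softmax, pick_greedy, pick_sampled) are my own labels, not any library's API. A real model scores tens of thousands of tokens at once.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def pick_greedy(probs):
    """Always take the single most likely token."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical scores for the token that follows "The cat sat on the"
scores = {" mat": 3.0, " sofa": 2.0, " moon": 0.5}

probs = softmax(scores)
greedy_token = pick_greedy(probs)
sampled_token = pick_sampled(probs, random.Random(0))
```

Greedy picking gives the same answer every time, while sampling can occasionally choose a less likely token, which is one reason the same prompt can produce different replies.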

How does it know when to stop? During training, sequences end with a special end-of-sequence token (often written as <eos> or </s>). The model learns that once a reply is complete, the probability of that end token rises. When the end token is selected, generation stops. In practice, systems also enforce a maximum length in case the end token never comes.
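The stopping rule can be sketched as a small loop. Everything here is an assumption for illustration: TOY_MODEL is a hand-written lookup table standing in for a trained network, the spelling "<eos>" is just one common convention, and max_new_tokens is a hypothetical safety cap of the kind most generation loops use.

```python
EOS = "<eos>"  # assumed spelling of the end-of-sequence token

# Hypothetical stand-in for a trained model: maps the last token
# to probabilities for the next one. A real model looks at the
# whole sequence, not just the last token.
TOY_MODEL = {
    "Hello": {",": 0.9, EOS: 0.1},
    ",": {"world": 0.8, EOS: 0.2},
    "world": {"!": 0.6, EOS: 0.4},
    "!": {EOS: 0.95, "!": 0.05},
}

def next_token_probs(tokens):
    """Return next-token probabilities for the sequence so far."""
    return TOY_MODEL[tokens[-1]]

def generate(prompt_tokens, max_new_tokens=10):
    """Predict, pick, append, repeat; stop on EOS or the length cap."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)
        choice = max(probs, key=probs.get)  # greedy pick
        if choice == EOS:
            break  # the end token won, so generation stops
        tokens.append(choice)
    return tokens

output = generate(["Hello"])
capped = generate(["Hello"], max_new_tokens=2)
```

With the toy table above, the loop ends on its own once the end token becomes the most likely choice. The capped call shows the other exit: generation is cut off after two new tokens even though the end token has not yet won.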

AI fundamentals: a learner's glossary

LLM (Large Language Model)

Tokenization

Vectors (embeddings)

Self-supervised learning

Transformer

Fine-tuning

Few-shot prompting

RAG (Retrieval Augmented Generation)

Vector database

MCP (Model Context Protocol)

Context engineering

Reinforcement learning (from human feedback)

Chain of thought (reasoning models)

Multi-modal models

SLM (Small Language Model)

Parameters

Distillation

Quantization

How these pieces fit together