AI Capabilities and Limitations - Certification Study Guide

27 min read 5548 words

AI Capabilities and Limitations - Certification Study Guide

Course: AI Capabilities and Limitations Modules: 6 Target: All professionals seeking a deeper technical understanding of how LLMs work Difficulty: Intermediate

MODULE 1: Getting Started

Key Notes

Scope of “AI” in this course: Large Language Models (LLMs) specifically — not robotics, computer vision, or narrow AI systems
LLMs: Claude, GPT-4, Gemini, Llama, Mistral — text-in, text-out systems at their core
Understanding HOW AI works is not just academic — it directly explains WHY AI behaves the way it does

How AI Gets Its Character — 4-Stage Training Process:

  Stage 1: PRE-TRAINING
  ─────────────────────────────────────────────────────────────
  Trained on massive text datasets (web, books, code, etc.)
  Learns statistical patterns: what words/concepts follow others
  Develops broad world knowledge and language capability
  Result: a model that can predict text, but with no alignment
  Scale: trillions of tokens; months of compute

  Stage 2: SUPERVISED FINE-TUNING (SFT)
  ─────────────────────────────────────────────────────────────
  Human trainers demonstrate preferred responses
  Model learns to produce helpful, formatted, structured answers
  Result: a model that can follow instructions and be useful
  Scale: thousands to millions of curated examples

  Stage 3: REINFORCEMENT LEARNING FROM HUMAN FEEDBACK (RLHF)
  ─────────────────────────────────────────────────────────────
  Human raters compare responses and rank them
  A "reward model" is trained on these preferences
  The LLM is fine-tuned to maximize reward model scores
  Result: better alignment with human values and preferences

  Stage 4: CONSTITUTIONAL AI / ADDITIONAL ALIGNMENT (varies by provider)
  ─────────────────────────────────────────────────────────────
  Explicit principles guide model behavior
  Model learns to critique and revise its own outputs
  Anthropic's approach: a "constitution" of values + self-critique
  Result: stronger safety and values alignment

Key Insight: AI is a statistical pattern matcher shaped by human feedback — not a thinking being, not a database, not a search engine. Understanding this unlocks understanding of every capability and limitation covered in this course.

What “AI character” means:

Tone, helpfulness, caution, verbosity = shaped by training choices
Safety behaviors = trained, not hard-coded rules
Knowledge = statistical patterns, not stored facts
Reasoning = pattern-based, not logical computation
Different providers make different training choices → different behaviors across models

Training Data Composition (approximate, varies by model):

Web crawl data (Common Crawl and similar): largest share — broad but noisy
Books and long-form text: higher quality writing patterns
Code repositories: programming knowledge and structure
Academic and scientific text: domain knowledge
Curated datasets: filtered for quality
The composition shapes what the model “knows” well vs. poorly

Best Practices

When an AI behaves unexpectedly, ask: “Which stage of training might explain this?”
Treat AI outputs as the most statistically likely response to your input, not the factually correct answer
Understand that different AI providers make different training choices — behavior varies across models

Example

Claude declines to help with certain requests not because it cannot produce the text, but because training shaped it to decline. A different model with different training choices might respond differently. Neither is “smarter” — they reflect different training priorities.

Try It Out: Ask two different AI models the same borderline request (e.g., “write a persuasive essay arguing a controversial position”). Note: do they respond differently? What does the difference tell you about their training choices?

MODULE 2: Next Token Prediction

Key Notes

The core mechanism of every LLM: predict the next most likely token given all previous tokens
This single mechanism explains both the most impressive capabilities and the most confounding failures

What is a Token?

Token ≠ word. A token is roughly 4 characters or 0.75 words on average
“Unbelievable” → approximately 3 tokens: “Un” + “bel” + “ievable”
“the” → 1 token (common words are compact)
“mRNA” → may be 2–3 tokens (technical/rare terms cost more)
Chinese and Japanese characters may be 1–2 tokens each
Whitespace and punctuation are often their own tokens
Why it matters: context windows are measured in tokens, not words — 100K token limit ≠ 100K words

Token Counting Examples:

  "Hello, world!"             → ~4 tokens
  "The quick brown fox"       → ~4 tokens
  "Electroencephalography"    → ~6 tokens
  "AI will transform every
   industry in the next
   decade, experts say."      → ~17 tokens
  Average English text:       ~750 words ≈ 1,000 tokens

How Next-Token Prediction Produces Intelligent-Seeming Output:

  Prompt: "The capital of France is"

  Model calculates probability of every possible next token:
  "Paris"      → 94.2% probability
  "Lyon"       → 1.1% probability
  "a"          → 0.8% probability
  "situated"   → 0.6% probability
  ...

  Selects "Paris" → then predicts next token given "...France is Paris"
  → and so on until a stop token or length limit

Temperature — Controlling Randomness:

Temperature	Effect	Token Selection	Best For
0.0	Fully deterministic	Always highest-probability token	Factual lookups, code generation
0.1–0.3	Near-deterministic	Nearly always top token, slight variation	Technical writing, precise instructions
0.5–0.7	Balanced	Top tokens with moderate variation	General writing, analysis
0.9–1.0	Creative	Broader distribution, more surprise	Brainstorming, creative writing
1.5+	High randomness	Very broad; often incoherent	Experimental use only

Side-by-Side Temperature Examples (same prompt):

  Prompt: "Describe how clouds form."

  Temperature 0.1:
  "Clouds form when water vapor in the atmosphere cools and condenses
   around tiny particles called condensation nuclei, such as dust or
   sea salt. As warm, moist air rises, it expands and cools..."
  [Accurate, predictable, textbook-style]

  Temperature 1.0:
  "Clouds are born from invisible breath — the sky's own exhalation.
   When warm air climbs, carrying its invisible cargo of water molecules,
   it reaches the cold heights where those molecules must make a choice..."
  [More creative, less predictable, still generally accurate]

  Temperature 1.5:
  "Clouds. Water. Cold meets warm in the atmospheric ballet of
   condensation nuclei dancing with vapor — sky's tears waiting to happen,
   meteorologically speaking, of course..."
  [Fragmented, creative, unreliable for factual accuracy]

Capabilities That Emerge from Next-Token Prediction:

Fluent language: predicting plausible next tokens produces grammatical, coherent text naturally
Code generation: code has high statistical regularity — prediction works extremely well for common patterns
Style matching: if you write formally, subsequent tokens are predicted to be formal
Apparent reasoning: chain-of-thought works because reasoning steps are statistically likely to follow each other correctly
Translation: cross-lingual training data teaches token-to-token mappings across languages
Summarization: common text patterns around “in summary” and “key points” provide strong training signal

Limitations That Emerge from Next-Token Prediction:

Hallucinations: the model predicts the most statistically plausible next token — not the most factually accurate. A confident, specific-sounding false statement can be MORE statistically likely than “I don’t know.”
No self-correction by default: the model does not verify output against a knowledge store — it just keeps predicting
Sensitive to phrasing: small changes in input change the probability distribution, which changes the output significantly
Not actually counting or computing: apparent mathematical ability is pattern matching, not calculation — errors increase with complexity

Why Hallucinations Are Fundamental (not a bug to be fixed):

  The model is doing exactly what it is designed to do:
  predicting statistically likely text.

  "The study found that [X]% of adults..."
  → A specific percentage is statistically likely after this phrase
  → The model predicts one (e.g., "73%")
  → The model has no mechanism to check if 73% is real
  → It just knows "73" is a plausible token sequence here

  "According to [Author] ([Year])..."
  → A plausible author name and year are statistically likely
  → The model predicts them confidently
  → The citation may be entirely fabricated

  This is not a failure of the mechanism — it IS the mechanism.
  Verification requires a separate system (search, RAG, human review).

Fluent ≠ Accurate — The Core Confusion: Most AI failures come from users trusting fluency as a signal of accuracy. They are completely independent:

A hallucination reads exactly as well as a true statement
Grammatical perfection has zero correlation with factual correctness
Confident tone is a property of the training data, not a signal of certainty

Best Practices

For precision-critical outputs (statistics, citations, names, dates), always verify externally
Use lower temperature settings for factual or technical tasks; higher for creative tasks
Provide the factual content you need in your prompt — don’t expect the model to recall it accurately
Treat confident AI assertions about specific facts with the same skepticism as a very well-read but unreliable friend

Try It Out: Ask an AI a factual question where you already know the precise answer (a specific statistic from your field, a date, a technical specification). Note whether the answer is correct, close, or wrong — and observe how confidently it was stated regardless of accuracy.

Example

A journalist asks AI for statistics on youth homelessness to include in an article. The AI produces several specific percentages with apparent source attribution. The journalist fact-checks all of them — two are close but not exact, one cannot be found anywhere, and one cites a real report but with a different number. The journalist uses none of the AI statistics, instead asking AI to help structure the article once they have verified the data themselves.

MODULE 3: Knowledge

Key Notes

What LLMs “know”: statistical patterns derived from training data, not stored facts in a database
The distinction matters: a database knows a fact with certainty; an LLM predicts what text about a topic typically says

Training Data Composition and Knowledge Depth:

  HIGH-QUALITY KNOWLEDGE (frequent, consistent training data):
  ─────────────────────────────────────────────────────────────
  Major world events and history (well-documented)
  Popular programming languages (Python, JavaScript, etc.)
  Well-documented science and mathematics
  English-language literature and culture
  Widely-covered current events (up to cutoff)

  VARIABLE KNOWLEDGE (less frequent or inconsistent data):
  ─────────────────────────────────────────────────────────────
  Niche academic fields
  Less-documented languages and cultures
  Recent developments near the knowledge cutoff
  Rapidly evolving fields (AI itself, recent legislation)
  Regional or local information

  LOW-QUALITY / HIGH-HALLUCINATION RISK:
  ─────────────────────────────────────────────────────────────
  Obscure topics with minimal internet presence
  Private organizational knowledge (your company, your team)
  Information that is mostly paywalled or not on the web
  Very recent events (post-cutoff)
  Highly technical specializations with small communities

Knowledge Capabilities:

Capability	Description	Reliability
Breadth	Enormous — trained on virtually every domain	High
Depth on common topics	Strong — common topics appear frequently in training	High
Depth on specialized/rare topics	Variable to low	Medium–Low
Cross-domain synthesis	Strong — can connect concepts across fields	High
Historical knowledge	Strong for well-documented history	High
Technical knowledge	Strong for widely documented technologies	High
Recent events (near cutoff)	Partial, inconsistent	Medium
Post-cutoff events	None — will hallucinate or hedge	Zero

Knowledge Cutoff Implications:

  WELL BEFORE CUTOFF:    Strong, consistent knowledge available
  NEAR CUTOFF:           Partial, inconsistent, may be wrong
  AFTER CUTOFF:          No knowledge — model will either hallucinate
                         or correctly hedge ("I don't have info on that")

  HIGH-RISK DOMAINS FOR CUTOFF ISSUES:
  ─────────────────────────────────────────────────────────────
  Current events and breaking news
  Recently passed or amended legislation
  Current drug approvals and clinical guidelines
  Latest software versions, APIs, and frameworks
  Current market prices and economic data
  Recently published research
  Changes in organizational leadership
  Sports results and standings
  AI model capabilities (the field moves faster than any cutoff)

How AI “Knowledge” Actually Works: AI does not retrieve stored facts. It generates text that is statistically consistent with what training data said about a topic:

Topics that appeared frequently and consistently → high accuracy
Topics with conflicting information in training → AI may present one view or hedge
Topics barely present in training → high hallucination risk, low accuracy
Your private organizational data → AI has no knowledge whatsoever unless you provide it

The RAG Pattern (Retrieval-Augmented Generation):

  THE PROBLEM:
  "What does our policy say about expense reimbursement?"
  → AI may hallucinate a policy that doesn't match yours

  THE SOLUTION (RAG):
  Paste the actual policy text into the prompt, then ask:
  "Based on the policy text above, what does it say about
   expense reimbursement?"
  → AI now reasons from the actual text you provided

  WHY IT WORKS:
  The context window contains the authoritative information.
  Next-token prediction now draws on that text, not training data.
  Accuracy is dramatically higher for specific organizational content.

  COMMON RAG APPLICATIONS:
  ─────────────────────────────────────────────────────────────
  Organizational policies and procedures
  Product documentation and specs
  Legal contracts and agreements
  Scientific papers and reports (paste abstracts/excerpts)
  Financial statements and reports
  Meeting notes and prior decisions

Handling Knowledge Limitations — Decision Framework:

Situation	Strategy
Need current events	Provide context documents; use search-augmented AI
Need recent research	Paste abstracts or excerpts into the prompt
Need organizational knowledge	RAG — paste the relevant document
Need precision facts	Verify independently; don’t rely on AI recall
Need proprietary information	AI cannot know it; provide it explicitly
Near knowledge cutoff	Ask AI to flag its uncertainty; verify externally

Best Practices

Always check the knowledge cutoff date of the AI model you are using
For any claim about recent events, current data, or rapidly changing fields: verify externally
Use the RAG pattern for organizational knowledge: paste documents into context rather than expecting AI to know them
Ask AI to indicate its confidence and flag where its knowledge may be dated

Try It Out: Ask an AI about a development in your field from the last 12 months. Then ask about something that happened 5 years ago in the same field. Compare: where does it hedge? Where is it confident? Does confidence correlate with accuracy in your experience?

Example

A policy analyst asks AI to summarize the current regulatory landscape for fintech lending. AI produces a confident, well-structured summary — but the analyst notices it does not mention a major regulatory change from 8 months ago. The AI’s training cutoff predates the change. The analyst uses AI’s historical framework as a structure, then manually updates with current regulatory information from primary sources.

MODULE 4: Working Memory

Key Notes

Context window = the “working memory” of an LLM — everything it can “see” at once
Everything inside the context window influences the model’s predictions
Everything outside the context window does not exist for the model

Context Window Basics:

Concept	Explanation
Context window	Maximum tokens model can process in one interaction
Input tokens	Tokens in your prompt, documents, conversation history
Output tokens	Tokens the model generates in its response
Total = input + output	Must fit within context window limit
Conversation history	Each turn’s messages accumulate in the context window

Context Window Sizes (approximate, vary by model and version):

Scale	Approximate Tokens	Approximate Words	What Fits
Small	4K–8K tokens	3K–6K words	A few pages of text
Medium	32K–128K tokens	24K–96K words	A short book or long report
Large	200K tokens	~150K words	A full novel
Extended	1M+ tokens	~750K words	Multiple books

The “Lost in the Middle” Effect — Critical Concept:

  CONTEXT STRUCTURE:
  [Beginning] ──────────── [Middle] ──────────── [End]

  Model attention tends to be stronger at the beginning and end.
  Information buried deep in a very long middle section
  may be recalled less reliably in the model's output.

  PRACTICAL IMPLICATION:
  Put your most critical instructions at the BEGINNING of your prompt.
  Put your most important constraints at the END as a reminder.
  Do NOT bury key requirements in the middle of a long preamble.

  RESEARCH FINDING:
  In long contexts, models can "lose" information from the middle even
  when it would be retrievable from shorter contexts. The effect is
  more pronounced in older/smaller models; newer large-context models
  handle it better, but the risk never fully disappears.

Context Window Capabilities:

Can process entire long documents in one interaction (within limits)
Maintains conversation coherence across many turns
Can follow complex multi-step instructions when all steps are in context
Can reference material from earlier in the conversation
Can “remember” persona and constraints set at the start of the session

Context Window Limitations:

Limitation	What It Means	Practical Impact
Finite size	Cannot hold unlimited information	Very long contexts must be truncated or summarized
Lost in the middle	Middle content may be recalled less reliably	Place critical info at start/end
No persistence	Context resets between conversations	No memory of past sessions by default
Cost scales with tokens	Longer contexts cost more to process	Budget implications for API use
Attention degradation	Tracking degrades in very long contexts	Reliability decreases with context length
Output length limit	Responses are also token-limited	Long generations may be cut off

Strategies for Managing Long Contexts:

  STRATEGY 1: Chunking
  Break a large document into sections.
  Process each section individually.
  Ask AI to produce a summary after each section.
  At the end, feed all summaries to AI for synthesis.

  STRATEGY 2: Selective extraction
  Instead of pasting the full document, extract only relevant sections.
  "Here is Section 3.2 of the contract regarding liability..."
  More precise, less noise, better results.

  STRATEGY 3: Summarization handoff
  In a very long conversation, ask AI to summarize the key points so far.
  Start a new conversation with that summary as the opening context.
  Preserves the key information without the full token cost.

  STRATEGY 4: Critical info placement
  Always put key instructions at the start of the prompt.
  Restate critical constraints at the end: "Remember: output as JSON only."
  Don't rely on the model to recall buried middle content.

  STRATEGY 5: /compact or conversation management
  Use available tools to compact/summarize long conversations.
  Some AI interfaces offer built-in context management features.

No Persistent Memory (by default):

Standard AI has no memory between conversations
Each new conversation starts from zero — it does not know you or your history
Workaround: paste relevant context from previous sessions into the new conversation
Some tools offer memory features — understand exactly what is stored, how, and for how long
Memory features vary significantly by provider and product tier

Practical Limits for Common Tasks:

  FITS COMFORTABLY IN CONTEXT:
  ─────────────────────────────────────────────
  A 20-page report (200K+ token models)
  An hour of meeting transcript (~7,500 words)
  A short codebase (single files or small modules)
  A research paper with discussion (10,000–15,000 words)

  APPROACHES LIMITS / REQUIRES MANAGEMENT:
  ─────────────────────────────────────────────
  A full book (50,000+ words)
  A large codebase (multiple files, many functions)
  A months-long email chain
  Multiple long documents simultaneously

  BEYOND PRACTICAL LIMITS (even large context models):
  ─────────────────────────────────────────────
  An entire database
  All files in a large project
  Continuous real-time data streams

Best Practices

Place the most critical instructions at the beginning of your prompt, not the end of a long preamble
For very long documents, guide AI to specific sections rather than asking it to “read everything”
Periodically summarize and restate key context in long conversations
Never assume AI remembers anything from a previous session

Try It Out: Start a conversation with AI, establish a specific persona and set of constraints (e.g., “You are a concise communicator. Always respond in exactly 3 bullet points. Never use more than 10 words per bullet.”). Have a long back-and-forth exchange on an unrelated topic (20+ messages). Then ask a new question and see whether the original constraints are still honored. This demonstrates instruction drift in long contexts.

Example

A legal team uses AI to review a 200-page contract for risk clauses. Instead of pasting all 200 pages and asking “find all risks,” they work section by section, asking AI to analyze one section at a time and produce a risk summary. At the end, they feed all summaries to AI and ask for a consolidated risk report. This keeps each interaction well within context limits, avoids attention degradation, and produces more reliable results than one massive context dump.

MODULE 5: Steerability

Key Notes

Steerability = the ability to direct AI behavior through instructions, persona, constraints, and context
Steerability is what makes AI useful across diverse tasks — the same model can be a tutor, analyst, coder, or creative writer
Steerability also has limits — training shapes what the model will and won’t do

How Steerability Works (mechanically): Instructions become part of the context window. They shift the probability distribution of next-token prediction.

“Respond only in bullet points” makes bullet-point-starting tokens more likely
“You are a formal business analyst” makes formal, analytical token sequences more likely
The model does not “obey” in a mechanical sense — instructions weight the probabilities
Very long or complex instructions may only partially shift the distribution

Steerability Capabilities:

Capability	Example	How Well It Works
Instruction following	“Respond only in bullet points” → AI uses bullets	Very well for simple, specific instructions
Tone adaptation	“Be formal and concise” → AI adjusts register	Well; more pronounced with examples
Role adoption	“You are a financial analyst” → AI adjusts expertise	Well; domain knowledge depth still limited by training
Format control	“Output as a markdown table” → structured output	Very well for well-defined formats
Style matching	“Write like my example” → AI matches style	Well with a clear example provided
Constraint adherence	“Do not mention competitors” → AI avoids competitors	Well for simple constraints; degrades with complexity
Persona maintenance	“Always respond as [character]” → sustained persona	Degrades in very long conversations
Language selection	“Respond only in French” → French output	Well; quality varies by language support

System Prompts vs. User Prompts:

  SYSTEM PROMPT (when available):
  ─────────────────────────────────────────────────────────────
  Set at conversation start by the deployer/operator
  Establishes persistent context, role, constraints for entire session
  Generally has stronger influence than user messages
  Invisible to the end user in many deployments
  Used by organizations to create consistent AI behavior

  USER PROMPT:
  ─────────────────────────────────────────────────────────────
  Individual turn instructions from the user
  Can override system prompt instructions in some cases
  Influence may degrade over long conversations

  INSTRUCTION HIERARCHY (typical):
  Training constraints > System prompt > User instructions
  (Safety training beats everything; system prompt beats user for most things)

Steerability Limitations:

Limitation	Description	How to Handle
Instruction complexity cap	Very long multi-step instructions may not all be followed	Break into simpler, sequential instructions
Instruction drift	In long conversations, early instructions lose influence	Restate key constraints periodically
Training boundaries	Model will resist instructions conflicting with safety training	Work within the constraints; do not try to circumvent
Competing instructions	Conflicting instructions produce unpredictable results	Resolve conflicts before prompting
Persona breaks	Sustained persona may break under pressure or in long sessions	Re-establish persona explicitly when needed
Over-specification	Too many constraints at once reduces output quality	Prioritize; use 3–5 key constraints maximum
Format regression	AI may revert to default format in long conversations	Restate format requirements as needed

The Steerability-Safety Interaction:

  TRAINING INSTALLS FLOOR BEHAVIORS:
  ─────────────────────────────────────────────────────────────
  Certain behaviors are trained to be very resistant to steering.
  This is intentional: prevents steering AI toward harmful outputs.

  WHEN AI "REFUSES" AN INSTRUCTION:
  Training is working as designed. The refusal is a trained behavior.

  JAILBREAKING:
  Attempts to circumvent safety behaviors through clever prompting.
  Modern models are robustly trained against common jailbreak attempts.
  Attempting jailbreaks in professional contexts violates acceptable use.

  WHAT IS NOT A SAFETY ISSUE (just needs better prompting):
  Getting AI to write in a specific format it's struggling with
  Getting AI to maintain a tone it keeps drifting from
  Getting AI to stay on topic in long conversations
  These are steerability challenges, not safety interactions.

Optimal Steerability Patterns:

  EFFECTIVE STEERING:
  ───────────────────────────────────────────────────────────
  Role + Task + Constraints + Format = clear, specific output

  "You are a senior data analyst (role). Analyze the attached
  sales data (task). Focus only on Q3 2024 (constraint).
  Output a 5-bullet executive summary (format)."

  INEFFECTIVE STEERING:
  ───────────────────────────────────────────────────────────
  "Can you help me with this data?"
  [No role, no task specification, no constraints, no format]

  OVER-SPECIFIED (too many constraints):
  ───────────────────────────────────────────────────────────
  "Use exactly 47 words, bullet points only, each starting with a
  verb, no adjectives, formal register, British spelling, avoid
  passive voice, use Oxford commas, include exactly 3 citations..."
  [Competing constraints; model cannot satisfy all simultaneously]

Practical Steerability Applications:

Customer service deployment:

System prompt: "You are a customer service representative for [Company].
Be warm, empathetic, and professional. Keep responses under 150 words.
Focus only on [product] support questions. If a question is about billing,
say: 'I'll connect you with our billing team.' Do not discuss competitors."

Internal analyst assistant:

System prompt: "You are a business analyst assistant for [Company]'s
strategy team. Be concise and data-focused. Always structure answers
with: Key Finding, Supporting Evidence, Recommended Action.
Flag explicitly when you are uncertain about a fact."

Educational tutor:

System prompt: "You are a patient tutor for introductory calculus students.
Never give the answer directly — always guide through questions.
If the student is stuck after 3 attempts, provide a hint, not the answer.
Celebrate correct reasoning, not just correct answers."

Best Practices

Invest time in role and constraint specification — this is the highest-leverage prompting investment
For recurring tasks: build a reusable system prompt or prompt template that captures all steering
If AI ignores an instruction: restate it, place it closer to the beginning, or simplify competing constraints
Do not fight training boundaries — work within them to find compliant approaches

Try It Out: Write a system prompt for an AI assistant for a specific role (your job, a student helper, a writing coach). Deploy it and have a 10-message conversation. Then deliberately try to get the AI to break its persona or constraints through normal conversation pressure (not jailbreaking — just sustained off-topic conversation). Observe when and how the persona degrades.

Example

A customer service team deploys an AI assistant. Without a system prompt: the AI responds in varying tones, sometimes too casual, sometimes too long, occasionally going off-topic. With a well-crafted system prompt: “You are a customer service representative for [Company]. Always be warm and professional. Keep responses under 150 words. Focus only on [product] support. Escalate billing questions to a human agent.” — behavior is consistent, on-brand, and within scope.

MODULE 6: Conclusion — When Properties Collide

Key Notes

The four properties (next-token prediction, knowledge, working memory, steerability) do not operate independently — they interact, amplify each other, and sometimes conflict
Understanding which property is “responsible” for an unexpected output leads to targeted solutions

Property Interactions — Positive:

Combination	Emergent Capability
Knowledge + Steerability	Powerful Q&A with domain expertise and controlled format
Context Window + Steerability	Consistent multi-step, long-form task completion
Next-token + Knowledge	Fluent, contextually appropriate writing across domains
All four aligned	Complex reasoning tasks, nuanced analysis, extended projects
Steerability + Context Window	Persona maintained consistently across a long session
Knowledge + Next-token	Sophisticated analogies and cross-domain synthesis

Property Interactions — Problematic (8+ collision scenarios):

Collision Scenario	Properties Involved	What Happens	Diagnosis
Confident hallucination	Next-token + Knowledge gap	AI states false information with high fluency and confidence	Most dangerous failure mode
Instruction drift in long sessions	Steerability + Working memory	Early constraints gradually lose influence as context grows	Common in multi-turn sessions
Outdated confident advice	Knowledge (cutoff) + Steerability	AI confidently applies a steered role using outdated information	Knowledge cutoff + role adoption
Context truncation	Working memory (limits) + Steerability	Instructions truncated; AI forgets constraints set earlier	Long context + instruction placement
Fabricated domain citation	Next-token + Knowledge + Steerability	Steered to expert role, AI predicts domain-appropriate but fabricated citations	Expert role amplifies hallucination confidence
Inconsistency across sessions	Working memory (no persistence) + Steerability	Same question gets different answers in different sessions	No memory; different random seeds
Mid-context confusion	Working memory (lost in middle) + Knowledge	AI ignores relevant context pasted into the middle of a long prompt	Attention degradation
Safety-steerability tension	Steerability + Training	Instructions ask AI to do something training resists	Training floors override steerability

The “Which Property Caused This?” Decision Tree:

  UNEXPECTED AI BEHAVIOR
          │
          ▼
  Is the output factually wrong or fabricated?
  ├── Yes → Is this about post-cutoff events?
  │         ├── Yes → KNOWLEDGE (cutoff) issue
  │         │         Fix: provide the correct info; use RAG
  │         └── No  → NEXT-TOKEN PREDICTION (hallucination) issue
  │                   Fix: verify; provide context; don't rely on AI recall
  └── No  → Continue
          │
          ▼
  Is AI ignoring your instructions?
  ├── Yes → Is this a safety/ethical refusal?
  │         ├── Yes → STEERABILITY + TRAINING interaction
  │         │         Fix: work within constraints; rephrase the request
  │         └── No  → STEERABILITY (drift or complexity) issue
  │                   Fix: simplify; restate; move instructions to start
  └── No  → Continue
          │
          ▼
  Is AI forgetting earlier context?
  ├── Yes → Is this between sessions?
  │         ├── Yes → WORKING MEMORY (no persistence)
  │         │         Fix: provide context at session start each time
  │         └── No  → WORKING MEMORY (lost in middle or window size)
  │                   Fix: restate; summarize; restructure prompt
  └── No  → Continue
          │
          ▼
  Is output inconsistent with identical prompts?
  └── Yes → NEXT-TOKEN PREDICTION (temperature/randomness)
            Fix: lower temperature for deterministic tasks

The Capability-Limitation Continuum (Expanded): Every strength has a shadow — produced by the same mechanism.

  PROPERTY         CAPABILITY                    LIMITATION
  ──────────────────────────────────────────────────────────────────────
  Next-token       Fluent, coherent,             Hallucinations; fluency ≠
  prediction       grammatically perfect text    accuracy; no self-check

  Next-token       Style matching, tone          Sensitive to phrasing;
  prediction       adaptation, creative writing  different prompts = different outputs

  Knowledge        Broad domain expertise        Knowledge cutoff; accuracy
                   across virtually all fields   degrades for rare topics

  Knowledge        Cross-domain synthesis,       Overconfidence about topics
                   analogies, connections        with limited training data

  Working memory   Process long documents;       Finite window; lost-in-middle;
  (context window) maintain conversation         no persistence between sessions
                   coherence

  Working memory   Follow complex multi-step     Instruction drift in long
                   instructions within session   conversations

  Steerability     Adapts to any task,           Safety limits that cannot be
                   persona, format, domain       overridden; training floors

  Steerability     Consistent behavior in        Over-specification degrades
                   deployed systems              output; persona breaks down

Diagnostic Framework — Full Version:

Symptom	Most Likely Property	Targeted Solution
Confident false information	Next-token + Knowledge	Provide correct info in prompt; verify outputs
Fabricated citation	Next-token prediction	Never use AI citations without verification
Doesn’t know recent event	Knowledge (cutoff)	Provide the information in the prompt
Ignores instructions	Steerability (drift or conflict)	Restate; simplify; reposition
Refuses a reasonable request	Steerability + Training	Rephrase; work within training constraints
Wrong format	Steerability (insufficient specificity)	Be more explicit; provide an example
Forgets earlier context	Working memory (window size)	Restate key context; use compact/summarize
Inconsistent across sessions	Working memory (no persistence)	Provide context at session start each time
Different outputs to same prompt	Next-token (temperature)	Lower temperature; accept natural variation
Loses persona in long chat	Steerability + Working memory	Restate persona; use system prompt
Too verbose / too brief	Steerability (missing constraints)	Specify length explicitly
Misses key detail in long doc	Working memory (lost in middle)	Restructure; put key content at start/end

The Practitioner’s Mental Model:

  When AI surprises you, ask:

  1. Is this a KNOWLEDGE issue?
     → Was the info in training data? Is it post-cutoff? Is it obscure?
     → Fix: provide the information yourself (RAG pattern)

  2. Is this a NEXT-TOKEN issue?
     → Is AI predicting plausible text rather than accurate text?
     → Fix: verify; provide context; use lower temperature

  3. Is this a WORKING MEMORY issue?
     → Did key context fall out of the window or get ignored?
     → Fix: restate; summarize; restructure prompt; start fresh

  4. Is this a STEERABILITY issue?
     → Is AI not following instructions as intended?
     → Fix: clarify; simplify; reposition; check for conflicts

The Honest Assessment of AI:

AI is not artificially constrained intelligence waiting to be unleashed
AI is a genuinely novel kind of system with genuine strengths and genuine structural limits
The limits are not failures — they are properties of the mechanism
AI fluency means working skillfully within those properties, not around them
The practitioners who get the most from AI are those with accurate mental models — not those with the highest expectations

Best Practices

Develop the habit of diagnosing unexpected outputs before retrying — random retry rarely fixes the underlying issue
Match your mitigation strategy to the property causing the problem
Maintain realistic expectations: AI is extraordinary at some things and structurally incapable of others
Share your understanding of AI properties with colleagues — most frustration with AI comes from mismatched expectations

Try It Out: Deliberately induce each of the four failure modes:

Hallucination: Ask AI about a real but obscure person or paper — see if it fabricates details
Knowledge cutoff: Ask about a recent event you know occurred after the training cutoff
Lost in middle: Paste a long document; hide a specific unusual instruction in the middle; ask AI to follow it
Instruction drift: Set 5 strict formatting constraints; have a 15-message conversation; see which constraints survive

Example

A researcher notices AI keeps returning the same hallucinated citation despite being told the citation is wrong. Diagnosis: this is a next-token prediction issue — the statistically likely token sequence after this topic continues to predict that citation. Random correction mid-conversation does not update the model’s weights. Solution: provide the correct citation explicitly in the prompt, ask AI to use only sources provided, and verify independently. The researcher now understands why “just tell it not to” doesn’t fix hallucinations.

Final Checklist

I can explain the 4-stage process by which an LLM gets its character (pre-training, SFT, RLHF, alignment)
I can describe next-token prediction in plain language
I can explain why hallucinations are a structural feature of next-token prediction, not a fixable bug
I can define “token” and explain why it differs from a word, with examples
I can explain what temperature controls and give examples of when to use low vs. high settings
I can explain why fluent text does not imply accurate text
I can explain what a knowledge cutoff is and name 5+ high-risk domains for cutoff issues
I can describe the RAG pattern and explain when to use it
I can describe training data composition and how it affects knowledge depth
I can explain what a context window is and name the “lost in the middle” effect
I can describe strategies for managing long contexts (chunking, summarization, selective extraction)
I can explain why AI has no persistent memory by default
I can name 5+ steerability capabilities and 5+ steerability limitations
I can explain the difference between system prompts and user prompts
I can explain how safety training interacts with steerability
I can use the “which property caused this?” decision tree to diagnose an unexpected output
I can pair each property’s limitation with a targeted mitigation strategy
I can name at least 6 collision scenarios where properties interact problematically
I can use the capability-limitation continuum to explain any LLM behavior

AI Capabilities and Limitations - Certification Study Guide#

MODULE 1: Getting Started#

Key Notes#

Best Practices#

Example#

MODULE 2: Next Token Prediction#

Key Notes#

Best Practices#

Example#

MODULE 3: Knowledge#

Key Notes#

Best Practices#

Example#

MODULE 4: Working Memory#

Key Notes#

Best Practices#

Example#

MODULE 5: Steerability#

Key Notes#

Best Practices#

Example#

MODULE 6: Conclusion — When Properties Collide#

Key Notes#

Best Practices#

Example#

Final Checklist#

AI Capabilities and Limitations - Certification Study Guide

MODULE 1: Getting Started

Key Notes

Best Practices

Example

MODULE 2: Next Token Prediction

Key Notes

Best Practices

Example

MODULE 3: Knowledge

Key Notes

Best Practices

Example

MODULE 4: Working Memory

Key Notes

Best Practices

Example

MODULE 5: Steerability

Key Notes

Best Practices

Example

MODULE 6: Conclusion — When Properties Collide

Key Notes

Best Practices

Example

Final Checklist