Lab: Domain 2 — Model Development Hands-On

These labs cover the full model development lifecycle: training, hyperparameter tuning, AutoML, bias detection, and working with foundation models on Bedrock.

Lab 1: Train an XGBoost Model on SageMaker

What Is This Lab About?

XGBoost is one of the most heavily tested algorithms on the MLA-C01 exam. It’s the default choice for tabular/structured data (CSVs, database exports, feature tables). This lab walks through a complete training job, from data upload to model artifact, so you understand every component the exam asks about.

What You’ll Build

┌─────────────────────────────────────────────────────────────┐
│                  SageMaker Training Job                      │
│                                                             │
│  YOU PROVIDE:                  SAGEMAKER MANAGES:           │
│  ┌──────────────┐              ┌────────────────────────┐   │
│  │ Training data │──(S3)──────▶│ 1. Provision ml.m5     │   │
│  │ (CSV)         │             │ 2. Pull XGBoost        │   │
│  └──────────────┘              │    container from ECR   │   │
│  ┌──────────────┐              │ 3. Download data from   │   │
│  │ Hyperparams  │─────────────▶│    S3 to /opt/ml/      │   │
│  │ (max_depth,  │             │ 4. Run training         │   │
│  │  eta, etc.)  │             │ 5. Upload model.tar.gz  │   │
│  └──────────────┘              │    back to S3           │   │
│  ┌──────────────┐              │ 6. Terminate instance   │   │
│  │ Instance type │             │    (you stop paying)    │   │
│  │ (ml.m5.xl)   │             └────────────────────────┘   │
│  └──────────────┘                                           │
│                                                             │
│  CONTAINER DIRECTORY:                                       │
│  /opt/ml/                                                   │
│  ├── input/                                                 │
│  │   ├── config/hyperparameters.json   ← your settings      │
│  │   └── data/                                              │
│  │       ├── train/train.csv           ← training data      │
│  │       └── validation/val.csv        ← validation data    │
│  ├── model/                            ← model saved here   │
│  └── output/failure                    ← errors go here     │
└─────────────────────────────────────────────────────────────┘

Why This Matters for the Exam

Questions about training jobs test:

Instance type selection — CPU (ml.m5) for XGBoost, GPU (ml.p3) for deep learning
Input modes — File vs Pipe vs FastFile
Spot training — up to 90% savings with checkpointing
Container paths — where data and models live inside the container
Hyperparameters — what each one does, especially scale_pos_weight for imbalanced data

SageMaker Training Lifecycle

What’s ACTUALLY Happening When You Call `.fit()`?

A training job isn’t a function call — it’s an entire infrastructure lifecycle:

1. PROVISIONING (1-5 minutes)
   AWS spins up EC2 instances of the type you requested (ml.m5.xlarge).
   These are real servers in an AWS data center — with CPU, memory, disk.
   You pay for instance time, not only for the training step itself.

2. CONTAINER PULL (30-60 seconds)
   The XGBoost Docker container image is pulled from Amazon ECR
   (Elastic Container Registry). This image contains the XGBoost
   algorithm code, Python runtime, and all dependencies.
   → This is why BYOC (Bring Your Own Container) pushes to ECR.

3. DATA DOWNLOAD (seconds to hours, depends on dataset size)
   File Mode:     S3 → download ENTIRE dataset to instance EBS disk
                  Fast reads but slow start. Needs enough disk.
   Pipe Mode:     S3 → stream as Linux FIFO pipe. No disk needed.
                  Sequential reads only. The algorithm must support it;
                  built-in algorithms commonly use RecordIO-protobuf here.
   FastFile Mode: S3 → POSIX mount. Loads pages on demand.
                  Any format, random access, no full download.

4. TRAINING (minutes to days)
   The container runs your algorithm. Metrics stream to CloudWatch
   in real-time. If Debugger is enabled, tensor values are captured.
   Checkpoints (if configured) are saved to S3 periodically.

5. MODEL UPLOAD (seconds)
   The trained model artifact (/opt/ml/model/) is tar-gzipped
   and uploaded to your specified S3 output path.

6. CLEANUP
   EC2 instances terminated. EBS volumes deleted. You stop paying.

Total cost = instance-hours from step 1 through step 6 (not just step 4!)

Why this matters: “Training takes 30 min but costs more than expected” → provisioning + data download may add 10+ minutes. Use Warm Pools to skip provisioning on repeated jobs. Use Pipe/FastFile mode to skip data download.
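To make the cost point concrete, here is a minimal sketch that compares the cost of the training step alone with the cost of the whole job. The phase durations and the hourly rate are hypothetical values chosen for illustration, and the sketch assumes every listed phase is billed at the same rate.

```python
def billable_cost(phase_seconds, hourly_rate, instance_count=1):
    """Estimate job cost from per-phase durations (seconds) and an hourly price."""
    total_seconds = sum(phase_seconds.values())
    return total_seconds / 3600 * hourly_rate * instance_count

# Hypothetical durations, for illustration only.
phases = {"data_download": 480, "training": 1800, "model_upload": 20}
HOURLY_RATE = 0.23  # assumed USD per instance-hour, not a quoted AWS price

training_only = billable_cost({"training": phases["training"]}, HOURLY_RATE)
full_job = billable_cost(phases, HOURLY_RATE)
overhead_pct = (full_job - training_only) / training_only * 100

print(f"training step only: ${training_only:.4f}")
print(f"whole job:          ${full_job:.4f}")
print(f"overhead:           {overhead_pct:.0f}%")
```

With these numbers, a 30-minute training step turns into roughly 38 minutes of instance time, which is the gap the exam scenario describes.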

Step 1: Setup

import sagemaker
import boto3
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput

role = get_execution_role()
session = sagemaker.Session()
region = session.boto_region_name
bucket = session.default_bucket()
prefix = "xgboost-lab"

Step 2: Upload Training Data

SageMaker expects your data in S3. For CSV input, the built-in XGBoost algorithm expects the label in the first column, the features in the remaining columns, and no header row.

train_path = session.upload_data("train.csv", bucket=bucket, key_prefix=f"{prefix}/train")
val_path = session.upload_data("validation.csv", bucket=bucket, key_prefix=f"{prefix}/validation")
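If your source data has a header row or keeps the label in some other column, it needs reshaping before upload. The sketch below shows one way to produce the label-first, headerless layout with the standard library; the helper name `to_xgboost_csv` and the sample records are hypothetical.

```python
import csv
import io

def to_xgboost_csv(records, label_key, feature_keys):
    """Serialize dict records as headerless CSV with the label in column 0."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        writer.writerow([record[label_key]] + [record[key] for key in feature_keys])
    return buffer.getvalue()

# Hypothetical sample rows; "approved" is the label.
records = [
    {"age": 25, "income": 50000, "approved": 1},
    {"age": 41, "income": 32000, "approved": 0},
]
csv_text = to_xgboost_csv(records, label_key="approved", feature_keys=["age", "income"])
print(csv_text)
```

Write `csv_text` to `train.csv` and the upload call above works unchanged. Keep the feature order fixed, because inference requests must use the same column order without the label.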

Step 3: Configure the Estimator

This is where you make the key decisions the exam tests.

from sagemaker.estimator import Estimator

# Get the official XGBoost container image
image_uri = sagemaker.image_uris.retrieve(
    framework="xgboost", region=region, version="1.5-1", py_version="py3",
)

xgb = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",            # CPU: usually the cost-effective choice for XGBoost on tabular data
    output_path=f"s3://{bucket}/{prefix}/output",
    sagemaker_session=session,

    # COST SAVING: Use Spot Instances
    use_spot_instances=True,                  # up to 90% cheaper
    max_run=3600,                             # max training time: 1 hour
    max_wait=7200,                            # max total time incl. waiting for spot: 2 hours (MUST be >= max_run)
    checkpoint_s3_uri=f"s3://{bucket}/{prefix}/checkpoints",  # strongly recommended: lets an interrupted job resume
)

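Why ml.m5.xlarge and not a GPU? For the small-to-medium tabular datasets typical of XGBoost workloads, a CPU instance is usually the most cost-effective option. XGBoost can train on GPUs, but the speedup only justifies the price of an ml.p3 on large datasets, so picking a GPU instance by default for XGBoost is a common exam trap.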

Spot training needs three settings to work well: use_spot_instances=True, max_wait >= max_run, and checkpoint_s3_uri. Without a valid max_wait the job is rejected; without checkpoints an interrupted job restarts from the beginning and loses its progress.
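The spot rules are easy to check before submitting a job. The helper below is a hypothetical pre-flight check, not part of the SageMaker SDK; it only mirrors the conditions described in this lab.

```python
def check_spot_config(use_spot_instances, max_run, max_wait=None, checkpoint_s3_uri=None):
    """Return a list of problems with a spot training setup (empty list = looks fine)."""
    problems = []
    if not use_spot_instances:
        return problems  # on-demand jobs need none of the spot settings
    if max_wait is None:
        problems.append("max_wait must be set when use_spot_instances=True")
    elif max_wait < max_run:
        problems.append("max_wait must be >= max_run")
    if not checkpoint_s3_uri:
        problems.append("no checkpoint_s3_uri: an interruption restarts training from scratch")
    return problems

# Settings from this lab: no problems expected.
print(check_spot_config(True, max_run=3600, max_wait=7200,
                        checkpoint_s3_uri="s3://example-bucket/checkpoints"))

# A misconfigured job: max_wait too short and no checkpoints.
for problem in check_spot_config(True, max_run=3600, max_wait=1800):
    print("-", problem)
```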

Step 4: Set Hyperparameters

Each parameter controls a different aspect of how XGBoost learns. Understanding these is heavily tested.

xgb.set_hyperparameters(
    objective="binary:logistic",    # binary classification (outputs probability)
    num_round=100,                  # 100 boosting rounds (trees built sequentially)
    max_depth=5,                    # each tree can be 5 levels deep
                                    #   higher → more complex, risk overfitting
                                    #   lower → simpler, may underfit
    eta=0.2,                        # learning rate (shrinks each tree's contribution)
                                    #   higher → faster training, risk overfitting
                                    #   lower → needs more rounds, better generalization
    gamma=4,                        # minimum loss reduction to make a split
                                    #   higher → fewer splits, simpler trees (regularization)
    min_child_weight=6,             # minimum sum of instance weights (hessian) needed in a child node
                                    #   higher → more conservative (regularization)
    subsample=0.7,                  # use 70% of training rows for each tree (regularization)
    colsample_bytree=0.7,          # use 70% of features for each tree (regularization)
    eval_metric="auc",              # evaluate using AUC-ROC
    scale_pos_weight=10,            # for IMBALANCED data: ratio of neg/pos samples
                                    #   if 1% fraud → scale_pos_weight ≈ 99
)

Exam scenario: “Fraud detection model, 1% positive class, high accuracy but misses most fraud” → Set scale_pos_weight to the neg/pos ratio AND use AUC/F1 instead of accuracy.
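The fraud scenario can be reproduced with a few lines of plain Python. This sketch uses a synthetic label list (1% positives) to compute the neg/pos ratio for scale_pos_weight and to show why accuracy is misleading here.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive samples, the usual starting value for scale_pos_weight."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive samples in labels")
    return negatives / positives

# Synthetic dataset: 10 fraud cases out of 1,000 transactions (1%).
labels = [1] * 10 + [0] * 990
weight = scale_pos_weight(labels)

# A useless model that never predicts fraud still looks accurate.
always_negative = [0] * len(labels)
accuracy = sum(p == y for p, y in zip(always_negative, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(always_negative, labels)) / sum(labels)

print(f"scale_pos_weight: {weight}")
print(f"accuracy: {accuracy:.2f}, recall: {recall:.2f}")
```

The always-negative model scores 99% accuracy while catching no fraud at all, which is why the scenario calls for AUC or F1 alongside the class weighting.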

Step 5: Train

xgb.fit(
    inputs={
        "train": TrainingInput(s3_data=train_path, content_type="text/csv"),
        "validation": TrainingInput(s3_data=val_path, content_type="text/csv"),
    },
    wait=True,
    logs="All",
)

print(f"Model artifact: {xgb.model_data}")
# → s3://bucket/xgboost-lab/output/xgboost-2026-04-27-.../output/model.tar.gz

Input Modes — How Data Gets to the Container

┌──────────────┐  Simplest. Downloads entire dataset to disk before
│  FILE MODE   │  training starts. Needs enough EBS storage.
│  (default)   │  Works with any format (CSV, Parquet, images).
└──────────────┘

┌──────────────┐  Streams data from S3 through a pipe, so training
│  PIPE MODE   │  starts quickly and needs no disk. Sequential reads
│              │  only; built-in algorithms commonly pair it with
└──────────────┘  RecordIO-protobuf.

┌──────────────┐  Best of both. POSIX-style file access that loads
│ FASTFILE MODE│  data from S3 on demand. Any format, random access,
│  (modern)    │  no full download. You must opt in explicitly;
└──────────────┘  File mode remains the default.

Exam decision:
  "Training fails — out of disk space"   → Switch to Pipe or FastFile mode
  "Fastest possible data ingestion"       → Pipe mode + RecordIO
  "Works with any format, no disk issue"  → FastFile mode

Lab 2: Hyperparameter Tuning (HPO)

What Is This Lab About?

You’ve trained an XGBoost model, but how do you know max_depth=5 and eta=0.2 are the best values? You don’t — you guessed. Hyperparameter Tuning systematically searches the parameter space to find the combination that maximizes your metric.

What You’ll Build

┌─────────────────────────────────────────────────────────┐
│              SageMaker Hyperparameter Tuning              │
│                                                         │
│  You define:                                            │
│  • Ranges: max_depth=[3-10], eta=[0.01-0.3], etc.       │
│  • Objective: maximize validation:auc                    │
│  • Budget: 20 trials, 4 parallel                         │
│  • Strategy: Bayesian optimization                       │
│                                                         │
│  SageMaker runs 20 training jobs:                        │
│                                                         │
│  Trial 1:  depth=7, eta=0.15  → AUC=0.82               │
│  Trial 2:  depth=4, eta=0.28  → AUC=0.79               │
│  Trial 3:  depth=5, eta=0.08  → AUC=0.86  ← Bayesian   │
│  Trial 4:  depth=6, eta=0.10  → AUC=0.88    learns      │
│  ...                                         from each   │
│  Trial 20: depth=5, eta=0.09  → AUC=0.91    trial       │
│                                                         │
│  Result: Best hyperparameters found!                     │
│  Deploy best model directly.                             │
└─────────────────────────────────────────────────────────┘

Tuning Strategies Compared

RANDOM SEARCH         BAYESIAN (default)      HYPERBAND
  ● ●   ●              ●                       ●●●●●●●●●●  Start many
  ●  ●  ●  ●           ● → ●                   ●●●●●       Kill bad early
  ● ●  ●               ● → ● → ●               ●●●         Keep promising
  ●   ●  ● ●           ● → ● → ● → ★           ●★          Best survives
                                                            
  Picks randomly.      Each trial informs       Runs many cheap
  Good for initial     the next. Most           trials, allocates
  exploration.         efficient for <100       resources to winners.
                       trials.
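The Hyperband column is easier to follow in code. The sketch below implements successive halving, the elimination loop at the core of Hyperband (full Hyperband runs several such brackets with different starting budgets). It is an illustration of the idea, not SageMaker's implementation, and `toy_auc` is a hypothetical stand-in for a real training job.

```python
import random

def successive_halving(configs, evaluate, min_budget=1, reduction_factor=2):
    """Keep the best 1/reduction_factor of the configs each round and grow the budget."""
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, number of configs evaluated) per round
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget), reverse=True)
        history.append((budget, len(ranked)))
        survivors = ranked[: max(1, len(ranked) // reduction_factor)]
        budget *= reduction_factor
    return survivors[0], history

def toy_auc(cfg, budget):
    # Hypothetical objective: best near eta=0.1, slightly better with more budget.
    return 0.9 - abs(cfg["eta"] - 0.1) - 0.05 / budget

rng = random.Random(0)  # fixed seed so the run is repeatable
candidates = [{"eta": rng.uniform(0.01, 0.3)} for _ in range(8)]
best, history = successive_halving(candidates, toy_auc)

print(history)  # → [(1, 8), (2, 4), (4, 2)]
print(f"best eta: {best['eta']:.3f}")
```

Eight cheap trials shrink to four, then two, with the budget doubling each round, so most of the compute goes to the configurations that looked promising early.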

Step 1: Define Ranges

from sagemaker.tuner import IntegerParameter, ContinuousParameter, HyperparameterTuner

hyperparameter_ranges = {
    "max_depth":        IntegerParameter(3, 10),
    "eta":              ContinuousParameter(0.01, 0.3),
    "min_child_weight": IntegerParameter(1, 10),
    "subsample":        ContinuousParameter(0.5, 1.0),
    "colsample_bytree": ContinuousParameter(0.5, 1.0),
    "num_round":        IntegerParameter(50, 300),
}

Step 2: Create and Run Tuner

tuner = HyperparameterTuner(
    estimator=xgb,                           # reuse your estimator from Lab 1
    objective_metric_name="validation:auc",  # what to optimize
    objective_type="Maximize",               # higher AUC = better
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,                             # total trials
    max_parallel_jobs=4,                     # run 4 at a time
    strategy="Bayesian",                     # most efficient strategy
    early_stopping_type="Auto",              # stop bad trials early (saves cost)
)

tuner.fit(
    inputs={
        "train": TrainingInput(s3_data=train_path, content_type="text/csv"),
        "validation": TrainingInput(s3_data=val_path, content_type="text/csv"),
    },
    wait=False,  # don't block — tuning takes a while
)

Step 3: Analyze Results

# Wait for completion
tuner.wait()

# View leaderboard
analytics = sagemaker.HyperparameterTuningJobAnalytics(tuner.latest_tuning_job.job_name)
results = analytics.dataframe().sort_values("FinalObjectiveValue", ascending=False)
print(results[["TrainingJobName", "FinalObjectiveValue",
               "max_depth", "eta", "subsample"]].head(5))

Step 4: Deploy Best Model Directly

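from sagemaker.serializers import CSVSerializer

predictor = tuner.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    serializer=CSVSerializer(),   # send requests as text/csv, which the XGBoost container expects
)
# One record: feature values only, in training column order (no label)
result = predictor.predict("25,50000,680,12,0.7,0.3,0.1")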

Warm Start — Resume Previous Tuning

If you ran 20 trials and want 20 more, don’t start from scratch. Warm start transfers the knowledge from previous trials.

from sagemaker.tuner import WarmStartConfig, WarmStartTypes

warm_config = WarmStartConfig(
    warm_start_type=WarmStartTypes.IDENTICAL_DATA_AND_ALGORITHM,  # same problem
    parents={tuner.latest_tuning_job.job_name},                    # learn from this
)

tuner_v2 = HyperparameterTuner(
    estimator=xgb,
    objective_metric_name="validation:auc",
    objective_type="Maximize",
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,
    max_parallel_jobs=4,
    warm_start_config=warm_config,  # starts where v1 left off
)

Warm Start Type	When to Use
`IDENTICAL_DATA_AND_ALGORITHM`	Same data, same algorithm — transfer ALL knowledge
`TRANSFER_LEARNING`	Changed data, ranges, or algorithm version: reuse the parent jobs' results as a starting point

Lab 3: SageMaker Clarify — Bias Detection & Explainability

What Is This Lab About?

Your loan approval model has 95% accuracy. But is it fair? Does it approve men at a higher rate than women? Does it rely too heavily on zip code (a proxy for race)?

SageMaker Clarify answers these questions:

Pre-training bias: Is the dataset itself biased?
Post-training bias: Are the model’s predictions biased?
SHAP explainability: Why did the model make each specific prediction?

What You’ll Build

┌─────────────────────────────────────────────────────────────┐
│                    SageMaker Clarify                          │
│                                                             │
│  PRE-TRAINING (before training):                            │
│  ┌──────────────────────────────────────────────────┐       │
│  │ "Is the training DATA biased?"                    │       │
│  │                                                   │       │
│  │ Dataset: 10,000 loan applications                 │       │
│  │ Check: gender column                              │       │
│  │                                                   │       │
│  │ Finding: 70% male applicants, 30% female          │       │
│  │ Finding: Approval rate male=80%, female=60%        │       │
│  │ Metric: Class Imbalance = 0.4 (significant)       │       │
│  │ Metric: DPL = 0.2 (20% approval gap)              │       │
│  └──────────────────────────────────────────────────┘       │
│                                                             │
│  POST-TRAINING (after training):                            │
│  ┌──────────────────────────────────────────────────┐       │
│  │ "Are the model's PREDICTIONS biased?"             │       │
│  │                                                   │       │
│  │ Model predicts: male approval = 85%               │       │
│  │ Model predicts: female approval = 55%              │       │
│  │ Metric: Disparate Impact = 0.65 (<0.8 = bias!)    │       │
│  │ The model AMPLIFIED the data bias.                 │       │
│  └──────────────────────────────────────────────────┘       │
│                                                             │
│  EXPLAINABILITY (per prediction):                           │
│  ┌──────────────────────────────────────────────────┐       │
│  │ "WHY was applicant #1234 denied?"                 │       │
│  │                                                   │       │
│  │ Base prediction: 0.50 (50% approval chance)        │       │
│  │ + Income: $120K        → +0.20                     │       │
│  │ + Credit score: 780    → +0.15                     │       │
│  │ - Debt ratio: 0.8      → -0.35  ← main reason     │       │
│  │ - Employment: 6 months → -0.10                     │       │
│  │ = Final: 0.40 (denied)                             │       │
│  │                                                   │       │
│  │ SHAP tells you: "Denied mainly due to high debt"   │       │
│  └──────────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────┘

Step 1: Run Pre-Training Bias Analysis

from sagemaker.clarify import (
    SageMakerClarifyProcessor, DataConfig, BiasConfig,
)

clarify = SageMakerClarifyProcessor(
    role=role, instance_count=1, instance_type="ml.c5.xlarge",
    sagemaker_session=session,
)

data_config = DataConfig(
    s3_data_input_path=f"s3://{bucket}/data/loan_applications.csv",
    s3_output_path=f"s3://{bucket}/clarify/pre-training-report",
    label="approved",                                          # target column
    headers=["age", "gender", "income", "credit_score", "debt_ratio", "approved"],
    dataset_type="text/csv",
)

bias_config = BiasConfig(
    label_values_or_threshold=[1],    # what counts as "positive outcome" (approved)
    facet_name="gender",              # sensitive attribute to check
    facet_values_or_threshold=[0],    # encoded value of the sensitive group to check (not the reference group)
)

clarify.run_pre_training_bias(
    data_config=data_config,
    data_bias_config=bias_config,
    wait=True,
)

Output report includes these metrics. The "Flag If" values are rules of thumb for this lab, not thresholds defined by Clarify:

Metric	What It Measures	Flag If
Class Imbalance (CI)	Normalized difference in group sizes (range -1 to 1)	>0.3
DPL	Difference in positive-outcome (approval) rates between groups	>0.1
KL Divergence	How far the label distributions of the two groups diverge	>0.1
Jensen-Shannon	Symmetric, bounded variant of KL	>0.1
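The headline metrics are simple ratios, so the numbers in the diagram can be checked by hand. This sketch recomputes them from the example figures (70/30 split, 80% vs 60% observed approval, 85% vs 55% predicted approval); the function names are illustrative, and "a" and "d" denote the advantaged and disadvantaged groups.

```python
def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d); 0 means equally sized groups."""
    return (n_a - n_d) / (n_a + n_d)

def difference_in_proportions(rate_a, rate_d):
    """DPL = gap between the groups' observed positive-label rates."""
    return rate_a - rate_d

def disparate_impact(pred_rate_a, pred_rate_d):
    """DI = predicted positive rate of group d divided by that of group a."""
    return pred_rate_d / pred_rate_a

ci = class_imbalance(n_a=7000, n_d=3000)                # 70% male, 30% female of 10,000
dpl = difference_in_proportions(0.80, 0.60)             # observed approval rates
di = disparate_impact(pred_rate_a=0.85, pred_rate_d=0.55)  # model-predicted approval rates

print(f"CI  = {ci:.2f}")
print(f"DPL = {dpl:.2f}")
print(f"DI  = {di:.2f}  (below 0.8 is the common four-fifths warning level)")
```

CI and DPL describe the dataset before training; DI is computed on model predictions, which is why it belongs to the post-training report.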

Step 2: Run Post-Training Bias + SHAP

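After training your model, check whether its predictions are fair and get explanations for them. The code below runs the SHAP explainability job; the post-training bias check is a separate call, clarify.run_post_training_bias, which takes the same data_config and bias_config plus the model_config defined here.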

from sagemaker.clarify import ModelConfig, SHAPConfig

model_config = ModelConfig(
    model_name="loan-approval-model",
    instance_type="ml.m5.xlarge",
    instance_count=1,
    content_type="text/csv",
)

# SHAP needs a "baseline" — typically the dataset median
# It measures how each feature SHIFTS the prediction away from this baseline
shap_config = SHAPConfig(
    baseline=[[35, 0, 60000, 700, 0.4]],  # median values for each feature
    num_samples=500,                        # more = more accurate, slower
    agg_method="mean_abs",                  # global importance aggregation
    save_local_shap_values=True,            # save per-prediction explanations
)

clarify.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=shap_config,
    wait=True,
)

How to Read SHAP Output

Global feature importance (mean |SHAP|):
  credit_score:   0.28  ████████████████████████████
  income:         0.22  ██████████████████████
  debt_ratio:     0.19  ███████████████████
  age:            0.08  ████████
  gender:         0.03  ███

→ Credit score matters most. Gender matters least, which is encouraging, but correlated features can still act as proxies, so confirm with the bias metrics.

Local explanation (applicant #1234):
  Base:           0.50
  credit_score:  +0.15  (780 is high → helps approval)
  income:        +0.20  ($120K is high → helps)
  debt_ratio:    -0.35  (0.8 is very high → hurts approval most)
  age:           -0.08
  gender:        -0.02
  Final:          0.40  → Denied (base plus all contributions)

→ "Denied primarily because of high debt-to-income ratio"
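Two properties make SHAP output readable: local contributions add up to the prediction, and global importance is the mean absolute contribution per feature. The sketch below checks both, using the applicant #1234 figures from the Clarify diagram earlier in this lab; the two-row `local_shap` table is hypothetical.

```python
# Local explanation: base value plus per-feature contributions equals the prediction.
base_value = 0.50
contributions = {
    "income": 0.20,
    "credit_score": 0.15,
    "debt_ratio": -0.35,
    "employment_length": -0.10,
}
prediction = base_value + sum(contributions.values())
main_driver = max(contributions, key=lambda name: abs(contributions[name]))
print(f"prediction: {prediction:.2f}, main driver: {main_driver}")

# Global importance: mean of |SHAP| per feature across rows (the "mean_abs" aggregation).
local_shap = [  # hypothetical per-applicant SHAP values
    {"income": 0.20, "debt_ratio": -0.35},
    {"income": -0.10, "debt_ratio": 0.05},
]
global_importance = {
    feature: sum(abs(row[feature]) for row in local_shap) / len(local_shap)
    for feature in local_shap[0]
}
print(global_importance)
```

Taking absolute values before averaging matters: income helps one applicant and hurts another, and a plain mean would let those effects cancel out.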

Lab 4: Amazon Bedrock — Foundation Models & RAG

Bedrock RAG Architecture

What Is This Lab About?

Not every ML problem needs custom training. If you need a chatbot, a document Q&A system, or content generation, Amazon Bedrock provides access to foundation models (Claude, Llama, Titan, etc.) through a simple API — no infrastructure, no training, no GPUs.

This lab covers the three most exam-relevant Bedrock patterns: text generation, RAG (retrieval-augmented generation), and guardrails.

What You’ll Build

PATTERN 1: Direct Inference
┌──────────┐     ┌─────────────┐     ┌───────────────┐
│  Your    │────▶│  Bedrock    │────▶│  Claude/Llama  │────▶ Answer
│  Prompt   │     │  Runtime    │     │  Foundation    │
│           │     │  API        │     │  Model         │
└──────────┘     └─────────────┘     └───────────────┘

PATTERN 2: RAG (Retrieval-Augmented Generation)
┌──────────┐     ┌─────────────────────────────────────────────────┐
│ "What is  │     │  Bedrock Knowledge Base                         │
│  our      │────▶│                                                 │
│  refund   │     │  1. Embed question → vector                     │
│  policy?" │     │  2. Search vector store for similar chunks       │
│           │     │  3. Retrieve top 5 relevant paragraphs          │
│           │     │  4. Inject into prompt: "Based on these docs..." │
│           │     │  5. FM generates grounded answer                 │
│           │     │  6. Return answer + source citations             │
└──────────┘     └─────────────────────────────────────────────────┘
                    ↕                    ↕
              ┌──────────┐      ┌──────────────────┐
              │  S3      │      │ OpenSearch        │
              │ (your    │─────▶│ Serverless        │
              │  docs)   │      │ (vector store)    │
              └──────────┘      └──────────────────┘

PATTERN 3: Guardrails
┌──────────┐     ┌─────────────┐     ┌──────────┐     ┌─────────────┐
│  User    │────▶│  INPUT      │────▶│  FM      │────▶│  OUTPUT     │────▶ Safe
│  message  │     │  GUARDRAIL  │     │  (Claude) │     │  GUARDRAIL  │     response
│           │     │  Block hate │     │           │     │  Redact PII │
│           │     │  Block PII  │     │           │     │  Block harm │
└──────────┘     └─────────────┘     └──────────┘     └─────────────┘

Step 1: Direct Text Generation

import boto3
import json

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "temperature": 0.7,     # lower = more repeatable, higher = more varied
        "messages": [
            {"role": "user", "content": "Explain gradient boosting in 3 sentences."}
        ],
    }),
)

result = json.loads(response["body"].read())
print(result["content"][0]["text"])

Temperature controls randomness:

Temperature	Behavior	Use Case
0.0	Near-deterministic (picks the most likely token each time)	Factual Q&A, classification
0.3-0.5	Mostly consistent, slight variation	Summarization, analysis
0.7-1.0	Creative, diverse outputs	Content generation, brainstorming
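Conceptually, temperature rescales the model's token scores before they are turned into probabilities. The sketch below shows that effect with a standard temperature-scaled softmax over three hypothetical logits; it illustrates the general technique, not how any particular Bedrock model is implemented.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as always picking the top token")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 1.0)

print("T=0.1:", [round(p, 3) for p in cold])
print("T=1.0:", [round(p, 3) for p in warm])
```

At T=0.1 nearly all probability lands on the top token, so repeated calls give almost the same answer; at T=1.0 the other tokens keep a real chance of being sampled.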

Step 2: Generate Embeddings (For RAG)

Before you can search documents semantically, you need to convert text into vectors (embeddings). Similar meanings produce similar vectors.

response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "inputText": "What is the company refund policy?",
        "dimensions": 256,
    }),
)

result = json.loads(response["body"].read())
embedding = result["embedding"]
print(f"Vector with {len(embedding)} dimensions")
# This vector can be stored in OpenSearch / pgvector / Pinecone
# and searched by similarity with other vectors
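"Searched by similarity" usually means cosine similarity between the query vector and each stored chunk vector. Here is a minimal sketch with tiny hand-made 3-dimensional vectors; real embeddings have hundreds of dimensions, and the chunk names and values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.0]  # stand-in for the embedded question
chunks = {               # stand-ins for embedded document chunks
    "refund-policy": [0.8, 0.2, 0.1],
    "holiday-schedule": [0.0, 0.1, 0.9],
    "license-terms": [0.6, 0.6, 0.0],
}

ranked = sorted(chunks, key=lambda name: cosine_similarity(query, chunks[name]), reverse=True)
for name in ranked:
    print(f"{name}: {cosine_similarity(query, chunks[name]):.3f}")
```

A vector store performs the same ranking at scale with approximate nearest-neighbor indexes, then hands the top chunks to the model as context.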

Step 3: Query a Knowledge Base (RAG)

After setting up a Knowledge Base (S3 docs → chunked → embedded → indexed in OpenSearch), you can query it. The Knowledge Base automatically retrieves relevant chunks and augments the prompt.

bedrock_agent = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = bedrock_agent.retrieve_and_generate(
    input={"text": "What is our refund policy for software licenses?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)

# The answer is grounded in YOUR documents
print("Answer:", response["output"]["text"])

# See which documents were used (citations)
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        print(f"  Source: {ref['location']['s3Location']['uri']}")
        print(f"  Chunk: {ref['content']['text'][:100]}...")

When RAG vs Fine-Tuning?

"Model needs to KNOW new facts"
  → Has my company's specific policies/data?
    YES → RAG (retrieve docs at query time)
    
"Model needs to BEHAVE differently"
  → Should it respond in legal language? Medical format?
    YES → Fine-tuning (change model weights)

"Model generates wrong/harmful content"
  → Block topics, filter PII, enforce safety?
    YES → Guardrails (applied at input/output)

Approach	Changes	Data Needed	Cost	Update Speed
RAG	Nothing in model	Documents in S3	$ (inference only)	Instant (update docs)
Fine-tuning	Model weights	Labeled examples (JSONL)	$$$ (training compute)	Hours (retrain)
Prompt engineering	Nothing	Zero examples	$	Instant

Domain 2 Lab Summary

Lab	Service	You Learned
1	XGBoost Training	Instance types, hyperparameters, spot training, input modes, container paths
2	HPO	Bayesian vs Random vs Hyperband, ranges, warm start, early stopping
3	Clarify	Pre/post-training bias metrics, SHAP explainability, reading SHAP output
4	Bedrock	Text generation, embeddings, RAG with Knowledge Bases, RAG vs fine-tuning
