MLA-C01 Scenario Patterns & Exam Traps

Exam Scenario Patterns & Traps

How to think through MLA-C01 questions. Learn the patterns, not just the facts.

How to Read Exam Questions

Most questions follow this structure:
  1. SCENARIO — A company/team needs to do X
  2. CONSTRAINT — With requirement Y (cost, latency, compliance)
  3. OPTIONS — 4 answers (2 clearly wrong + 2 plausible)
  4. KEY PHRASE — "MOST cost-effective", "LEAST overhead", "BEST performance"

Critical reading:
  "MOST cost-effective"        = cheapest that meets requirements
  "LEAST operational overhead" = most managed/serverless (NOT cheapest)
  "LOWEST latency"            = fastest response (may cost more)
  "MINIMIZE changes"          = use existing tools, avoid migration

Category 1: Which Endpoint?

Scenario Keywords	Answer
“Consistent traffic, <100ms, real-time API”	Real-Time Endpoint
“5 requests/day, cost-sensitive, minimize cost”	Serverless Inference
“Large video (up to 1 GB), 10 min processing, >6MB payload”	Async Inference
“Weekly batch scoring, 1M records, scheduled”	Batch Transform
“500 models, 10 requests each/day”	Multi-Model Endpoint
“Compare 2 models, 10% vs 90% traffic”	Production Variants
“Test new model silently, zero user risk”	Shadow Testing

Real-time endpoints cannot scale to 0. Min = 1 instance. For scale-to-0, use Serverless or Async.
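
The keyword table above can be read as an ordered set of checks. The sketch below is one hypothetical way to encode it; the `Workload` fields and every threshold except the 6 MB payload figure (taken from the table) are assumptions for illustration, not AWS quotas or an official decision procedure.

```python
from dataclasses import dataclass

# Illustrative thresholds. The 6 MB figure comes from the table above;
# the other cut-offs are assumptions chosen for this sketch.
REALTIME_PAYLOAD_LIMIT_MB = 6
REALTIME_TIMEOUT_SECONDS = 60
LOW_TRAFFIC_REQUESTS_PER_DAY = 100
MANY_MODELS = 10


@dataclass
class Workload:
    scheduled_batch: bool = False          # e.g. weekly scoring of a full dataset
    payload_mb: float = 0.1                # size of a single request
    processing_seconds: float = 0.1        # time to serve a single request
    requests_per_model_per_day: int = 1_000_000
    model_count: int = 1


def pick_inference_option(w: Workload) -> str:
    """Return the inference option the keyword table points to."""
    if w.scheduled_batch:
        return "Batch Transform"
    if (w.payload_mb > REALTIME_PAYLOAD_LIMIT_MB
            or w.processing_seconds > REALTIME_TIMEOUT_SECONDS):
        return "Async Inference"
    if w.model_count >= MANY_MODELS:
        return "Multi-Model Endpoint"
    if w.requests_per_model_per_day <= LOW_TRAFFIC_REQUESTS_PER_DAY:
        return "Serverless Inference"
    return "Real-Time Endpoint"


print(pick_inference_option(Workload(requests_per_model_per_day=5)))
```

The order of the checks matters: batch and large-payload signals are tested first because they rule out the synchronous options regardless of traffic volume.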

Category 2: Which Data Service?

Scenario	Answer	NOT
“Simple ETL, serverless”	Glue ETL	EMR (overkill)
“Petabyte-scale Spark”	EMR	Glue (not enough scale)
“No-code data prep for analysts”	Glue DataBrew	Data Wrangler (ML-focused)
“ML-specific data prep in SageMaker”	Data Wrangler	DataBrew (not ML-specific)
“SQL on S3”	Athena	Redshift (no warehouse needed)
“Real-time streaming + custom processing”	Kinesis Data Streams	Firehose (no custom consumers)
“Deliver stream to S3”	Data Firehose	Data Streams (needs custom consumer code)
“Migrate existing Kafka”	MSK	Kinesis (different API)
“Central feature repo, training + serving”	Feature Store	S3 versioning
“Data warehouse with ML predictions”	Redshift ML	Athena

Category 3: Which Algorithm?

Scenario	Answer	Key Distinction
“Tabular classification/regression”	XGBoost	Default choice for structured data
“Many categorical features, tabular”	CatBoost	Native categorical support
“AutoML, no tuning, highest accuracy”	AutoGluon-Tabular	Ensemble stacking
“Time series, 5000 products, forecast”	DeepAR	Multiple related time series
“Anomaly detection, no labels, streaming”	Random Cut Forest	Unsupervised anomaly
“Suspicious login IPs”	IP Insights	Entity-IP associations
“Sentiment analysis, custom data”	BlazingText	Text classification with training
“Sentiment analysis, no training needed”	Comprehend	Managed AI service
“Pixel-level image segmentation”	Semantic Segmentation	Per-pixel labels
“Sparse click data, recommendations”	Factorization Machines	Pairwise interactions
“Managed recommendations, no ML needed”	Personalize	Managed service
“Document OCR, forms, tables”	Textract	No training needed
“Custom image classifier, 10 images”	Rekognition Custom Labels	Few images

Category 4: Training Troubleshooting

Error / Symptom	Cause	Fix
“Insufficient disk space”	File mode, data too large	Pipe Mode or FastFile Mode, or increase VolumeSizeInGB
“Cannot pull Docker image”	Missing ECR permissions/VPC endpoint	Add ECR VPC endpoints + IAM
“Cannot access S3 data”	Missing S3 permission/VPC endpoint	Add S3 VPC endpoint + IAM
“CUDA out of memory”	Model too large for GPU	Smaller batch size, Model Parallelism, or larger GPU
“MaxWaitTimeExceeded”	Spot instance not available	Increase max_wait or switch to On-Demand
“Loss not decreasing”	Learning rate issue	Check LR (too high/low), check data
“Training 10x slower than expected”	Wrong instance type	CPU for deep learning → switch to GPU
“Spot training restarts from scratch”	No checkpointing configured	Enable checkpoint_s3_uri
“GPU utilization only 20%”	I/O bottleneck	Use Pipe mode, increase batch size, more data workers
“Training costs too high”	On-demand instances	Spot Training + Checkpointing (up to 90% savings)
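
Several rows above (MaxWaitTimeExceeded, restarts from scratch, high cost) come down to the same three settings. The sketch below builds that fragment of a `CreateTrainingJob` request as a plain dictionary; the helper name and the bucket URI are hypothetical, and the remaining required request fields (algorithm, role, data channels, resources) are omitted.

```python
def spot_training_settings(max_runtime_s: int, max_wait_s: int,
                           checkpoint_s3_uri: str) -> dict:
    """Build the Spot-related part of a CreateTrainingJob request."""
    # MaxWaitTimeInSeconds covers waiting for Spot capacity plus the run
    # itself, so it must be at least MaxRuntimeInSeconds.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
        # Without a checkpoint location, an interrupted job starts over.
        "CheckpointConfig": {
            "S3Uri": checkpoint_s3_uri,
            "LocalPath": "/opt/ml/checkpoints",
        },
    }


# Hypothetical bucket name, for illustration only.
settings = spot_training_settings(3600, 7200, "s3://example-bucket/checkpoints/")
print(settings["StoppingCondition"])
```

In the SageMaker Python SDK the equivalent Estimator arguments are `use_spot_instances`, `max_run`, `max_wait`, and `checkpoint_s3_uri`.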

Category 5: Monitoring & Drift

Scenario	Answer	Key Detail
“Feature mean shifted from $50K to $80K”	Data Quality Monitor	NO ground truth needed
“Accuracy dropped from 95% to 80%”	Model Quality Monitor	NEEDS ground truth labels
“Approving men at higher rate than women”	Bias Drift Monitor	Uses Clarify baseline
“Feature importance shifted”	Explainability Drift	SHAP value changes
“Explain why loan was denied”	Clarify SHAP	Local explanation
“Set up complete monitoring”	Data Capture → Baseline → Schedule → Alarms → Remediation
“Automate retraining on drift”	Monitor → CloudWatch → EventBridge → Pipeline

Critical: Model Monitor requires Data Capture to be enabled first. Without it = no data to monitor.
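
As a rough illustration of the Data Capture prerequisite, the sketch below builds the `DataCaptureConfig` block that goes into a `CreateEndpointConfig` request. The helper name, bucket URI, and the 100% sampling default are assumptions; the rest of the endpoint config (production variants and so on) is left out.

```python
def data_capture_config(destination_s3_uri: str,
                        sampling_percentage: int = 100) -> dict:
    """Build the DataCaptureConfig part of a CreateEndpointConfig request."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        # Capture both sides so inputs can later be matched with predictions.
        "CaptureOptions": [
            {"CaptureMode": "Input"},
            {"CaptureMode": "Output"},
        ],
    }


# Hypothetical bucket name, for illustration only.
config = data_capture_config("s3://example-bucket/data-capture/")
print(config["CaptureOptions"])
```

Monitoring schedules then read from the capture location, which is why enabling capture comes first in the setup chain.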

Category 6: Security & Compliance

Scenario	Answer
“HIPAA-compliant ML pipeline”	KMS + TLS + VPC + Inter-container encryption + CloudTrail
“Data must not leave AWS network”	VPC Endpoints + Private Subnets
“Data scientists can train but not deploy”	IAM policies (allow CreateTrainingJob, deny CreateEndpoint)
“Encrypt with company-managed keys”	KMS Customer Managed Key (CMK)
“Prevent training container from internet”	Network Isolation mode
“Find PII in S3”	Macie
“Handle PII in ETL”	Glue DataBrew
“Audit who accessed what”	CloudTrail
“Rotate database credentials”	Secrets Manager
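
The "train but not deploy" row could be expressed as an IAM policy along the following lines. This is a minimal sketch: the statement IDs are made up, the action lists are not exhaustive, and a usable policy would also need permissions such as `iam:PassRole` and S3 access that are omitted here.

```python
import json

# An explicit Deny always wins over an Allow, so the second statement blocks
# deployment even if another attached policy grants it.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowTraining",
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateTrainingJob",
                "sagemaker:DescribeTrainingJob",
                "sagemaker:StopTrainingJob",
            ],
            "Resource": "*",
        },
        {
            "Sid": "DenyDeployment",
            "Effect": "Deny",
            "Action": [
                "sagemaker:CreateEndpoint",
                "sagemaker:CreateEndpointConfig",
                "sagemaker:UpdateEndpoint",
            ],
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```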

Category 7: Cost Optimization

Current Setup	Optimization
On-Demand training, 12 hours	Spot Training + Checkpointing (up to 90% savings)
GPU for XGBoost	Switch to CPU (ml.m5); XGBoost on typical tabular data rarely justifies GPU cost
Always-on endpoint, 5 req/day	Serverless Inference
GPU inference (ml.p3)	ml.inf1/inf2 (Inferentia) or ml.g4dn
Pipeline reprocesses all data	Step Caching
500 individual endpoints	Multi-Model Endpoint
Frequent back-to-back training jobs, slow startup	Warm Pools
Athena scanning large CSVs	Convert to Parquet + Partition
Large FM inference costs	Bedrock Intelligent Prompt Routing (up to 30% savings)

Category 8: Generative AI / Bedrock

Scenario	Answer	Key Distinction
“Q&A over company documents”	Bedrock Knowledge Bases (RAG)	Retrieve + generate
“AI that understands legal terminology”	Bedrock fine-tuning	Change model knowledge
“Block offensive AI outputs + PII”	Bedrock Guardrails	Content safety
“Multi-step task automation with LLM”	Bedrock Agents	Action Groups + Lambda
“Compare FM output quality”	Bedrock Model Evaluation	Auto or human eval
“Reduce FM inference cost”	Intelligent Prompt Routing	Auto-select model size

Fine-tuning = change model behavior/knowledge. RAG = provide context at inference.

Category 9: Feature Management

Scenario	Answer
“Reuse features across multiple teams”	Feature Store with Feature Groups
“Real-time feature lookup for inference (<10ms)”	Feature Store Online Store
“Prevent future data from leaking into training”	Feature Store Offline Store with point-in-time queries
“Both training and real-time serving from same features”	Enable both Online + Offline stores
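
Point-in-time correctness means each training row only sees feature values recorded at or before its own event time. The sketch below shows the idea in plain Python with a made-up feature history; it illustrates the concept and is not the Feature Store API.

```python
from bisect import bisect_right


def point_in_time_value(feature_history, event_time):
    """Return the latest feature value recorded at or before event_time.

    feature_history is a list of (timestamp, value) pairs sorted by timestamp.
    Returns None when no value existed yet at event_time.
    """
    timestamps = [t for t, _ in feature_history]
    idx = bisect_right(timestamps, event_time)
    return feature_history[idx - 1][1] if idx > 0 else None


# Hypothetical history of one customer's "avg_spend" feature.
history = [(1, 100.0), (5, 150.0), (9, 300.0)]

# A label observed at time 6 must use the value from time 5, not time 9.
print(point_in_time_value(history, 6))
```

Joining on the latest value instead (300.0 here) would leak information from after the event into training.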

Category 10: Transfer Learning vs Incremental

Signal Words	Answer
“limited data”, “pre-trained”, “new task”, “similar domain”	Transfer Learning
“new data arrives”, “without retraining”, “update existing model”	Incremental Learning
“distribution shifted significantly”	Full Retrain (old + new data)

Top 10 Exam Traps

1. Over-Engineering

Weekly batch → answer offers Real-Time Endpoint. Pick Batch Transform.

2. “Operational Overhead” ≠ “Cost”

“Least overhead” = most managed/serverless. “Most cost-effective” = cheapest total. May be different answers.

3. Missing Configuration

Spot without checkpoint = loses progress. Monitor without Data Capture = no data. Model Quality without ground truth = can’t measure.

4. Service Confusion

Processing ≠ Training
Data Quality ≠ Model Quality
Clarify ≠ Model Monitor
Pipelines ≠ Step Functions
Autopilot ≠ Automatic Model Tuning

5. XGBoost Doesn’t Need GPU

“Train XGBoost, choose instance” → ml.m5 (CPU). GPU instances are the expected answer for deep learning, not for XGBoost.

6. Lambda Limits

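Process 2GB file? Lambda is a poor fit (15-minute timeout, 6 MB synchronous payload, 512 MB default /tmp; the 250 MB figure is the unzipped deployment package limit). Use ECS/Fargate/SageMaker Processing.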

7. Accuracy Is Misleading

99.5% accuracy on 99.5/0.5% imbalanced data = useless. Use F1, AUC-ROC, Precision-Recall.
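
A small worked example of this trap, using a synthetic dataset of 1,000 labels with 5 positives and a "model" that always predicts the majority class. The helper function is written for this illustration only.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Define the ratios as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# 995 negatives, 5 positives; the model predicts "negative" every time.
y_true = [0] * 995 + [1] * 5
y_pred = [0] * 1000

print(binary_metrics(y_true, y_pred))
```

Accuracy comes out at 0.995 while recall and F1 are both 0.0: the model never finds a single positive case.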

8. AdaBoost on Noisy/Imbalanced Data

AdaBoost upweights noisy misclassified samples → overfits. No regularization. XGBoost = correct (scale_pos_weight + L1/L2 + missing value handling).

9. Random K-Fold on Time Series

Trains on future, tests on past = leakage. Use walk-forward (time-series split).
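
A minimal sketch of a walk-forward split with an expanding training window, assuming samples are already sorted by time. The function name and parameters are chosen for this illustration; scikit-learn's `TimeSeriesSplit` offers similar behavior.

```python
def walk_forward_splits(n_samples, n_splits, test_size):
    """Return (train_indices, test_indices) pairs in chronological order.

    Every training window ends where its test window begins, so no fold
    trains on data that comes after what it is tested on.
    """
    splits = []
    for k in range(n_splits):
        test_end = n_samples - (n_splits - 1 - k) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested splits")
        train = list(range(0, test_start))
        test = list(range(test_start, test_end))
        splits.append((train, test))
    return splits


for train_idx, test_idx in walk_forward_splits(n_samples=10, n_splits=3, test_size=2):
    print(train_idx, test_idx)
```

With 10 samples this yields test windows [4, 5], [6, 7], and [8, 9], each preceded by all earlier samples as training data.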

10. Bedrock vs SageMaker

GenAI apps (chat, RAG, content) → Bedrock. Custom model training/deployment → SageMaker.

Augmented Manifest (Ground Truth Output)

{"source-ref": "s3://bucket/image1.jpg", "label": 0, "label-metadata": {"confidence": 0.95, "human-annotated": "yes"}}
{"source-ref": "s3://bucket/image2.jpg", "label": 1, "label-metadata": {"confidence": 0.88, "human-annotated": "no"}}

JSON Lines format (one JSON object per line), directly consumable by SageMaker training
"human-annotated": "no" = auto-labeled by active learning
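
Because each line is a standalone JSON object, the manifest can be read line by line. The sketch below parses the two sample records shown above and separates human-labeled from auto-labeled items; the variable names are illustrative.

```python
import json

# The two sample manifest lines from above, as they would appear in the file.
manifest_text = """\
{"source-ref": "s3://bucket/image1.jpg", "label": 0, "label-metadata": {"confidence": 0.95, "human-annotated": "yes"}}
{"source-ref": "s3://bucket/image2.jpg", "label": 1, "label-metadata": {"confidence": 0.88, "human-annotated": "no"}}
"""

# JSON Lines: parse every non-empty line independently.
records = [json.loads(line) for line in manifest_text.splitlines() if line.strip()]

human_labeled = [r["source-ref"] for r in records
                 if r["label-metadata"]["human-annotated"] == "yes"]
auto_labeled = [r["source-ref"] for r in records
                if r["label-metadata"]["human-annotated"] == "no"]

print("human:", human_labeled)
print("auto:", auto_labeled)
```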
