MLA-C01 Scenario Patterns & Exam Traps

How to think through MLA-C01 questions. Learn the patterns, not just the facts.


How to Read Exam Questions

Most questions follow this structure:
  1. SCENARIO — A company/team needs to do X
  2. CONSTRAINT — With requirement Y (cost, latency, compliance)
  3. OPTIONS — 4 answers (2 clearly wrong + 2 plausible)
  4. KEY PHRASE — "MOST cost-effective", "LEAST overhead", "BEST performance"

Critical reading:
  "MOST cost-effective"        = cheapest that meets requirements
  "LEAST operational overhead" = most managed/serverless (NOT cheapest)
  "LOWEST latency"            = fastest response (may cost more)
  "MINIMIZE changes"          = use existing tools, avoid migration

Category 1: Which Endpoint?

| Scenario Keywords | Answer |
|---|---|
| “Consistent traffic, <100ms, real-time API” | Real-Time Endpoint |
| “5 requests/day, cost-sensitive, minimize cost” | Serverless Inference |
| “Large video (up to 1 GB), 10 min processing, >6 MB payload” | Async Inference |
| “Weekly batch scoring, 1M records, scheduled” | Batch Transform |
| “500 models, 10 requests each/day” | Multi-Model Endpoint |
| “Compare 2 models, 10% vs 90% traffic” | Production Variants |
| “Test new model silently, zero user risk” | Shadow Testing |

Real-time endpoints cannot scale to 0. Min = 1 instance. For scale-to-0, use Serverless or Async.
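
The table above can be read as a decision order. The sketch below encodes that order in plain Python; `Workload`, `choose_inference_option`, and the exact thresholds (6 MB payload, 60 seconds of processing, "more than one model") are illustrative assumptions for study purposes, not an AWS API.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an inference workload (not an AWS type)."""
    payload_mb: float          # size of one request payload
    processing_seconds: float  # time needed to serve one request
    scheduled_batch: bool      # offline scoring that runs on a schedule
    num_models: int            # number of distinct models to host
    steady_traffic: bool       # True if request volume is consistent


def choose_inference_option(w: Workload) -> str:
    """Return the SageMaker inference option suggested by the table's keywords."""
    if w.scheduled_batch:
        return "Batch Transform"
    # Assumed thresholds: large payloads or long processing point to Async.
    if w.payload_mb > 6 or w.processing_seconds > 60:
        return "Async Inference"
    # Many rarely-called models share one endpoint.
    if w.num_models > 1 and not w.steady_traffic:
        return "Multi-Model Endpoint"
    # Sporadic traffic on a single model: pay per use, scale to zero.
    if not w.steady_traffic:
        return "Serverless Inference"
    return "Real-Time Endpoint"


print(choose_inference_option(Workload(0.1, 0.05, False, 1, True)))
print(choose_inference_option(Workload(900, 600, False, 1, False)))
```

Checking for batch and large payloads first mirrors how the exam keywords usually rule out the real-time option before cost is considered.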


Category 2: Which Data Service?

| Scenario | Answer | NOT |
|---|---|---|
| “Simple ETL, serverless” | Glue ETL | EMR (overkill) |
| “Petabyte-scale Spark” | EMR | Glue (not enough scale) |
| “No-code data prep for analysts” | Glue DataBrew | Data Wrangler (ML-focused) |
| “ML-specific data prep in SageMaker” | Data Wrangler | DataBrew (not ML-specific) |
| “SQL on S3” | Athena | Redshift (no warehouse needed) |
| “Real-time streaming + custom processing” | Kinesis Data Streams | Firehose (no custom consumers) |
| “Deliver stream to S3” | Data Firehose | Data Streams (simpler) |
| “Migrate existing Kafka” | MSK | Kinesis (different API) |
| “Central feature repo, training + serving” | Feature Store | S3 versioning |
| “Data warehouse with ML predictions” | Redshift ML | Athena |

Category 3: Which Algorithm?

| Scenario | Answer | Key Distinction |
|---|---|---|
| “Tabular classification/regression” | XGBoost | Default choice for structured data |
| “Many categorical features, tabular” | CatBoost | Native categorical support |
| “AutoML, no tuning, highest accuracy” | AutoGluon-Tabular | Ensemble stacking |
| “Time series, 5000 products, forecast” | DeepAR | Multiple related time series |
| “Anomaly detection, no labels, streaming” | Random Cut Forest | Unsupervised anomaly |
| “Suspicious login IPs” | IP Insights | Entity-IP associations |
| “Sentiment analysis, custom data” | BlazingText | Text classification with training |
| “Sentiment analysis, no training needed” | Comprehend | Managed AI service |
| “Pixel-level image segmentation” | Semantic Segmentation | Per-pixel labels |
| “Sparse click data, recommendations” | Factorization Machines | Pairwise interactions |
| “Managed recommendations, no ML needed” | Personalize | Managed service |
| “Document OCR, forms, tables” | Textract | No training needed |
| “Custom image classifier, 10 images” | Rekognition Custom Labels | Few images |

Category 4: Training Troubleshooting

| Error / Symptom | Cause | Fix |
|---|---|---|
| “Insufficient disk space” | File mode, data too large | Pipe Mode or FastFile Mode, or increase VolumeSizeInGB |
| “Cannot pull Docker image” | Missing ECR permissions/VPC endpoint | Add ECR VPC endpoints + IAM |
| “Cannot access S3 data” | Missing S3 permission/VPC endpoint | Add S3 VPC endpoint + IAM |
| “CUDA out of memory” | Model too large for GPU | Smaller batch size, Model Parallelism, or larger GPU |
| “MaxWaitTimeExceeded” | Spot instance not available | Increase max_wait or switch to On-Demand |
| “Loss not decreasing” | Learning rate issue | Check LR (too high/low), check data |
| “Training 10x slower than expected” | Wrong instance type | CPU for deep learning → switch to GPU |
| “Spot training restarts from scratch” | No checkpointing configured | Enable checkpoint_s3_uri |
| “GPU utilization only 20%” | I/O bottleneck | Use Pipe mode, increase batch size, more data workers |
| “Training costs too high” | On-demand instances | Spot Training + Checkpointing (up to 90% savings) |
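
The Spot rows above depend on three settings being present together. As a minimal sketch, the helper below builds the Spot-related part of a `CreateTrainingJob` request as a plain dictionary; the helper name and the bucket path are hypothetical, and the rest of the request (algorithm, role, data channels) is omitted.

```python
def spot_training_settings(checkpoint_s3_uri: str,
                           max_runtime_s: int,
                           max_wait_s: int) -> dict:
    """Build the Spot-related fields of a CreateTrainingJob request.

    Hypothetical helper: it only assembles a dict and calls no AWS API.
    """
    # The wait time covers waiting for Spot capacity plus the run itself,
    # so it must be at least as large as the runtime limit.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "EnableManagedSpotTraining": True,
        # Without a checkpoint location, an interrupted job starts over.
        "CheckpointConfig": {"S3Uri": checkpoint_s3_uri},
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
    }


# Example with a placeholder bucket name.
settings = spot_training_settings("s3://example-bucket/checkpoints/", 3600, 7200)
print(settings["StoppingCondition"])
```

In the SageMaker Python SDK the equivalent estimator arguments are `use_spot_instances`, `checkpoint_s3_uri`, `max_run`, and `max_wait`.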

Category 5: Monitoring & Drift

| Scenario | Answer | Key Detail |
|---|---|---|
| “Feature mean shifted from $50K to $80K” | Data Quality Monitor | NO ground truth needed |
| “Accuracy dropped from 95% to 80%” | Model Quality Monitor | NEEDS ground truth labels |
| “Approving men at higher rate than women” | Bias Drift Monitor | Uses Clarify baseline |
| “Feature importance shifted” | Explainability Drift | SHAP value changes |
| “Explain why loan was denied” | Clarify SHAP | Local explanation |
| “Set up complete monitoring” | Data Capture → Baseline → Schedule → Alarms → Remediation | |
| “Automate retraining on drift” | Monitor → CloudWatch → EventBridge → Pipeline | |

Critical: Model Monitor requires Data Capture to be enabled first. Without it = no data to monitor.
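
To see why data drift needs no ground truth, consider the “$50K to $80K” row: the check compares captured inputs against a baseline and never looks at labels. The sketch below is a simplified stand-in for that idea, not Model Monitor's actual statistics; the function names, sample values, and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def mean_shift_in_stds(baseline: list, current: list) -> float:
    """How far the current mean moved, in units of the baseline std dev."""
    return abs(mean(current) - mean(baseline)) / stdev(baseline)


def has_drifted(baseline: list, current: list, threshold: float = 3.0) -> bool:
    """Flag drift when the mean shift exceeds an assumed threshold."""
    return mean_shift_in_stds(baseline, current) > threshold


# Illustrative values only: income feature at training time vs. in production.
baseline_income = [48_000, 50_000, 52_000, 49_000, 51_000]
current_income = [78_000, 80_000, 82_000, 79_000, 81_000]

print(has_drifted(baseline_income, current_income))   # inputs alone are enough
print(has_drifted(baseline_income, baseline_income))
```

Model quality, by contrast, cannot be computed this way, because accuracy requires the true label for each captured prediction.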


Category 6: Security & Compliance

| Scenario | Answer |
|---|---|
| “HIPAA-compliant ML pipeline” | KMS + TLS + VPC + Inter-container encryption + CloudTrail |
| “Data must not leave AWS network” | VPC Endpoints + Private Subnets |
| “Data scientists can train but not deploy” | IAM policies (allow CreateTrainingJob, deny CreateEndpoint) |
| “Encrypt with company-managed keys” | KMS Customer Managed Key (CMK) |
| “Prevent training container from internet” | Network Isolation mode |
| “Find PII in S3” | Macie |
| “Handle PII in ETL” | Glue DataBrew |
| “Audit who accessed what” | CloudTrail |
| “Rotate database credentials” | Secrets Manager |
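
The “train but not deploy” row can be sketched as an IAM policy document. The example below builds it as a Python dictionary and prints the JSON; the `Sid` values and the choice of `"Resource": "*"` are simplifying assumptions, and a real policy would normally scope resources and include the other permissions training needs (S3, ECR, `iam:PassRole`).

```python
import json


def train_but_not_deploy_policy() -> dict:
    """Illustrative IAM policy: allow training jobs, explicitly deny endpoints."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowTraining",
                "Effect": "Allow",
                "Action": [
                    "sagemaker:CreateTrainingJob",
                    "sagemaker:DescribeTrainingJob",
                ],
                "Resource": "*",  # assumption: unscoped for brevity
            },
            {
                # An explicit Deny overrides any Allow granted elsewhere.
                "Sid": "DenyDeployment",
                "Effect": "Deny",
                "Action": [
                    "sagemaker:CreateEndpoint",
                    "sagemaker:CreateEndpointConfig",
                    "sagemaker:UpdateEndpoint",
                ],
                "Resource": "*",
            },
        ],
    }


policy = train_but_not_deploy_policy()
print(json.dumps(policy, indent=2))
```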

Category 7: Cost Optimization

| Current Setup | Optimization |
|---|---|
| On-Demand training, 12 hours | Spot Training + Checkpointing (up to 90% savings) |
| GPU for XGBoost | Switch to CPU (ml.m5); XGBoost usually gains little from GPU |
| Always-on endpoint, 5 req/day | Serverless Inference |
| GPU inference (ml.p3) | ml.inf1/inf2 (Inferentia) or ml.g4dn |
| Pipeline reprocesses all data | Step Caching |
| 500 individual endpoints | Multi-Model Endpoint |
| Training restarts frequently | Warm Pools |
| Athena scanning large CSVs | Convert to Parquet + Partition |
| Large FM inference costs | Bedrock Intelligent Prompt Routing (up to 30% savings) |

Category 8: Generative AI / Bedrock

| Scenario | Answer | Key Distinction |
|---|---|---|
| “Q&A over company documents” | Bedrock Knowledge Bases (RAG) | Retrieve + generate |
| “AI that understands legal terminology” | Bedrock fine-tuning | Change model knowledge |
| “Block offensive AI outputs + PII” | Bedrock Guardrails | Content safety |
| “Multi-step task automation with LLM” | Bedrock Agents | Action Groups + Lambda |
| “Compare FM output quality” | Bedrock Model Evaluation | Auto or human eval |
| “Reduce FM inference cost” | Intelligent Prompt Routing | Auto-select model size |

Fine-tuning = change model behavior/knowledge. RAG = provide context at inference.


Category 9: Feature Management

| Scenario | Answer |
|---|---|
| “Reuse features across multiple teams” | Feature Store with Feature Groups |
| “Real-time feature lookup for inference (<10ms)” | Feature Store Online Store |
| “Prevent future data from leaking into training” | Feature Store Offline Store with point-in-time queries |
| “Both training and real-time serving from same features” | Enable both Online + Offline stores |
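
The point-in-time row is about one rule: for each training example, use only the latest feature value recorded at or before that example's timestamp. A minimal sketch of that rule in plain Python follows; the function name, the integer timestamps, and the sample history are assumptions, and in practice the offline store is queried with SQL or Spark rather than a helper like this.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_history: list, as_of: int):
    """Return the latest feature value with event_time <= as_of.

    feature_history: list of (event_time, value) tuples sorted by event_time.
    Returns None when no value existed yet at `as_of`.
    """
    times = [t for t, _ in feature_history]
    idx = bisect_right(times, as_of)  # count of records at or before as_of
    return feature_history[idx - 1][1] if idx else None


# Hypothetical history of one feature: (event_time, value).
history = [(1, 10.0), (5, 12.0), (9, 20.0)]

# A label observed at time 6 must not see the value written at time 9.
print(point_in_time_lookup(history, as_of=6))
```

Joining on the newest value regardless of timestamp would hand the model information from the future, which is the leakage the scenario describes.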

Category 10: Transfer Learning vs Incremental

| Signal Words | Answer |
|---|---|
| “limited data”, “pre-trained”, “new task”, “similar domain” | Transfer Learning |
| “new data arrives”, “without retraining”, “update existing model” | Incremental Learning |
| “distribution shifted significantly” | Full Retrain (old + new data) |

Top 10 Exam Traps

1. Over-Engineering

Weekly batch → answer offers Real-Time Endpoint. Pick Batch Transform.

2. “Operational Overhead” ≠ “Cost”

“Least overhead” = most managed/serverless. “Most cost-effective” = cheapest total. May be different answers.

3. Missing Configuration

Spot without checkpoint = loses progress. Monitor without Data Capture = no data. Model Quality without ground truth = can’t measure.

4. Service Confusion

  • Processing ≠ Training
  • Data Quality ≠ Model Quality
  • Clarify ≠ Model Monitor
  • Pipelines ≠ Step Functions
  • Autopilot ≠ Automatic Model Tuning

5. XGBoost Doesn’t Need GPU

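“Train XGBoost, choose instance” → ml.m5 (CPU). XGBoost is typically memory-bound, so a general-purpose CPU instance is the expected answer; GPU instances are mainly for deep learning.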

6. Lambda Limits

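Process a 2 GB file? Lambda is a poor fit: 15-minute timeout, 512 MB of /tmp storage by default, and a 250 MB limit on the unzipped deployment package. Use ECS/Fargate or SageMaker Processing.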

7. Accuracy Is Misleading

99.5% accuracy on 99.5/0.5% imbalanced data = useless. Use F1, AUC-ROC, Precision-Recall.
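
A short worked example makes the trap concrete. The sketch below scores a model that always predicts the majority class on a 995/5 split; the helper name and the synthetic labels are illustrative, and the metrics are computed by hand so no library is needed.

```python
def classification_metrics(y_true: list, y_pred: list) -> dict:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 995 negatives, 5 positives; the "model" predicts negative every time.
y_true = [0] * 995 + [1] * 5
y_pred = [0] * 1000

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy is high, yet recall and F1 are zero
```

The model never finds a single positive case, which accuracy hides and recall or F1 exposes.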

8. AdaBoost on Noisy/Imbalanced Data

AdaBoost upweights noisy misclassified samples → overfits. No regularization. XGBoost = correct (scale_pos_weight + L1/L2 + missing value handling).
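
For the `scale_pos_weight` part, a common starting value is the ratio of negative to positive examples. The helper below is a small sketch of that calculation; the function name and the 995/5 label set are assumptions chosen to match the imbalance example in trap 7.

```python
def suggested_scale_pos_weight(labels: list) -> float:
    """Return negatives / positives, a common starting point for scale_pos_weight."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples; ratio is undefined")
    return negatives / positives


# Illustrative labels: 995 negatives and 5 positives.
labels = [0] * 995 + [1] * 5
print(suggested_scale_pos_weight(labels))
```

The result would be passed as the `scale_pos_weight` hyperparameter and then tuned like any other setting.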

9. Random K-Fold on Time Series

Trains on future, tests on past = leakage. Use walk-forward (time-series split).
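
A walk-forward split can be sketched in a few lines. The generator below uses an expanding training window followed by a fixed-size test window; the function name and parameters are illustrative, and libraries such as scikit-learn offer a comparable `TimeSeriesSplit`.

```python
def walk_forward_splits(n_samples: int, n_splits: int, test_size: int):
    """Yield (train_indices, test_indices) where training always precedes testing."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        test_start = first_test_start + i * test_size
        test_end = test_start + test_size
        # Train on everything strictly before the test window.
        yield list(range(test_start)), list(range(test_start, test_end))


# Ten time-ordered observations, three folds of two test points each.
for train_idx, test_idx in walk_forward_splits(10, n_splits=3, test_size=2):
    print("train:", train_idx, "test:", test_idx)
```

Every test index is later than every training index in its fold, so the model is never evaluated on the past after seeing the future.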

10. Bedrock vs SageMaker

GenAI apps (chat, RAG, content) → Bedrock. Custom model training/deployment → SageMaker.


Augmented Manifest (Ground Truth Output)

{"source-ref": "s3://bucket/image1.jpg", "label": 0, "label-metadata": {"confidence": 0.95, "human-annotated": "yes"}}
{"source-ref": "s3://bucket/image2.jpg", "label": 1, "label-metadata": {"confidence": 0.88, "human-annotated": "no"}}
  • JSON Lines format — directly consumable by SageMaker training
  • "human-annotated": "no" = auto-labeled by active learning
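
Because the manifest is JSON Lines, each line parses independently. The sketch below reads the two sample records shown above and separates human-labeled from auto-labeled items; `load_manifest` and the output field names are hypothetical, and it assumes the metadata key is the label attribute name followed by `-metadata`, as in the sample.

```python
import json

# The two sample lines from above, one JSON object per line.
MANIFEST = """\
{"source-ref": "s3://bucket/image1.jpg", "label": 0, "label-metadata": {"confidence": 0.95, "human-annotated": "yes"}}
{"source-ref": "s3://bucket/image2.jpg", "label": 1, "label-metadata": {"confidence": 0.88, "human-annotated": "no"}}
"""


def load_manifest(text: str, label_attribute: str = "label") -> list:
    """Parse an augmented manifest into a list of simple records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        obj = json.loads(line)
        meta = obj.get(f"{label_attribute}-metadata", {})
        records.append({
            "uri": obj["source-ref"],
            "label": obj[label_attribute],
            "confidence": meta.get("confidence"),
            "human": meta.get("human-annotated") == "yes",
        })
    return records


records = load_manifest(MANIFEST)
auto_labeled = [r["uri"] for r in records if not r["human"]]
print(auto_labeled)
```

Filtering on the `human` flag is one way to review or down-weight auto-labeled items before training.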