This is the final test. 15 tricky scenario questions, debugging exercises, and gotchas that trip up experienced engineers. Read the scenario, think, then reveal the answer.
For each scenario, pause and formulate your answer before clicking "Reveal." In a real interview, you need to think out loud — practice that here.
"Your team's daily Spark ETL job costs $10,000/day. The CEO wants it halved. Where do you start?"
"Your S3 storage bill tripled in 3 months but data volume only grew 20%. What happened?"
"Your always-on Spark cluster averages 15% CPU utilization but the team insists they need it 24/7. What do you do?"
"An analyst ran SELECT * FROM a 50 TB table in BigQuery (on-demand pricing). The query cost $250. How do you prevent this?"
1) Set maximumBytesBilled per query (e.g., a 1 TB limit). 2) Require partition filters on large tables. 3) Create authorized views that pre-filter data. 4) Set up project-level cost alerts. 5) Consider switching heavy users to flat-rate BigQuery slots. 6) Educate users: always use WHERE clauses and SELECT only needed columns. (The byte cap is sketched in code after the next scenario.)

"A backfill of 2 years of data was estimated at $3K but actually cost $18K. What went wrong?"
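Circling back to the runaway SELECT * scenario above: a minimal sketch of the per-query byte cap, assuming the google-cloud-bigquery Python client (project, dataset, and table names are hypothetical).

```python
from google.cloud import bigquery

client = bigquery.Client()

# Refuse to run any query that would bill more than ~1 TB;
# BigQuery errors out instead of silently charging for a full-table scan.
job_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**12)

query = """
    SELECT order_id, amount                  -- only the columns we need
    FROM `my-project.sales.orders`           -- hypothetical table
    WHERE order_date >= '2024-01-01'         -- partition filter limits the scan
"""

job = client.query(query, job_config=job_config)
for row in job.result():
    print(row.order_id, row.amount)
```

Pairing this with require_partition_filter=true on the table definition makes unfiltered scans fail outright.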
"A join between a 200 GB table and a 500 GB table takes 4 hours. Both are well-partitioned. What do you check?"
"Your Kafka-to-Delta streaming job produces 50K files/day. Storage costs are growing 5x faster than expected."
1) Increase the streaming trigger interval so each micro-batch writes fewer, larger files. 2) Enable auto-compaction (delta.autoOptimize.autoCompact). 3) Schedule OPTIMIZE to run hourly/daily. 4) Use optimizeWrite to coalesce partitions during write. 5) Run VACUUM regularly to clean old files. Going from 50K files/day to ~200 files/day can cut storage overhead by 90%. (The compaction settings are sketched in code below.)

These traps catch even experienced engineers. Know them before your interview.
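Before the gotchas, a quick sketch of the compaction knobs from the streaming answer above. The autoOptimize properties are Databricks-flavored Delta and may not exist in older versions; the events table name is hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-compaction").getOrCreate()

# Ask Delta to coalesce writes and compact small files automatically
# (property names per the Databricks Delta docs; availability varies by version).
spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact'   = 'true'
    )
""")

# Scheduled compaction job: rewrite the many small files into ~1 GB files.
spark.sql("OPTIMIZE events")
```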
SELECT * scans every column in columnar formats. A 50-column Parquet table scanned fully costs roughly 10x as much as selecting 5 columns.
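A column-pruning sketch on the Spark side (the path and column names are hypothetical); the same idea applies to SELECT lists in BigQuery.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Parquet is columnar: selecting 5 of 50 columns means the reader only
# touches those 5 column chunks, cutting both bytes scanned and runtime.
events = (
    spark.read.parquet("s3://my-bucket/warehouse/events/")   # hypothetical path
    .select("event_id", "user_id", "event_ts", "country", "amount")
)
```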
Partitioning by user_id (10M users) creates 10M micro-partitions. Use Z-ordering or bucketing instead.
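A sketch of the Z-ordering alternative, assuming a Delta Lake table (paths and column names are hypothetical): keep the physical partitioning coarse and cluster by the high-cardinality key within files.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = spark.read.parquet("s3://my-bucket/warehouse/events/")  # hypothetical source

# Partition by a low-cardinality column (date), not by the 10M-value user_id.
(events.write
    .format("delta")
    .partitionBy("event_date")
    .mode("overwrite")
    .save("s3://my-bucket/warehouse/events_delta"))

# Cluster data files by user_id within each partition; lookups on user_id still prune well.
spark.sql("OPTIMIZE delta.`s3://my-bucket/warehouse/events_delta` ZORDER BY (user_id)")
```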
Delta Lake keeps old file versions. Without VACUUM, storage doubles every few weeks from retained history.
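A minimal VACUUM sketch using the delta-spark Python API (the path is hypothetical; the session is assumed to have Delta configured).

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

table = DeltaTable.forPath(spark, "s3://my-bucket/warehouse/events_delta")

# Delete data files no longer referenced by any table version within the last 168 hours (7 days).
# A shorter retention window reclaims storage sooner but limits time travel.
table.vacuum(168)
```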
Spark's default of 200 shuffle partitions (spark.sql.shuffle.partitions) fits almost no job: 200 partitions for 1 TB = 5 GB per partition (OOM risk); 200 partitions for 1 GB = 5 MB each (overhead waste). Tune it.
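A sketch of the two common fixes in Spark 3.x: size shuffle partitions explicitly, or let Adaptive Query Execution coalesce them at runtime.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("shuffle-tuning")
    # Explicit sizing: aim for a few hundred MB per shuffle partition.
    .config("spark.sql.shuffle.partitions", "2000")   # ~500 MB each for a 1 TB shuffle
    # Or let AQE merge small post-shuffle partitions automatically.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    .getOrCreate()
)
```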
A cluster running 24/7 at 15% utilization wastes 85% of spend. Auto-terminate + auto-scale saves 60-80%.
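A sketch of those settings via the Databricks Clusters REST API (the workspace URL, token, runtime version, and node type are placeholders).

```python
import requests

WORKSPACE = "https://my-workspace.cloud.databricks.com"   # hypothetical workspace URL
TOKEN = "dapiXXXXXXXX"                                     # hypothetical access token

payload = {
    "cluster_name": "etl-autoscaling",
    "spark_version": "14.3.x-scala2.12",                   # hypothetical runtime version
    "node_type_id": "m5.xlarge",                           # hypothetical node type
    "autoscale": {"min_workers": 2, "max_workers": 8},     # scale with load, not a fixed size
    "autotermination_minutes": 30,                         # shut down after 30 idle minutes
}

resp = requests.post(
    f"{WORKSPACE}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```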
Reprocessing 2 TB daily when only 5 GB is new. Incremental processing would be 400x cheaper.
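A sketch of the incremental pattern (paths and the partition column are hypothetical): read only the new partition instead of rescanning the full table.

```python
from datetime import date, timedelta

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

run_date = str(date.today() - timedelta(days=1))

# Partition pruning: Spark reads only yesterday's ~5 GB slice, not the full 2 TB table.
new_rows = (
    spark.read.format("delta")
    .load("s3://my-bucket/warehouse/raw_events")
    .where(F.col("event_date") == run_date)
)

(new_rows
    .write.format("delta")
    .mode("append")
    .save("s3://my-bucket/warehouse/curated_events"))
```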
If you scored 8+, you're interview-ready on cost optimization. 5-7: review Modules 2-5. Under 5: restart from Module 1 and take notes on each callout box. Remember: interviewers want to hear structured thinking, not just the right answer.
When answering cost questions, always use this structure: 1) Identify the cost driver. 2) Explain why it's expensive. 3) Propose 2-3 concrete optimizations with trade-offs. 4) Quantify the expected savings if possible.