Grain is the first question to answer when designing any table. It defines what one row represents.
If grain is unclear, you will get duplicates, incorrect aggregations, and ambiguous join keys. Every interview answer about fact tables should start with grain.
Common Grain Examples
One row per order line item
One row per user per day
One row per click event
One row per account per month
💡 Interview Gold
"The grain determines which metrics are additive, how joins behave, and what can be uniquely identified. Always state the grain before discussing any model design."
Fact Tables vs Dimension Tables
These two pillars form the backbone of dimensional modeling. Know the distinction cold.
📊 Fact Tables
📌 Store measurable events (orders, clicks, payments)
📌 Contain foreign keys to dimensions
📌 Hold numeric measures (amount, quantity, duration)
📌 Typically the largest tables in the warehouse
📌 Answer: "what happened?"
📌 Grain defines what one row means
🏷️ Dimension Tables
📌 Store descriptive attributes (user, product, date)
📌 Provide context for slicing and filtering
📌 Hold text attributes (name, category, region)
📌 Typically wide with many columns, fewer rows
📌 Answer: "who / what / where / when?"
📌 Usually have a surrogate primary key
🎯 Interview Tip
"Facts are verbs — they record events. Dimensions are nouns — they describe the context of those events. A well-designed model joins many-to-one from fact to dimension."
Types of Fact Tables
Interviewers often ask you to pick the right fact type for a scenario. Know all three:
📝 Transactional Fact
One row per business event. Most common type. Grain = individual transaction (one order line, one click, one payment). Fully additive measures.
📸 Periodic Snapshot
One row per entity per time period. Captures state at regular intervals. Grain = account per month, inventory per day. Semi-additive measures (can't sum balances across time).
🔄 Accumulating Snapshot
One row per process lifecycle. Updated as milestones complete. Grain = one order from creation to delivery. Multiple date columns (ordered_at, shipped_at, delivered_at).
-- Transactional fact: one row per order lineSELECT order_line_id, order_id, product_key, customer_key,
date_key, quantity, unit_price, discount_amount
FROM fact_order_lines;
-- Periodic snapshot: one row per account per monthSELECT account_key, month_key, ending_balance,
total_deposits, total_withdrawals
FROM fact_account_monthly_snapshot;
-- Accumulating snapshot: one row per order lifecycleSELECT order_key, ordered_date_key, shipped_date_key,
delivered_date_key, returned_date_key, current_status
FROM fact_order_pipeline;
Factless Fact Tables & Degenerate Dimensions
Two patterns that trip up candidates — know them before the interview.
📋 Factless Fact Tables
📌 Record events with no numeric measures
📌 The fact itself is the event occurrence
📌 Example: "student attended class" — no amount, just the relationship
📌 Used for coverage analysis, eligibility, event tracking
📌 Avoids creating a dimension with a single column
💡 Interview Gold
"A factless fact table captures relationships or events without measures — like campaign eligibility or class attendance. A degenerate dimension is an identifier that lives directly in the fact table because it has no descriptive attributes to justify a separate dimension."
Additive vs Semi-Additive vs Non-Additive
Measure additivity determines how you can aggregate a fact. Mess this up and your reports show nonsensical numbers.
✅ Fully Additive
Can be summed across all dimensions. Revenue, quantity, discount_amount. Most measures in transactional facts are additive.
⚠️ Semi-Additive
Can be summed across some dimensions but NOT time. Account balance, inventory count. Sum across accounts is fine; sum across months is meaningless.
❌ Non-Additive
Cannot be summed at all. Ratios, percentages, unit prices. Must recalculate from components. Average price = SUM(revenue) / SUM(quantity).
-- Additive: SUM works across all dimensionsSELECT region, SUM(revenue) FROM fact_sales GROUP BY region; -- ✅-- Semi-additive: SUM across accounts OK, across time NOT OKSELECT branch, SUM(ending_balance) FROM fact_account_snapshot
WHERE month_key = '2024-12'GROUP BY branch; -- ✅ (single month)-- Non-additive: never SUM a ratio directlySELECT region, SUM(revenue) / SUM(quantity) AS avg_price
FROM fact_sales GROUP BY region; -- ✅ (recalculated from components)
🎯 Interview Tip
"Never store pre-calculated ratios in fact tables. Store the numerator and denominator separately so they remain additive. The BI layer or query computes the ratio at report time."
Quiz: Test Yourself
Q1: A colleague designed a fact table but can't explain what one row represents. What's missing?
Q2: You need to track daily inventory levels for each warehouse. Which fact type fits best?
Q3: A monthly account balance (e.g., $50K in Jan, $60K in Feb) is which type of measure?
Q4: You need to track which students attended which classes. There's no dollar amount or quantity. What do you build?
Q5: Where should order_number live if it has no attributes beyond itself?