The star schema is the most widely used dimensional modeling pattern. A central fact table connects to multiple dimension tables in a hub-and-spoke layout.
"Star schemas optimize for query simplicity and read performance. All joins are fact-to-dimension (many-to-one), which BI tools understand natively. Denormalized dimensions mean fewer joins per query."
A snowflake schema normalizes dimensions into sub-tables. Instead of one flat dim_product, you split into dim_product → dim_category → dim_department.
A conformed dimension is the same dimension used consistently across multiple fact tables. This is what makes enterprise-wide analytics possible.
Shared across fact_sales, fact_support_tickets, fact_web_visits. Same date_key, same fiscal calendar attributes everywhere.
Used by orders, support, campaigns, and billing facts. One definition of "customer" prevents conflicting counts across teams.
Shared by sales, inventory, returns. Conformed attributes like category, brand, and department enable cross-fact analysis.
"Without conformed dimensions, you can't drill across fact tables. Marketing says '10K customers,' Sales says '12K.' The difference? Different definitions of 'customer.' Conformed dimensions fix this."
Interviewers may ask how Data Vault compares to Kimball star schemas. Know the three building blocks:
Stores unique business keys (customer_id, product_id). Immutable — once loaded, never changes. The anchor entity.
Captures relationships between hubs. Example: "Customer purchased Product." Stores only foreign keys to hubs. No business attributes.
Holds descriptive attributes with full history (load date, record source). All changes are appended, never overwritten. Enables time-travel queries.
"Data Vault is for the integration layer — it captures everything with full history. Star schemas are for the presentation layer — optimized for consumption. Many architectures use Data Vault underneath with star schemas on top."
BI tools query directly. Ad-hoc analytics. Performance matters. Analysts need simple self-service. Most common choice for presentation layer.
Dimension hierarchies change frequently. Storage is very constrained. You need strict normalization for audit. Some cloud warehouses optimize snowflake joins well.
Many source systems feed one warehouse. Requirements change frequently. Full history and auditability required. Large enterprise with evolving schemas.
"In practice, star schema is the default for 80% of warehouse use cases. Snowflake is rare in modern columnar warehouses because the storage savings are negligible. Data Vault is an integration pattern, not a replacement for dimensional modeling."