Pages

Sunday, March 15, 2015

Star Schema Vs Snowflake Schema




What Is a Star Schema?
A star schema model can be described as a simple star, a central table contains fact data and multiple tables spread out out from it, connected by the primary and foreign keys of the database. In a star schema implementation, Warehouse Builder stores the dimension data in a single table or view for all the dimension levels.



What Is a Snowflake Schema?
The snowflake schema represents a dimensional model which is also composed of a central fact table and a set of constituent dimension tables which are further normalized into sub-dimension tables. In a snowflake schema implementation, Warehouse Builder uses more than one table or view to store the dimension data. Separate database tables or views store data pertaining to each level in the dimension.

 Snowflake Schema vs. Star Schema
When choosing a database schema for a data warehouse, snowflake and star schemas tend to be popular choices. This comparison discusses suitability of star vs. snowflake schemas in different scenarios and their characteristics.

Comparison chart

Snowflake Schema
Star Schema
Ease of maintenance
No redundancy, so snowflake schemas are easier to maintain and change.
Has redundant data and hence less easy to maintain/change
Ease of Use
More complex queries and hence less easy to understand
Lower query complexity and easy to understand
Query Performance
More foreign keys and hence longer query execution time (slower)
Less number of foreign keys and hence shorter query execution time (faster)
Type of Data warehouse
Good to use for data warehouse core to simplify complex relationships (many to many)
Good for data marts with simple relationships (1 to 1 or 1 to many)
Joins
Higher number of Joins
Fewer Joins
Dimension table
A snowflake schema may have more than one dimension table for each dimension.
A star schema contains only single dimension table for each dimension.
When to use
When dimension table is relatively big in size, snowflaking is better as it reduces space.
When dimension table contains less number of rows, we can choose Star schema.
Normalization /
De-Normalization
Dimension Tables are in Normalized form but Fact Table is in De-Normalized form.
Both Dimension and Fact Tables are in De-Normalized form.
Data model
Bottom up approach
Top down approach


Consider an Example
Consider a database for a retailer that has many stores, with each store selling many products in many product categories and of various brands. A data warehouse or data mart for such a retailer would need to provide analysts the ability to run sales reports grouped by store, date (or month, quarter or year), or product category or brand.

Star Schema Example If this data mart was using a star schema, it would look as follows:

Diagram of STAR Schema

Diagram of STAR Schema

The fact table would be a record of sales transactions, while there are dimension tables for date, store and product. Dimension tables are each connected to the fact table via their primary key, which is a foreign key for the fact table. For example, instead of storing the actual transaction date in a row of the fact table, the date_id is stored. This date_id corresponds to a unique row in the Dim_Date table, and that row also stores other attributes of the date that are required for grouping in reports. e.g., day of the week, month, quarter of the year and so on. The data is de-normalized for easier reporting. Here is how one would get a report of number of televisions sold by brand and by country.
SQL Pattern of Star Schema
SQL Pattern of Star Schema
















Snowflake Schema Example
The same scenario can also use a snowflake schema, in which case it would be structured as follows:

Diagram of Snowflake Schema
Diagram of Snowflake Schema





The main difference, when compared with the star schema, is that data in dimension tables is more normalized. For example, instead of storing month, quarter and day of the week in each row of the Dim_Date table, these are further broken out into their own dimension tables. Similarly for the Dim_Store table, the state and country are geographical attributes that are one step removed — instead of being stored in the Dim_Store table, they are now stored in a separate Dim_Geography table.
SQL Pattern of Snowflake Schema
SQL Pattern of Snowflake Schema



  

1 comment:

  1. Play Online at The Casino DrmCD
    Play Online at The Casino DrmCD for real 인천광역 출장안마 money. Discover top 평택 출장마사지 rated online 군산 출장마사지 casino games like Blackjack, Roulette, Slots, Video 통영 출장마사지 Poker, Video Poker, 용인 출장샵 Video

    ReplyDelete