{"id":2029,"date":"2025-08-18T02:31:00","date_gmt":"2025-08-18T02:31:00","guid":{"rendered":"https:\/\/www.cmarix.com\/qanda\/?p=2029"},"modified":"2026-02-05T11:59:44","modified_gmt":"2026-02-05T11:59:44","slug":"handling-slowly-changing-dimensions-scd-in-etl","status":"publish","type":"post","link":"https:\/\/www.cmarix.com\/qanda\/handling-slowly-changing-dimensions-scd-in-etl\/","title":{"rendered":"How Do You Handle Slowly Changing Dimensions (SCD) in ETL?"},"content":{"rendered":"\n<p>Slowly changing dimension is a concept in data warehousing, where the attribute values in a dimension table change over-time but gradually. A real-world example of this can be how a customer could want to change their address, or an employee might need to switch departments.<\/p>\n\n\n\n<p>In such situations, you need to assess:<\/p>\n\n\n\n<p><strong>Do I overwrite the old data, or preserve history?<\/strong><\/p>\n\n\n\n<p>Handling SCDs correctly is critical for accurate historical reporting, trend analysis, and regulatory compliance.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Is a Dimension Table?<\/h2>\n\n\n\n<p>A <strong>dimension table<\/strong> stores descriptive attributes (e.g., customer name, product category) that relate to <strong>facts<\/strong> in a fact table (e.g., sales, transactions).<\/p>\n\n\n\n<p>Slowly changing dimensions require a strategy to manage data that <strong>changes gradually<\/strong> but must be tracked.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Types of Slowly Changing Dimensions (SCD)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Type<\/strong><\/td><td><strong>Description<\/strong><\/td><td><strong>Use Case Example<\/strong><\/td><\/tr><tr><td><strong>Type 0<\/strong><\/td><td>No change is allowed (read-only)<\/td><td>Immutable attributes like Date of Birth<\/td><\/tr><tr><td><strong>Type 1<\/strong><\/td><td>Overwrite the old value<\/td><td>Fixing a typo in the product name<\/td><\/tr><tr><td><strong>Type 2<\/strong><\/td><td>Keep history by inserting a new row<\/td><td>Tracking customer address over time<\/td><\/tr><tr><td><strong>Type 3<\/strong><\/td><td>Keep limited history in the same row (e.g., old &amp; new)<\/td><td>Keep last and current region<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Implementing SCD Type 2 \u2013 Most Common Approach<\/h3>\n\n\n\n<p>Type 2 is the most frequently used for <strong>auditability and historical accuracy<\/strong>.<\/p>\n\n\n\n<p>You maintain multiple versions of the same entity by adding:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A <strong>surrogate key<\/strong> (unique ID for each version)<\/li>\n\n\n\n<li><strong>Valid from<\/strong> and <strong>valid to<\/strong> timestamps<\/li>\n\n\n\n<li>A <strong>current flag<\/strong> (true\/false)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example: SCD Type 2 in SQL<\/h3>\n\n\n\n<p>Let\u2019s say we\u2019re tracking changes in an employee\u2019s department.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Step 1: Check if any changes\nSELECT * FROM employee_dim\nWHERE employee_id = 123 AND current_flag = TRUE;\n\n-- Step 2: If changed, mark old row as inactive\nUPDATE employee_dim\nSET current_flag = FALSE,\n    valid_to = CURRENT_DATE\nWHERE employee_id = 123 AND current_flag = TRUE;\n\n-- Step 3: Insert new version of the row\nINSERT INTO employee_dim (\n    employee_id, name, department, valid_from, valid_to, current_flag\n)\nVALUES (\n    123, 'John Doe', 'Marketing',\n    CURRENT_DATE, NULL, TRUE\n);<\/code><\/pre>\n\n\n\n<p>This preserves the full history of department changes over time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">SCD Handling in ETL Tools<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Tool<\/strong><\/td><td><strong>Feature<\/strong><\/td><\/tr><tr><td><strong>dbt<\/strong><\/td><td>dbt_scd macros (custom or via packages)<\/td><\/tr><tr><td><strong>Informatica<\/strong><\/td><td>Out-of-the-box SCD management<\/td><\/tr><tr><td><strong>Talend<\/strong><\/td><td>Built-in SCD components<\/td><\/tr><tr><td><strong>Airflow + SQL<\/strong><\/td><td>Fully manual implementation<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Many enterprise-grade tools like Informatica offer built-in support for Type 1 and Type 2 SCDs. These features become even more effective when paired with platforms like <a href=\"https:\/\/www.cmarix.com\/blog\/key-benefits-of-google-bigquery-to-manage-serverless-data-warehousing\/\">Google BigQuery<\/a>, a leading solution for serverless data warehousing and scalable ETL workflows.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Final Takeaway<\/h2>\n\n\n\n<p><strong>Handling slowly changing dimensions is essential when:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You need historical tracking<\/li>\n\n\n\n<li>You build dashboards comparing current vs. past attributes<\/li>\n\n\n\n<li>Your business must comply with audit requirements<\/li>\n<\/ul>\n\n\n\n<p>Use <strong>Type 2 SCD<\/strong> to maintain history. Use <strong>Type 1<\/strong> for simple corrections. And always design your ETL pipelines with the appropriate SCD strategy based on the <strong>business need<\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Slowly changing dimension is a concept in data warehousing, where the attribute values in a dimension table change over-time but gradually. A real-world example of this can be how a customer could want to change their address, or an employee might need to switch departments. In such situations, you need to assess: Do I overwrite [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":2056,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[157,162],"tags":[],"class_list":["post-2029","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-engineering","category-etl"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/posts\/2029","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/comments?post=2029"}],"version-history":[{"count":2,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/posts\/2029\/revisions"}],"predecessor-version":[{"id":2031,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/posts\/2029\/revisions\/2031"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/media\/2056"}],"wp:attachment":[{"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/media?parent=2029"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/categories?post=2029"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cmarix.com\/qanda\/wp-json\/wp\/v2\/tags?post=2029"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}