We feed data from various systems in our DW using ...
# getting-started
w
We feed data from various systems in our DW using taps. We then use dbt to combine it all and feed it into BI tools, Salesforce, etc. The "historical" data in the source systems can change over time (things get backdated) or attributes get changed causing something to be bucketed one way when we ran our 6/30 reporting but bucketed a different way when we ran our 7/31 reporting. I plan on using dbt snapshots to track some of the changes but we would also like to be able to compare our 7/31 reporting if ran today against our 7/31 reporting when we ran it on 8/1. I think a separate schema (e.g. snapshots) where we can copy the output of a view into (INSERT INTO snapshots.financial_data (col1, col2) SELECT FROM finance_mart.financial_data) would be what I am looking for here. What would be the best way to accomplish that? Am I thinking about the problem in the right way?
u
Snapshots sounds like a reasonable way to go. This might be more of a dbt question then. It sounds like a classic slowly changing dimension from your description