# getting-started
s
I have a question: I am using target-postgres to load data into a staging table in a PostgreSQL database, and dbt-postgres to transform the data in the staging table and store the transformed data in a final table. Since the data loading is incremental, the staging table will grow bigger and bigger. What is the best practice for cleaning up the staging table?
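The setup described in the question could look roughly like the following dbt incremental model. This is only a sketch: the source name `raw`, table `staging_events`, and the `id`/`payload`/`updated_at` columns are all assumptions, not names from the thread.

```sql
-- models/final_table.sql
-- A minimal incremental model reading from the landing table.
-- Table and column names here are hypothetical.
{{ config(materialized='incremental', unique_key='id') }}

select
    id,
    payload,
    updated_at
from {{ source('raw', 'staging_events') }}

{% if is_incremental() %}
  -- On incremental runs, only pick up rows newer than what is
  -- already in the final table.
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

With `materialized='incremental'`, dbt only appends or merges new rows on each run, which is why the staging (landing) table keeps growing independently of the final table.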
h
I think most people consider the table where data lands to be the "raw" data, which is important to keep. dbt takes this philosophy to heart: if you do a full run with `--full-refresh`, all tables except sources and snapshots are replaced. This goes against some ways of thinking in the database world (I know, strong opinions and all that), but since dbt is quite opinionated I'd argue you should go with the flow and just accept that the landing table will grow. If you use snapshots as the "core" of your warehouse and you still want to delete landing data, I think it makes sense to do it at the end of each run (and only if the run is successful). But I don't think there is any built-in support for automatically truncating landing tables.
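If you do decide to prune the landing table at the end of each run, one place to hang that is dbt's `on-run-end` hook. A sketch, assuming the landing table is `raw.staging_events` with an `updated_at` column and that a 7-day retention window is acceptable (both are assumptions, not from the thread):

```yaml
# dbt_project.yml -- a sketch, not tested against any real project.
on-run-end:
  # Delete landing rows older than 7 days; the retention window is arbitrary.
  - "delete from raw.staging_events where updated_at < now() - interval '7 days'"
```

To make this fire only when the whole run succeeded, you could wrap the statement in a Jinja check on the `results` context variable that dbt exposes to `on-run-end` hooks, or run the cleanup from your orchestrator only after dbt exits with status 0.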
p
Yeah, the common argument is that storage is usually cheap, and if you find a bug in your transformation or want to change it, you'll need to start from scratch and re-replicate the data. In many use cases the transformations are easy to rerun, whereas the replication step is a much bigger lift.
s
got it, thanks.