# plugins-general
l
Hello community! For those who have mastered target-snowflake: how do I drop a table (in the Meltano pipeline) before pushing a new one? I want to do this every time I run the target: drop the existing table and load a new one with the day's data. It can be any variant of target-snowflake.
p
@luis_henrique_barbosa_da_silva I'm actually not aware of any variants that will do this for you yet. I created an issue in the SDK repo to support different sync modes, one of which could be truncate-and-load like you're looking for. There's a variety of ways that people achieve this today with a bit of work. Usually it's done by using the `add_metadata_columns` setting, which adds sync timestamps to the rows, and then transforming with dbt to stage only the latest set of data.
Here's a similar thread with some helpful comments: https://meltano.slack.com/archives/C013EKWA2Q1/p1663363416288579
l
Implementing the dbt Snowflake transformer just to issue a single "drop table" command didn't seem that easy to me. Am I mistaken? Could you suggest the steps?
p
The idea isn't so much that you'd use dbt to run a drop table command, but rather to create a view off your raw target data that only shows the most recent data. So when you select from the view, it's as if you only have the latest data. Most people are already using the pattern of writing raw tap/target output to Snowflake and then staging it by deduplicating/renaming/obfuscating/etc. with dbt (or any SQL transformation tool) before exposing it to downstream consumers of that data.
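As a rough sketch of that staging view, assuming the target wrote a raw table `raw.orders` with metadata columns enabled and that `order_id` is the primary key (both names are hypothetical):

```sql
-- Hypothetical dbt-style staging view over the raw target table.
-- Keeps only the most recently extracted row per primary key, so
-- selecting from the view behaves like "only the latest data".
create or replace view analytics.stg_orders as
select *
from raw.orders
qualify row_number() over (
    partition by order_id
    order by _sdc_extracted_at desc
) = 1;
```

Downstream consumers then query `analytics.stg_orders` instead of the raw table, and nothing ever needs to be dropped.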
l
Here we run dbt after the EL step for the drops we need. The problem is that I have the same case as Nicholas (in the thread you linked above), and the solution there was to use dbt as a transformer in Meltano. That's why I commented above.
@felipe_souto_campelo Look here
a
@luis_henrique_barbosa_da_silva re: using dbt as a transform to run the drop: that's non-atomic, so if your loader fails you have a critical breakdown... I solve for this in target-bigquery via a config called `overwrite` that supports pattern matching stream names with `fnmatch`. I also have a config called `upsert`; same concept, using the `primary_keys` sent with the schema message. The default is append. If you are going to use dbt to run the drop, you should write your table to a staging table, then within a single transaction execute `create or replace table ... as select * from stage.table` or the equivalent DDL.
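A minimal sketch of that pattern in Snowflake SQL, assuming the loader writes into a hypothetical `stage.orders` table and consumers read from `prod.orders`:

```sql
-- The pipeline loads the day's data into stage.orders first.
-- create or replace table is a single atomic statement in Snowflake,
-- so readers never see a dropped or half-loaded prod.orders even if
-- the upstream load fails before this step runs.
create or replace table prod.orders as
select * from stage.orders;
```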
p
FYI, there's a new feature in SDK v0.31.0 that adds a metadata column that's static for the entire sync. This solves the problem you might have run into when deduplicating with the `_sdc_extracted_at` column.
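For example, assuming the static column is named `_sdc_sync_started_at` (check the metadata column names for your SDK version), a staging view can keep just the rows written by the latest sync instead of deduplicating on extraction timestamps:

```sql
-- Hypothetical staging view: keep only rows from the most recent sync.
-- Assumes _sdc_sync_started_at is constant for every row of a given
-- sync run, so filtering on its max value selects one complete load.
create or replace view analytics.stg_orders_latest as
select *
from raw.orders
where _sdc_sync_started_at = (
    select max(_sdc_sync_started_at) from raw.orders
);
```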
I also have a pending PR related to standardizing the load methods that targets have, i.e. insert/upsert/overwrite: https://github.com/meltano/sdk/pull/1893
Feel free to leave comments!