Hello. How can i do full refresh with target-post...
# troubleshooting
f
Hello. How can i do full refresh with target-postgres? I need to truncate the table before inserting data. Has it a parameter to do that? If not, is there another way to truncate the table? Thanks.
v
If your Tap supports activate version then run the tap in Full table mode and you're done (with the Meltanolabs version I know this is supported) Otherwise I point folks to dbt, there's some examples out here somewhere I'll have to search slack a bit
f
@visch I already tried with meltanolabs version and it did not works for me. The full table mode only updates the existing data but the other records that are no longer in the source and should be deleted from Postgres remain.
v
What tap are you using?
f
@visch tap-spreadsheets-anywhere
a
@felipe_souto_campelo I have a forked variant of spreadsheets anywhere, and found this old PR: https://github.com/quickbi/tap-spreadsheets-anywhere/pull/2/files I'll try to adapt
I can already see an issue if you are using multiple csv files to one table/pattern matching via regex, the
modified_since
part will not work correctly, it will assume only files modified recently will get synced, and the new version number applied.
Basically with full_table mode the 'state' functionality becomes redundant.
@felipe_souto_campelo try this branch: https://github.com/acarter24/tap-spreadsheets-anywhere/tree/activate_version In my table config I'm setting
full_table_replace: true
and then using
hard_delete:true
for meltanolabs variant of
target-postgres
Seems to be working nicely, but need to do a bit more testing on my end.
f
@Andy Carter Awesome! It worked for me. But, unfortunately, now i need to make this run for target-snowflake. The destination database changed in my project. I tried with target-snowflake but it did not works. Is it possible to make a general modification in tap-spreadsheets-anywhere to do the correct full refresh with any target? Thanks for your help.
a
Hi Felipe, the activate version logic needs to be in both the tap and target to work correctly. With this change, the tap now has all the right logic. There would need to be a change to
target-snowflake
to support the
activate_version
message.
Could you handle the deduplication a stage downstream in dbt or similar? If you're pulling in the full table now with your tap, and have the
_sdc
columns, then do an append from your raw data, and use those
sdc_batched
values to identify the last complete run.