# troubleshooting
s
I've found some old threads in the Slack space referring to `load_method: overwrite` for the `target-postgres` SDK, and it appears to be ineffective. I wrongly assumed that overwrite was happening by default, and I end up with duplicate data over the course of runs because the raw tables aren't being dropped. I've written some macros in dbt that drop the tables before the extract and load, but I'm thinking this is going to cause issues with `tap-spreadsheets-anywhere` and its state tracking. For example, if I drop the tables before the tap and the state hasn't changed for the files, I end up with no raw table. Is it possible to extract, drop the tables, and then do the load, or would the stream be lost? Essentially, do the tap and the target have to be run sequentially? This is what I currently have, but the raw tables will always be dropped, even if the state hasn't changed for `tap-spreadsheets-anywhere`.
```yaml
tasks:
  - dbt-postgres:drop-raw-tables
  - tap-spreadsheets-anywhere
  - target-postgres
  - dbt-postgres:select-ndo
```
This would drop the tables after extraction, but I'm not sure it's advised or would even work, considering the stream may be lost:
```yaml
- tap-spreadsheets-anywhere
- dbt-postgres:drop-raw-tables
- target-postgres
- dbt-postgres:select-ndo
```
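For context, here's roughly how I have the loader configured in `meltano.yml`. A minimal sketch; as I understand the target SDK docs, `load_method` accepts `append-only`, `upsert`, or `overwrite`:
```yaml
plugins:
  loaders:
    - name: target-postgres
      config:
        # the SDK setting I expected to replace the raw tables on
        # each run, but which appears to have no effect in practice
        load_method: overwrite
```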
Any ideas on the best solution here?
v
Have you tried it yet? (You have to run taps and targets sequentially, tap then target; that second setup can't run with `meltano run`.)
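Something like this should work with `meltano run` (a rough sketch; the job name is a placeholder):
```yaml
jobs:
  - name: spreadsheets-to-postgres   # placeholder job name
    tasks:
      # drop the raw tables first, then run the tap and target as a
      # single EL pair: the tap streams directly into the target, so
      # nothing can run between them
      - dbt-postgres:drop-raw-tables
      - tap-spreadsheets-anywhere target-postgres
      - dbt-postgres:select-ndo
```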
s
I haven't tried it, no, but that's what I figured based on the docs I have read.