Do people notice much difference using target DBs that are v Meltano #troubleshooting


Andy Carter

04/04/2023, 1:57 PM

Do people notice much difference using target DBs that are very PK/OLTP oriented (Postgres, MSSQL) vs something more analytics oriented (BigQuery/Snowflake) where PK enforcement is more relaxed? With quite a few taps I have found issues with duplicate primary keys that prevent upserts with MSSQL, but that would probably work fine with BigQuery etc. To resolve this, I am specifying `key-properties` quite often, or in some cases forking the repo and modifying the fields designated as PK to get to green. I'm sure these issues are just due to the original tap devs not having access to a wider range of targets to test on; if I were developing a new tap I'd probably just get it working on my production target of choice and move on. I am coming to the realisation that testing with a fairly liberal target like `duckdb` and getting it working doesn't necessarily mean that `target-mssql` will work; you have to test and iterate on the exact tap and target combo to be 100% sure.
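One way to spot this kind of problem before a strict target rejects it is to scan the tap's raw output for repeated key values. The following is only a minimal sketch, assuming the tap emits standard Singer SCHEMA and RECORD messages as JSON lines; the function name `find_duplicate_keys` and the sample stream are hypothetical and not part of Meltano or any tap. Note that a repeated key within one run is not always a bug (a tap may legitimately emit an updated row twice), so the result is a hint about where a strict target might choke, not a verdict.

```python
import json
from collections import Counter, defaultdict


def find_duplicate_keys(lines):
    """Return {stream: {key_tuple: count}} for key values seen more than once.

    `lines` is any iterable of Singer messages serialized as JSON strings.
    """
    key_properties = {}          # stream name -> key fields declared in SCHEMA
    seen = defaultdict(Counter)  # stream name -> how often each key tuple appears

    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg.get("type") == "SCHEMA":
            key_properties[msg["stream"]] = msg.get("key_properties") or []
        elif msg.get("type") == "RECORD":
            keys = key_properties.get(msg["stream"], [])
            if keys:  # streams without declared keys cannot be checked
                key = tuple(msg["record"].get(k) for k in keys)
                seen[msg["stream"]][key] += 1

    return {
        stream: {key: n for key, n in counts.items() if n > 1}
        for stream, counts in seen.items()
        if any(n > 1 for n in counts.values())
    }


# Hypothetical sample output from a tap: the "users" stream repeats id 1.
sample = [
    '{"type": "SCHEMA", "stream": "users", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "a"}}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "b"}}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "c"}}',
]
print(find_duplicate_keys(sample))  # {'users': {(1,): 2}}
```

To check a real tap, its output could be saved to a file first and the open file passed in place of `sample`, since a file object iterates line by line.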

jan_soubusta

04/04/2023, 2:15 PM

Interesting. I always test against PostgreSQL locally, and it proves whether the referential integrity is OK. Only then, in an upstream pipeline, do I deliver to Snowflake/BQ/Redshift/.... Definitely no need to test against more OLTP DBs. PostgreSQL is the easiest to bootstrap (at least for me, running in Docker).

Andy Carter

04/04/2023, 2:17 PM

You're in the reverse situation to me by the sound of it 🙂

jan_soubusta

04/04/2023, 2:21 PM

Well, not sure. What I wanted to say is that if I were the developer of a tap, I would test it against a PostgreSQL target first and then against an OLAP DB, ideally one with a free Docker image, e.g. Vertica or Greenplum. If all tap owners did it this way, your life would be easier, am I right?

Andy Carter

04/04/2023, 2:27 PM

Yes, that would help a great deal! I think maybe there is such a vast array of targets (and growing daily) that expecting perfect integration with all of them is too much. And for the majority of people the tap 'just works' with BigQuery etc. I think I was excited by the idea of being able to swap out targets pretty seamlessly, and it's not quite as easy as I imagined it would be. Don't get me wrong, it's still a vast improvement over the alternative!
