Can I write two Meltano streams to the same target? #getting-started

Matt Menzenski

02/23/2024, 12:30 PM

Can I write two Meltano streams to the same target? I have a tap that I run two instances of, one in log-based replication mode and one in incremental replication mode. The schema is the same in both modes. Are there any reasons why I can’t (or shouldn’t) have both of those streams written to the same table in my target?

haleemur_ali

02/23/2024, 4:38 PM

Not trying to detract here. I don't think we have enough context yet. I'm curious how you would intend to manage state and duplication for both of these. I'm also curious what issue you're trying to solve for here.

peter_s

02/23/2024, 9:51 PM

I do this. I have two streams which are convenient to have in the same target table. I have a separate task for each of them, each task having a different tap but the same target. Meltano maintains state on a per-task basis, so state is properly managed in this setup. In my case there’s no duplication problem to worry about, because the streams are distinct data sources. But I agree that it sounds like duplication could be an issue in your case.

👍 1

Matt Menzenski

02/23/2024, 11:40 PM

I’m already going to have “duplicate” records from the log-based tap, so the consumers of my table are already getting the latest version of each record.
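A minimal sketch of the "latest version of each record" read pattern described above, assuming every row carries a primary key and a monotonically increasing version column. The names `id`, `updated_at`, and `source` are hypothetical, not taken from the thread; in practice this logic would more likely live in a dbt model or a view than in Python.

```python
from typing import Any


def latest_per_key(
    rows: list[dict[str, Any]],
    key: str = "id",             # hypothetical primary-key column
    version: str = "updated_at"  # hypothetical ordering column
) -> list[dict[str, Any]]:
    """Keep only the newest row for each key value.

    On a tie in the version column, the row seen first wins.
    """
    latest: dict[Any, dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[version] > current[version]:
            latest[row[key]] = row
    return list(latest.values())


# Rows as they might land in one shared table from both tap instances.
rows = [
    {"id": 1, "updated_at": "2024-02-01", "status": "new", "source": "incremental"},
    {"id": 1, "updated_at": "2024-02-03", "status": "paid", "source": "log_based"},
    {"id": 2, "updated_at": "2024-02-02", "status": "new", "source": "log_based"},
]

deduplicated = latest_per_key(rows)
```

Because the selection depends only on the key and the version column, it does not matter which of the two streams wrote a given row.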

Matt Menzenski

02/24/2024, 12:04 AM

“I'm also curious what issue you're trying to solve for here.”

Mostly it is “make downstream dbt models much less complicated”. Today they need to union these two tables.

peter_s

02/24/2024, 1:10 AM

Same here

haleemur_ali

02/24/2024, 2:46 AM

I can see how this benefits your use case, but it still feels like an antipattern that leaves you open to a pipeline failure. What I'm struggling to get on board with is that you have two separate sources feeding into the same schema. Granted, my opinion is colored by my own experience and can't necessarily be generalized as a point against your approach. It might happen that the schema for one gets updated while the other does not or, as happened to me, the schema for one data source gets rolled back. In my example (an internal legacy ETL system not written in Meltano), there are 4 supposedly identical databases that first get united before further transformation. This union is done mostly via dbt macros but, importantly, the fields are statically set.
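One way to guard against the schema drift described above is to compare the two streams' schemas before letting both write to the shared table. The sketch below assumes each schema is a JSON Schema object with a top-level `properties` map, as carried in Singer SCHEMA messages. The two example schemas and the `schema_drift` helper are hypothetical.

```python
from typing import Any


def schema_drift(
    schema_a: dict[str, Any], schema_b: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return the properties whose definitions differ between two schemas.

    A property present in only one schema is reported with None on the other side.
    """
    props_a = schema_a.get("properties", {})
    props_b = schema_b.get("properties", {})
    drift: dict[str, tuple[Any, Any]] = {}
    for name in sorted(props_a.keys() | props_b.keys()):
        a, b = props_a.get(name), props_b.get(name)
        if a != b:
            drift[name] = (a, b)
    return drift


# Hypothetical schemas for the two tap instances.
log_based_schema = {
    "properties": {
        "id": {"type": "integer"},
        "status": {"type": ["string", "null"]},
    }
}
incremental_schema = {
    "properties": {
        "id": {"type": "integer"},
        "status": {"type": ["string", "null"]},
        "discount": {"type": ["number", "null"]},  # column added on one side only
    }
}

drift = schema_drift(log_based_schema, incremental_schema)
if drift:
    print(f"Schemas differ on: {', '.join(drift)}")
```

A check like this could run before the pipeline and fail loudly, rather than letting one stream silently change the shape of the shared table.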

peter_s

02/24/2024, 4:33 AM

Another potential downside is that it can be harder to write dbt tests if they need to have differing logic for the two sources.

💯 2
