Has anyone taken a Meltano tap + target combinatio...
# plugins-general
m
Has anyone taken a Meltano tap + target combination and used it as a “long-lived application”? I’m interested in doing this to have a more “real-time” stream of data from a source. I’m basically wondering if I could put a
while True:
loop around things. Anyone ever done anything like this?
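For illustration, the loop idea could look roughly like the sketch below. It is a minimal sketch, not an established Meltano pattern: `tap-example` and `target-example` are hypothetical plugin names, and the helpers `run_once` and `run_forever` are invented here. It simply re-invokes `meltano run` inside one long-lived process, so it would avoid container startup but still pays the tap startup cost on every pass.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence

# Hypothetical plugin names; substitute the tap and target from your meltano.yml.
MELTANO_CMD = ["meltano", "run", "tap-example", "target-example"]


def run_once(cmd: Sequence[str] = MELTANO_CMD) -> int:
    """Run one tap -> target sync and return the process exit code."""
    return subprocess.run(list(cmd)).returncode


def run_forever(
    sync: Callable[[], int] = run_once,
    pause_seconds: float = 5.0,
    max_iterations: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `sync` repeatedly and stop on the first non-zero exit code.

    With max_iterations=None this loops until a sync fails.
    """
    done = 0
    while max_iterations is None or done < max_iterations:
        code = sync()
        if code != 0:
            # Surface the failure instead of looping on a broken pipeline.
            return code
        done += 1
        sleep(pause_seconds)
    return 0


if __name__ == "__main__":
    raise SystemExit(run_forever())
```

Stopping on the first failure is a deliberate choice in this sketch: it lets the outer orchestrator (for example a Kubernetes restart policy) decide whether to restart the loop.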
r
I’ve yet to look at airflow and orchestration but it sounds like you’d want to schedule it to run every so often. Would that be sufficient for the data source you have in mind? Or does it need to be live?
m
Yeah, running it more frequently in our existing orchestrator (Argo Workflows in Kubernetes) is certainly an option, but I think that’d limit us to “every five minutes” or so (due to container startup time, tap startup time, etc.)
Wondering if there’s any established convention for using Meltano to get more of a “real-time streaming” process
d
I’ve tested a relatively long-running approach with LOG_BASED replication from Postgres and MongoDB. In my case Meltano runs for a couple of minutes, replicating (or waiting for new) data from the sources, then stops so the data can propagate downstream, and is then re-run. Since the downstream pipeline still works in batches, the approach just helps collect a new batch while the downstream is working on the previous one. The trickiest part is stopping Meltano when the previous batch is done.
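One way the "run for a couple of minutes, then stop" step might be sketched is below. This is an assumption-laden illustration, not the setup described above: `run_with_budget` is a hypothetical helper, the plugin names are placeholders, and whether Meltano and a given tap persist their state cleanly when sent SIGTERM is something to verify before relying on it.

```python
import signal
import subprocess
import sys
from typing import Sequence


def run_with_budget(
    cmd: Sequence[str],
    budget_seconds: float,
    grace_seconds: float = 30.0,
) -> int:
    """Run `cmd` for at most `budget_seconds`, then ask it to stop.

    Sends SIGTERM once the budget is used up and waits `grace_seconds`
    for the process to exit; kills it only if it does not exit in time.
    Returns the process exit code.
    """
    proc = subprocess.Popen(list(cmd))
    try:
        # Process finished on its own within the budget.
        return proc.wait(timeout=budget_seconds)
    except subprocess.TimeoutExpired:
        # Assumption: the child treats SIGTERM as a request to shut down.
        proc.send_signal(signal.SIGTERM)
        try:
            return proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()
            return proc.wait()


if __name__ == "__main__":
    # Hypothetical plugin names; replace with the ones in your project.
    sys.exit(run_with_budget(
        ["meltano", "run", "tap-example", "target-example"],
        budget_seconds=120,
    ))
```

A fixed time budget is the simplest stop condition; stopping exactly when the downstream batch finishes would need some signal from the downstream pipeline, which this sketch does not attempt.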
m
Interested. Sharing another small thread for context - https://meltano.slack.com/archives/C069CPBCVPV/p1705395655510039