hey folks, i'm wondering what the community feels about this | Meltano #best-practices

haleemur_ali

04/30/2024, 8:43 PM

hey folks, i'm wondering what the community feels about this scenario:
• tap 1 is refreshed every day at midnight
• tap 2 is refreshed every 4 hours
• there are some downstream transformations depending on both tap 1 & tap 2's tables.

when / how often do you trigger downstream transformations that depend on data from both taps? what questions do you ask (what checks do you run) prior to triggering the downstream transformation that uses both tap 1 & tap 2? I'm expecting this question may generate some opinionated answers, and I definitely welcome any experience / anecdote you would be comfortable sharing ❤️ thanks in advance!

aaron_phethean

05/01/2024, 6:05 AM

How long do each tap and the models take to run? If we assume tap1 is ok to run more often, I would consider the following:
tap1 -> models
tap2 -> tap1 -> models

haleemur_ali

05/01/2024, 4:11 PM

Thanks Aaron!

haleemur_ali

05/01/2024, 4:16 PM

I am wondering if anyone has built any strategies around data freshness. something like checking the freshness for tap 1 & tap 2, and then building the common models, and what the pros / cons of this approach might be.
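A minimal sketch of what such a freshness gate could look like, in case it helps frame the discussion. Everything here is an assumption for illustration: the names `MAX_AGE`, `stale_sources` and `should_build` are hypothetical, the thresholds simply add an hour of slack to the schedules described above, and the last-loaded timestamps are assumed to come from wherever the loads are recorded (for example a max of a load-timestamp column).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds: tap 1 refreshes daily, tap 2 every 4 hours.
# The extra hour of slack on each is an assumption, not a recommendation.
MAX_AGE = {
    "tap1": timedelta(hours=25),
    "tap2": timedelta(hours=5),
}


def stale_sources(last_loaded, now, max_age=MAX_AGE):
    """Return the sorted names of sources whose latest load is too old.

    A source with no recorded load at all is treated as stale.
    """
    stale = []
    for name, limit in max_age.items():
        loaded_at = last_loaded.get(name)
        if loaded_at is None or now - loaded_at > limit:
            stale.append(name)
    return sorted(stale)


def should_build(last_loaded, now, max_age=MAX_AGE):
    """Build the common models only when every source is fresh enough."""
    return not stale_sources(last_loaded, now, max_age)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical timestamps standing in for values read from the warehouse.
    loads = {"tap1": now - timedelta(hours=12), "tap2": now - timedelta(hours=6)}
    print(stale_sources(loads, now), should_build(loads, now))
```

One trade-off with this kind of gate: the common models are skipped whenever either source is late, so a single delayed tap can hold back models that would otherwise be fine to rebuild on slightly older data.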

aaron_phethean

05/01/2024, 5:37 PM

Isn’t this the use case for incremental dbt models? Alternatively, would using pipeline stages as the check and conditionally triggering the models work?

Shubham Kawade

07/22/2024, 9:20 AM

You could use source freshness in dbt to check if the source has been updated (by meltano or something else) and then go ahead with your scheduled pipeline
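A rough sketch of how that gate might be wired up from a scheduler script. It assumes the dbt sources already have `freshness` and `loaded_at_field` configured, and that `dbt source freshness` exits non-zero when a source breaches its error threshold. The function name `run_if_fresh` and the injectable `run` parameter are hypothetical and only there to keep the example testable.

```python
import subprocess


def run_if_fresh(run=subprocess.run):
    """Run `dbt source freshness`, then `dbt build` only if the check passed.

    `run` defaults to subprocess.run; it is injectable (an assumption of this
    sketch) so the flow can be exercised without dbt installed.
    """
    check = run(["dbt", "source", "freshness"])
    if check.returncode != 0:
        # At least one source is stale (or the check itself failed): skip.
        return False
    build = run(["dbt", "build"])
    return build.returncode == 0
```

Calling `run_if_fresh()` from the 4-hourly schedule would then rebuild the shared models only when dbt considers both sources fresh.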
