# random
a
I was thinking as I put together what feels like my 1000th staging dbt model yesterday: given that Meltano kind of 'standardises' the dbt sources created from a tap (consistent table schema definitions), the resulting dbt models (stg, int and mart) should be consistent too. So I could share my dbt models for my Freshdesk tap, and you could implement them right off the shelf. I feel like this is something that has probably been done already though? Shareable sets of dbt models, from staging to mart / dim and fct tables, for a particular API?
g
Probably along the lines of the repos Matatika share for some taps? https://github.com/Matatika/dbt-tap-github https://github.com/Matatika/dbt-tap-googleads I think Meltano have a few on their GitHub too, usually with the dbt-tap-* prefix: dbt packages in a repo with a naming convention 🤷
m
I was thinking about the idea of pairing a tap with a dbt package that includes staging models designed for it, but I feel like there are a lot of choices to make in the staging layer (naming conventions, type conventions, column selection and masking, etc.) that make it hard to settle on one universal pre-built staging model. There are still examples out there, like the dbt packages Fivetran offers for some of their integrations. Instead, I was thinking of creating a utility that parses the catalog from a Singer tap and creates a skeleton staging model that could then be customized (or even just used as is).
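Something like this is what I have in mind, as a rough sketch only. It assumes the standard Singer catalog layout (a top-level `streams` list where each entry has a `stream` name and a JSON `schema` with `properties`), and it assumes the target keeps stream and property names as table and column names, which depends on the target. The function names, the `stg_<source>__<stream>.sql` file naming, and the example catalog are all made up for illustration.

```python
def staging_sql(stream: dict, source_name: str) -> str:
    """Render a skeleton dbt staging model for one Singer stream."""
    # Fall back to tap_stream_id if the catalog entry has no "stream" key.
    table = stream.get("stream") or stream["tap_stream_id"]
    # Top-level schema properties become the selected columns (sorted for stable output).
    columns = sorted(stream.get("schema", {}).get("properties", {}))
    select_list = ",\n".join(f"        {col}" for col in columns) or "        *"
    return (
        "with source as (\n"
        # Quadrupled braces render as the literal {{ }} that dbt's Jinja expects.
        f"    select * from {{{{ source('{source_name}', '{table}') }}}}\n"
        "),\n\n"
        "renamed as (\n"
        "    select\n"
        f"{select_list}\n"
        "    from source\n"
        ")\n\n"
        "select * from renamed\n"
    )


def build_staging_models(catalog: dict, source_name: str) -> dict:
    """Map a hypothetical file name to skeleton SQL for every stream in the catalog."""
    models = {}
    for stream in catalog.get("streams", []):
        table = stream.get("stream") or stream["tap_stream_id"]
        models[f"stg_{source_name}__{table}.sql"] = staging_sql(stream, source_name)
    return models


# Hypothetical, trimmed-down catalog; a real one would come from the tap's discovery output.
EXAMPLE_CATALOG = {
    "streams": [
        {
            "tap_stream_id": "tickets",
            "stream": "tickets",
            "schema": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "subject": {"type": ["string", "null"]},
                    "created_at": {"type": "string", "format": "date-time"},
                },
            },
        }
    ]
}

if __name__ == "__main__":
    for name, sql in build_staging_models(EXAMPLE_CATALOG, "tap_freshdesk").items():
        print(f"-- {name}\n{sql}")
```

The output is deliberately just a starting point: renames, casts, column selection and masking would all still be edited by hand, which is exactly the part that is hard to standardise.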
t
heh - so we used to do this a few years ago. The `meltano elt` command in particular was set up so that it could automatically run the dbt package for you and populate the appropriate env vars etc. As Mark alludes to, the challenge is a lot of the incidental complexity: target choice, stream selection, quirks of your own upstream system, etc. that make the dbt model pretty fragile. Fivetran is able to do it b/c they control both ends of the pipeline pretty tightly, so the surface area for incidental complexity is minimal.
a
Makes sense, knew someone would have already thought of it (if not implemented!). Nice to provide them with your tap as a good starting point though, brownie points to Matatika.
t
The Matatika folks are great 😄
a
Agree with everything said here 😂 After building quite a few of these, it feels like a good template pattern, and pretty good for delivering standard reports and analytics. Self-service BI works OK on top of these. They can even be upgraded. Once the complexity grows, though, I don't feel like there's a clever way to handle the schema changes that flow down, custom / user-defined fields for example. Not yet anyway.
e
Interesting topic, miss the `meltano elt` glory days. cc: @neil_gorman (some overlap w/ custom schema changes)
a
Another little step in this 'standard models' direction - documentation should be easy to publish too. https://meltano.slack.com/archives/C04632W0HT2/p1692625402027739