What's the best practice for setup reproducibility? Meltano #getting-started


simon_podhajsky

10/29/2022, 11:13 PM

What's the best practice for setup reproducibility (e.g. the `meltano add ...` bits)? Do you just make a Makefile and put up a phony `setup` target for them that you instruct people to run? Or is there some configuration-centric way that allows Meltano to surmise that it needs whatever taps are involved in the invoked steps?

simon_podhajsky

10/29/2022, 11:19 PM

(This is primarily for local runs of Meltano - I imagine for containerized deployment, you'd bake it into the image?)

alexander_butler

10/30/2022, 3:53 PM

Meltano should be in a git repo. When you run `meltano add`, it updates your YAML files and lock files. Just commit your changes. Anyone else can clone the repo and run `meltano install`.
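The workflow described above could look roughly like the following sketch. The two helper functions and their names are hypothetical, written only to illustrate the steps; the sketch also assumes the project keeps its plugin definitions in `meltano.yml` and its lock files under `plugins/`, which may differ in your project layout.

```shell
# Hypothetical helper for the person adding a plugin: run `meltano add`,
# then commit the files it changed. Assumes the default layout where
# plugin definitions live in meltano.yml and lock files under plugins/.
add_and_commit_plugin() {
  plugin_type="$1"   # e.g. extractor or loader
  plugin_name="$2"   # e.g. tap-github (example name only)
  meltano add "$plugin_type" "$plugin_name" &&
    git add meltano.yml plugins/ &&
    git commit -m "Add $plugin_type $plugin_name"
}

# Hypothetical helper for everyone else: clone the repo, then install
# whatever plugins the committed configuration declares.
clone_and_install() {
  repo_url="$1"
  target_dir="$2"
  git clone "$repo_url" "$target_dir" &&
    (cd "$target_dir" && meltano install)
}
```

For example, `add_and_commit_plugin extractor tap-github` would be run once by whoever introduces the tap, and `clone_and_install <repo-url> my-project` by anyone setting up a fresh checkout. No Makefile target is needed for the plugin list itself, since `meltano install` reads it from the committed configuration.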

Sven Balnojan

11/01/2022, 12:11 PM

Adding my two cents:
1. To enforce this, you can add a CI test that runs `meltano install` inside your favourite Meltano Docker container.
2. This will also help to pin, in code, the Meltano and Python versions used to run your data dev environment.
I usually try to go all in and only use dockerized Meltano (e.g. using a Docker wrapper called batect, as in here: https://github.com/sbalnojan/meltano-example-elt/blob/main/batect.yml). Using both strategies, you should be able to keep the whole data team on the right Meltano version, and quickly tell when a new Meltano/Python version might break because of an out-of-date tap/target.
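A CI check along these lines might be sketched as follows. This is an illustration under assumptions, not the setup from the linked repository: the function name and the `MELTANO_IMAGE` variable are hypothetical, and the sketch assumes the official `meltano/meltano` image, whose entrypoint is the `meltano` command.

```shell
# Image to test against. Pinning a specific version tag instead of "latest"
# is what fixes the Meltano and Python versions; the default here is only
# a placeholder.
MELTANO_IMAGE="${MELTANO_IMAGE:-meltano/meltano:latest}"

# Hypothetical helper: run `meltano install` inside the container against
# the project in the current directory. A non-zero exit status from the
# container fails the CI job.
ci_check_install() {
  docker run --rm \
    -v "$(pwd)":/project \
    -w /project \
    "$MELTANO_IMAGE" install
}
```

Calling `ci_check_install` from the project root in a CI step would surface a tap or target that no longer installs under the pinned image before anyone hits the problem locally.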
