# best-practices
j
Hi all, new Meltano user here. What’s the best approach to optimize `meltano install` to cache taps & targets? I’m containerizing Meltano with the default image, and this is the step that always takes the longest. It wasn’t immediately clear how to optimize it.
v
Creating a container doesn't need to be crazy fast right? 1-2 min or so is fine, do you need it to be faster?
j
In short: Yes. Nominally at first glance it’s not a huge difference but it adds up quickly.
v
Can you share your use case? I'm probably thinking about a different use case.
j
So anytime I run a `docker build` and the `meltano install` step runs, it takes the same time and reinstalls all plugins from scratch, for example rebuilding dbt every single time. With plain Docker I would know how to avoid that: build a base image which holds all base requirements and use it in a downstream image with the changing configuration. As `meltano install` abstracts this layer away from my Dockerfile, my use case is to replicate the above process in Meltano to decrease build time as much as possible.
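A minimal sketch of the base-image split described above, assuming a base image that already ran `meltano install` was built separately and pushed under the placeholder name `registry.example.com/meltano-base:latest` (hypothetical, not from this thread). Whether the second `meltano install` reuses the plugin environments already in the base image, rather than rebuilding them, depends on the Meltano version and is an assumption to verify:

```dockerfile
# Downstream image: rebuilt whenever meltano.yml or other project files change.
# The base image name is a placeholder; it is assumed to contain a Meltano
# project in /project whose plugins were installed with `meltano install`.
FROM registry.example.com/meltano-base:latest

WORKDIR /project

# Copy the current project files over the ones in the base image.
# Assumes .meltano/ is listed in .dockerignore so the plugin
# environments from the base image are not overwritten.
COPY . .

# Install whatever the updated meltano.yml adds; assumed (not confirmed here)
# to be quick when the plugins already exist in .meltano/.
RUN meltano install

ENTRYPOINT ["meltano"]
```

The base image would only need rebuilding when plugin definitions such as the dbt transformer change, which is the slow part in this scenario.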
v
My point is: why does that matter? But I'll leave it to others for recommendations on reducing the time. Running with `--log-level=debug` may help point to the commands that are running, so you can look at ways to reduce the time they take.
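For reference, a sketch of how the debug flag could be applied to the install step inside the Dockerfile. It assumes the project Dockerfile already has a `RUN meltano install` line and only changes that line temporarily while investigating:

```dockerfile
# Temporary, for diagnosis only: print debug-level logs during plugin
# installation so the slow commands show up in the build output.
RUN meltano --log-level=debug install
```

Building with `docker build --progress=plain .` keeps the full step output visible when BuildKit is in use.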
t
@jan_kyri once you’ve containerized Meltano, you shouldn’t need to rerun `meltano install`. We build the full container here: https://gitlab.com/meltano/squared/-/blob/master/deploy/meltano/Dockerfile.meltano and then just use that as is from the GitLab container registry.
j
Hi @taylor, that is quite clear to me. Staying with your example though, building this Dockerfile builds all `meltano install` dependencies from scratch. In our case, this includes dbt core, for example. This is a lot of idle waiting for dbt core to compile for the nth time although no transformer config was actually changed. Can these packages be cached independently of the `meltano.yml` contents? Or is there another way to inject them in the Dockerfile other than `meltano install`?
t
> waiting for dbt core to compile for the nth time although no transformer config was actually changed

I guess I don’t understand the workflow here. Why are you having to compile it so much?
j
Because the contents of `meltano.yml` changed, which means running another `docker build`, and in doing so `meltano install` installs the dependencies seemingly from scratch.
t
Oh I see, sorry. I’m not sure how to pull from a cache for parts of `meltano install`. It might warrant a new issue. Any thoughts on this?
e
@jan_kyri I wonder if you could use something like what's described in https://stackoverflow.com/a/58021389/5535114 to cache `.meltano/`
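A hedged sketch of that idea using a BuildKit cache mount. Note the swap: this caches pip's download and wheel cache rather than `.meltano/` itself, because the contents of a cache mount are not stored in the final image, while the installed plugins have to be. It assumes BuildKit is enabled, pip runs as root with its default cache directory `/root/.cache/pip`, and the Dockerfile otherwise follows the usual Meltano project layout. If the image disables pip's cache (for example through a `PIP_NO_CACHE_DIR` setting), the mount has no effect until that is overridden.

```dockerfile
# syntax=docker/dockerfile:1
ARG MELTANO_IMAGE=meltano/meltano:latest
FROM $MELTANO_IMAGE

WORKDIR /project

# Copy the Meltano project (meltano.yml and related files) into the image.
COPY . .

# Mount a persistent build cache at pip's cache directory (assumed path).
# Wheels downloaded or built in an earlier build can then be reused even
# when a change to meltano.yml invalidates this layer.
RUN --mount=type=cache,target=/root/.cache/pip meltano install

ENTRYPOINT ["meltano"]
```

With this layout the plugin environments are still recreated when the layer is invalidated, but packages that were already fetched or built should come from the cache instead of being rebuilt.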
a
@jan_kyri - I've logged some options, including the one noted by @edgar_ramirez_mondragon, in this new issue: "Reduce build times in CI with pip, docker, or venv-level caching" (#3434, Meltano / Meltano on GitLab). Would love to hear your thoughts and track what works / doesn't work for future users to reference as well.