Hey, I am building custom images for meltano, dbt,...
# docker
j
Hey, I am building custom images for Meltano, dbt, and the GoodData client. The Meltano build takes way longer than the others, and the resulting image is at least 2x bigger than the others. I am thinking about creating an optimized Dockerfile that builds a base Meltano image (without plugins, ...). Could you point me to the Dockerfile you currently use to build images which you push to Dockerhub? Have you tried to use smaller base (OS) image?
e
Hey Jan!
Could you point me to the Dockerfile you currently use to build images which you push to Dockerhub?
File: https://github.com/meltano/meltano/blob/cab58123544c2e7f7b6550f5a601c749c062e9ad/docker/meltano/Dockerfile
Workflow: https://github.com/meltano/meltano/blob/cab58123544c2e7f7b6550f5a601c749c062e9ad/.github/workflows/docker_publish.yml
Have you tried to use smaller base (OS) image?
Yup, but the biggest problem is that Meltano needs to be able to install rather arbitrary packages at runtime, and we want to make the images as useful as possible to the most people, so they ship with a few build dependencies, e.g. for building mssql or postgres clients.
j
I see. What I am thinking about is creating an alternative slim image, which would help only a subset of use cases but would be very valuable for them. I will be out of office next week, but then I will try to contribute here.
e
m
also very interested in this 👀
j
I have designed tens of Dockerfiles, and it's often tricky. For instance, I recently found out that the Python slim image does not contain the make command, so I cannot use my Makefile 🤦 But it must be feasible to build a slim image for Meltano. Look at my demo pipeline (screenshot). And it's not only about the build but also about the pull in execution jobs. You can cache Docker layers, but not always: in my case I use free GitHub workers (open-source repo), and such caching does not work there because you don't have dedicated workers...
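For reference, a minimal sketch of what such a slim image could look like, using a multi-stage build. This is not the official Meltano Dockerfile linked above. The Python version, the `/venv` and `/project` paths, and the choice of `build-essential` are assumptions for illustration, and it only fits the subset of use cases where every plugin installs from prebuilt wheels.

```dockerfile
# Sketch only: NOT the official Meltano Dockerfile.
# Assumption: Python 3.11 is an acceptable version for the project.
ARG PYTHON_VERSION=3.11

# Stage 1: build stage with a compiler toolchain.
# build-essential also brings in make, which the slim image lacks.
FROM python:${PYTHON_VERSION}-slim AS builder
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*
# Install Meltano into an isolated virtualenv (hypothetical path /venv).
RUN python -m venv /venv
ENV PATH="/venv/bin:${PATH}"
RUN pip install --no-cache-dir meltano

# Stage 2: runtime stage, a clean slim image without the toolchain.
FROM python:${PYTHON_VERSION}-slim
# Copy only the virtualenv; the build dependencies stay behind in stage 1.
COPY --from=builder /venv /venv
ENV PATH="/venv/bin:${PATH}"
# Hypothetical project directory, expected to be mounted or copied in.
WORKDIR /project
ENTRYPOINT ["meltano"]
```

The trade-off is the one described earlier in the thread: the final stage has no compiler, so any plugin that has to build native extensions at runtime (for example mssql or postgres clients) would fail to install, and plugins installed from a Git URL would likely also need `git` added to the runtime stage.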