# docker
d
@al_whatmough Thanks for bringing this up; the Docker image is definitely larger than it ideally would be. The `meltano/meltano` image (https://gitlab.com/meltano/meltano/-/blob/master/docker/prod/Dockerfile) is based on `meltano/meltano/base` (https://gitlab.com/meltano/meltano/-/blob/master/docker/base/Dockerfile), which in turn is based on `python:3.6`, which weighs in at a whopping 337MB: https://hub.docker.com/layers/python/library/python/3.6/images/sha256-40aa5d1d9e758e8287806d7238667883b047eaa779e137b491d3e0f0fa1ef064?context=explore Per https://hub.docker.com/layers/meltano/meltano/latest/images/sha256-665e5d3eca98c6faec3af5332ef9de763214e28b4394645ff21d8e9856634501?context=explore, the next 118MB are coming from Meltano's own `requirements.txt`. We could see which of those requirements take up most space and whether we actually need them, but obviously the most significant impact would have to come from using something other than `python:3.6`. https://blog.realkinetic.com/building-minimal-docker-containers-for-python-applications-37d0272c52f3 suggests using the much smaller Alpine images instead, but https://pythonspeed.com/articles/alpine-docker-python/ discusses some downsides, and https://pythonspeed.com/articles/base-image-python-docker-images/ suggests using the Slim images instead. I think that'd certainly be worth a try 🙂 Would you like to create an issue to explore this some more, or shall I?
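One way to check which requirements take up the most space is to sum file sizes per top-level entry in `site-packages`. The script below is only a rough sketch: the helper names `dir_size` and `package_sizes` are made up for illustration, and it assumes it is run with the image's own interpreter (for example via `docker run` with a `python` entrypoint), so that `sysconfig` points at the right directory.

```python
import os
import sysconfig
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def package_sizes(site_packages: Path) -> list:
    """Return (size_in_bytes, name) for each top-level entry, largest first."""
    sizes = []
    for entry in site_packages.iterdir():
        if entry.is_dir():
            sizes.append((dir_size(entry), entry.name))
        elif entry.is_file():
            sizes.append((entry.stat().st_size, entry.name))
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    # "purelib" is where pip installs pure-Python packages for this interpreter.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    if site_packages.is_dir():
        for size, name in package_sizes(site_packages)[:15]:
            print(f"{size / 1e6:8.1f} MB  {name}")
```

The numbers are on-disk sizes of the installed files, so they will not match the compressed layer sizes Docker Hub reports, but the ranking should still show where the bulk is.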
a
Thanks for the detailed reply @douwe_maan! I'm happy to explore it some more. Once I've done that and have some more info, I can create an issue
d
@al_whatmough That'd be great, thanks! In the meantime, @paul_blankley has also been working on improvements to the `Dockerfile` in the `docker` file bundle: https://gitlab.com/meltano/files-docker
p
FWIW, we use the `python:3.6` image to build all of our wheels (because it has gcc and other convenient things), then install those wheels in `python:3.6-slim`, which greatly reduced our container size
a
Sounds like a good approach @paul_blankley. So do you build Meltano as a Python wheel? Or are you building its dependencies as separate wheels and then installing these as part of your Docker build?
p
@al_whatmough this was the strategy on some different builds; I haven’t used it with Meltano yet:
FROM python:3.6 as base

COPY ./requirements.txt /
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

RUN pip install $(grep meltano requirements.txt)
COPY ./meltano.yml /
RUN meltano install

# final image, built on the slim variant
FROM python:3.6-slim

COPY --from=base /wheels /wheels
COPY --from=base requirements.txt .
COPY --from=base .meltano/ .

RUN pip install --no-cache /wheels/*

# set working directory
WORKDIR /project

# add app
COPY . /project

# Pin `discovery.yml` manifest by copying cached version to project root
RUN cp -n .meltano/cache/discovery.yml . 2>/dev/null || :

# Don't allow changes to containerized project files
ENV MELTANO_PROJECT_READONLY=1

# Expose default port used by `meltano ui`
EXPOSE 5000

ENTRYPOINT ["meltano"]
this cuts the image size roughly in half (1.88GB -> 965MB, for me)
actually, upon further testing, this messes with the venvs the plugins are installed into, so I went back to the Meltano base image for now
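The thread does not say why the venvs broke. One plausible cause, offered here only as an assumption, is that virtual environments record the absolute path they were created at, so copying `.meltano/` to a different location between build stages can leave them pointing at a directory that no longer exists. The sketch below uses the standard-library `venv` module rather than Meltano itself, and `files_embedding_path` and `demo` are hypothetical helper names.

```python
import tempfile
import venv
from pathlib import Path


def files_embedding_path(env_dir: Path) -> list:
    """Names of activate scripts in the venv that contain env_dir's absolute path."""
    # Scripts live in "bin" on POSIX and "Scripts" on Windows.
    script_dir = env_dir / ("Scripts" if (env_dir / "Scripts").is_dir() else "bin")
    needle = str(env_dir)
    hits = []
    for script in sorted(script_dir.glob("activate*")):
        if script.is_file() and needle in script.read_text(errors="ignore"):
            hits.append(script.name)
    return hits


def demo() -> list:
    """Create a throwaway venv and report which activate scripts hardcode its path."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp).resolve() / "venv"
        # with_pip=False keeps this fast; the activate scripts are still generated.
        venv.create(env_dir, with_pip=False)
        return files_embedding_path(env_dir)


if __name__ == "__main__":
    print(demo())
```

Console scripts installed by pip carry the same kind of absolute path in their shebang line. If that is what happened here, running `meltano install` in the build stage at the same path the project will have in the final image might avoid the problem, though that is untested.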