# infra-deployment
m
Hey everyone, <!subteam^S02BCD9FFEF> nice job on the Dagster utility. Locally it works nicely, but I tried setting it up in prod with something close to your Cubed deployment, meaning 2 DB pods (Dagster and Meltano), 1 pod for Dagster and 1 for my custom Meltano code. Unfortunately, I don't get how to connect the dagster extension from Meltano to Dagster, as the two are running in different K8s pods. There is also no way to share the volumes in K8s. Does anyone have an idea how to achieve this? Otherwise I will have to go back to Airflow 😕 @ken_payne?
j
Hi Michel, I am the creator of the dagster utility. I'm not entirely sure how you have set up your k8s deployment, but we use it in production as follows: we used the Dagster Helm chart to get a production Dagster environment running. Afterwards we created a single user deployment container with all the Meltano code. Here is an example of the Dockerfile; you might be able to use it for inspiration:
FROM python:3.10-slim

ENV MELTANO_PROJECT_ROOT=/opt/dagster/app
ENV PATH="/root/.local/bin:${PATH}"

WORKDIR /opt/dagster/app

RUN apt-get update && \
    apt-get install -y build-essential git nano && \
    rm -rf /var/cache/apt/archives /var/lib/apt/lists/*

RUN pip install pipx==1.1.0

RUN pipx install meltano[azure]==2.12.0

RUN pip install --no-cache-dir --upgrade \
    dagster-postgres==0.16.* \
    dagster-docker==0.16.* \
    dagster-k8s==0.16.* \
    dagster-ext

COPY ./meltano.yml ./meltano.yml

RUN meltano install loaders

RUN meltano install utilities

COPY ./taps ./taps

RUN meltano install extractors

COPY . .

CMD ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4000", "-f", "orchestrate/dagster/repository.py"]
m
Hi @jules_huisman, that is pretty much what I am trying to achieve!! Thanks for the Dockerfile! I will test it tomorrow and may come back to you with some questions. If I got it correctly, the part that makes the connection between Meltano and Dagster is the last command?! Thanks 🙏
j
It works a bit differently: Meltano is not really a running service, it is just a command-line application. It gets installed using
pipx install meltano
. The
dagster-ext
does the translation from your Meltano jobs to the Dagster jobs. This Dockerfile represents one of the
User Code Deployments
in the following image: https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm#deployment-architecture
m
Hi @jules_huisman, thanks for the additional information. If I got it correctly this time, I can use Meltano to generate the code and set up the gRPC server so that Dagster can fetch the necessary code, right? In the Dagster deployment I already have a custom dagster-user-deployment. So I just have to create another dagster-user-deployment in the same way for the Meltano code, set up the service and add it to the Dagster Helm chart under "dagster-user-deployments", right?
j
Yes, correct! The Dockerfile I sent before is our Dagster user deployment. And you have to add it to your Dagster helm chart.
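Since Dagster talks to the user deployment over gRPC on the port from the CMD line (4000 in the Dockerfile above), a quick way to rule out networking problems between the pods is a plain TCP check. This is only a rough sketch: the service name in the usage comment is hypothetical, and an open port does not prove the gRPC server itself is healthy.

```python
import socket


def grpc_port_reachable(host: str, port: int = 4000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks network reachability only, not the gRPC server's health.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


# Usage from another pod in the cluster (hypothetical Service name):
#   grpc_port_reachable("meltano-user-code", 4000)
```

If this returns False from the Dagster pod, the Service or the port in the Helm values is the first thing I would look at.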
m
One Dagster user deployment for your Meltano code, right? Or do you create 2 Docker images for your Meltano environment? Sorry for the many questions, just trying to finally have a stable solution ^^
j
No problem. Yes, correct: one user deployment for your Meltano container.
m
Thanks @jules_huisman!!! Got it!! Just some small things that may need to be adjusted: • The command you used in the Dockerfile, shouldn't it be
meltano invoke dagster api grpc -h 0.0.0.0 -p 4000 -f orchestrate/dagster/repository.py
(with meltano invoke in front)? • The readme in the repo may need to be adjusted. In the installation you use "dagster-ext", but I think it is just
meltano add utility dagster
j
Great! You could invoke dagster using Meltano, but this is another level of inception that is not necessary 😛. You are only using the
dagster_meltano
package, not the
dagster-ext
. This will be more clear in the future when I separate these two concepts. Completely right on the second point, I adjusted it.
m
I will try to put together a graphic to show my current setup to the team. I will share it with you; there may be some "overkill" in it, especially for the first point. Maybe you will be able to use it for the docs as well to make it clear for other users! But not before 2023 😛 Have some great holidays! 🎄 🎆
j
Yeah, cool. I think it would be nice to have a clear guide on how to take your local Dagster version to a production one. Good luck with the deployment and happy holidays!
m
Hey @jules_huisman, I'm back with another question ^^ When defining the jobs manually in the repository (first screenshot), Dagster picks up the jobs. But when using the method
load_jobs_from_meltano_project
(second screenshot) I don't get anything even though the schedules are defined (third screenshot). Am I missing something in the schedules?
j
Ah, you need to use the job notation. It doesn’t work with the elt notation.
I will make this more clear in the documentation
m
What do you mean? I need to do it like in the first screenshot?
j
You need to create jobs in your Meltano project. And reference those in your schedules.
Sorry, I wasn't at my computer, but here is an example project: https://github.com/quantile-development/dagster-meltano/blob/master/meltano_project/meltano.yml
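To make the difference between the two notations concrete, here is a small sketch that sorts the schedules of an already parsed meltano.yml into job notation (a schedule with a `job` key pointing at an entry under `jobs`) and elt notation (a schedule with `extractor` and `loader` keys). The plugin, job and schedule names are made up for illustration, and the helper is not part of dagster-meltano. Going by the explanation above, only the job-notation schedules would be expected to show up in Dagster.

```python
from typing import Any

# Hypothetical project content, as it would look after parsing meltano.yml.
EXAMPLE_PROJECT: dict[str, Any] = {
    "jobs": [
        {"name": "csv-to-jsonl", "tasks": ["tap-csv target-jsonl"]},
    ],
    "schedules": [
        # Job notation: the schedule references a job by name.
        {"name": "daily-job", "interval": "@daily", "job": "csv-to-jsonl"},
        # Elt notation: the schedule names the extractor and loader directly.
        {
            "name": "daily-elt",
            "interval": "@daily",
            "extractor": "tap-csv",
            "loader": "target-jsonl",
            "transform": "skip",
        },
    ],
}


def split_schedules(project: dict[str, Any]) -> dict[str, list[str]]:
    """Group schedule names by the notation they use."""
    result: dict[str, list[str]] = {"job": [], "elt": []}
    for schedule in project.get("schedules", []):
        notation = "job" if "job" in schedule else "elt"
        result[notation].append(schedule["name"])
    return result


print(split_schedules(EXAMPLE_PROJECT))
```

Any schedule that lands in the "elt" list would need a matching entry under `jobs` and a `job` reference instead.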
m
Moving on, step by step. Thanks @jules_huisman, I can load the repo with jobs now. But when running the jobs (I tried it with your example job), I get the following error in Dagster, which I do not yet understand. Did you encounter this (screenshot 1)? If I understand it correctly, Dagster does not know which image it should use to run the container with the Meltano user deployment. Therefore I tried adding a second deployment section in the Dagster values YAML (screenshot 2). But I still get this error... Somehow I am missing something and it drives me crazy 🤯