We are using AWS managed airflow(MWAA). In require...
# troubleshooting
k
We are using AWS Managed Airflow (MWAA). When I include Meltano and the Singer SDK in requirements.txt, it causes multiple conflicts with Airflow's packages. We mostly use PythonOperators and a custom Meltano operator which just triggers `meltano elt`. What's the suggested way to use Python environments on Airflow? I looked at the PythonVirtualenvOperator, but that creates and destroys the env every time, which does not seem optimal. Is there any other way to have a separate env with Meltano that I can use for specific DAGs? Sorry if this question does not belong in this channel.
w
Hi @kk. If you are using Meltano normally, you should not need a dependency on anything other than Meltano in the Python environment that calls it. Meltano manages a virtual environment for each of its plugins, which is where the SDK may be installed. Does that help, or is there some reason in particular you're installing additional packages into Meltano's virtual environment?
k
We have several other DAGs that don't use Meltano. These have several package dependencies that we include in requirements.txt. So I want to separate Meltano from the other packages, install them into separate envs if possible, and use those envs per DAG. Anything of that sort would be good.
p
@kk check out https://meltano.com/blog/deploying-meltano-for-meltano/ that @ken_payne wrote about how we deploy Meltano in our own infrastructure. The important piece is the decision to use containers and an Airflow operator that runs containers (Kubernetes operator, ECS operator, Docker operator, etc.). We chose to avoid installing anything task-dependent onto the worker nodes. This lets us decouple dependencies. I haven't personally used MWAA, but from looking at the docs it wants you to pass in all dependencies in a single requirements.txt, so I'd try to put as little as possible in there to avoid dependency conflicts. I don't see why the virtualenv route wouldn't work though: have the task create the Meltano virtualenv prior to running. Maybe there's a way to cache that and only create one if it doesn't already exist on long-running workers 🤷
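The "create the virtualenv only if it doesn't already exist" idea could look roughly like the sketch below. It is a minimal illustration and has not been tested on MWAA. The path `/tmp/meltano-venv`, the `.ready` marker file, and the function names are all hypothetical. It assumes a POSIX worker (so the env has a `bin/` directory), that the worker can reach a package index, and that the Meltano project directory is available on the worker.

```python
import subprocess
import venv
from pathlib import Path
from typing import Sequence

# Hypothetical cache location; it only lives as long as the worker's filesystem does.
VENV_DIR = Path("/tmp/meltano-venv")


def ensure_venv(venv_dir: Path, packages: Sequence[str]) -> Path:
    """Create the virtualenv and install packages only if it is not ready yet.

    Returns the path of the Python interpreter inside the virtualenv.
    """
    python = venv_dir / "bin" / "python"  # POSIX layout assumed
    marker = venv_dir / ".ready"  # written only after a successful build
    if not marker.exists():
        # clear=True wipes a half-built env left behind by an earlier failure;
        # pip is only bootstrapped when there is something to install.
        venv.EnvBuilder(with_pip=bool(packages), clear=True).create(venv_dir)
        if packages:
            subprocess.run(
                [str(python), "-m", "pip", "install", *packages],
                check=True,
            )
        marker.touch()
    return python


def run_meltano_elt(extractor: str, loader: str, project_dir: str) -> None:
    """Callable for a PythonOperator: run `meltano elt` from the cached env."""
    ensure_venv(VENV_DIR, ["meltano==2.15.1"])
    # pip installs the `meltano` console script next to the env's interpreter.
    meltano = VENV_DIR / "bin" / "meltano"
    subprocess.run(
        [str(meltano), "elt", extractor, loader],
        cwd=project_dir,
        check=True,
    )
```

Because the env does not see the worker's site-packages, Meltano's pins are resolved separately from whatever requirements.txt installed. Two tasks starting on the same worker at once could still race to build the env, so a real version would probably want a file lock around the build step.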
k
Even if you just include `meltano==2.15.1` in the requirements.txt of https://github.com/aws/aws-mwaa-local-runner and test it (`./mwaa-local-env test-requirements`), it fails.
@pat_nadolny, I have already read the post. We are pretty much all in on AWS MWAA for now. But I think that if we want to use Meltano right now, the only way is to use the virtualenv operator. Thank you so much.
p
What failures are you getting specifically? I wonder what base dependencies get installed prior to your custom requirements.txt that are causing the conflict.
k
This is the error message
The conflict is caused by:
    meltano 2.15.1 depends on werkzeug<=2.1.3 and >=2.1
    The user requested (constraint) werkzeug==2.2.2
I did try to edit these to loosen the versions, but it's been a lot of work and I keep getting one conflict or another.