Hi, is it possible to connect meltano to my airflo...
# getting-started
m
Hi, is it possible to connect Meltano to my Airflow instance? The problem is that my instance is managed by AWS: I am using MWAA and I want to run my taps and targets in MWAA, but I don't know how to configure it.
m
@steve_clarke 👀
s
Hi @michael_lan, @mark_johnston, we run this configuration successfully. In short, we created a Docker image based on the Meltano Docker image and deployed into it our meltano.yml file for the systems we want to connect to, plus any other connectivity requirements, e.g. the Oracle Thick Client. Don’t store any credentials or setting values in the meltano.yml in this container, just the config. This Docker image is pushed to ECR and run on ECS. To run Meltano we simply call the Airflow ECSOperator to invoke the image. We chose Fargate and an ECS task, so the container only runs for the lifetime of the ingestion. You can pass in the appropriate command when calling the ECSOperator to invoke the meltano CLI with an appropriate tap / target. To supply the settings for your taps and targets, either set them as environment variables when invoking the tap / target or use a utility like chamber to fetch them at runtime from the AWS SSM Parameter Store. You will also need to set up an appropriate Meltano store to hold your state, for example RDS Postgres or Aurora in AWS. I hope this gives you an overview of a possible architecture. There are most likely other approaches, like invoking the meltano CLI installed on an EC2 instance.
I should add that I wouldn’t recommend trying to install Meltano into MWAA and running it there. Leave MWAA for orchestration; the product is expensive enough without needing a very large memory and CPU allocation to run ingestions.
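For anyone wanting to try the ECS approach described above, here is a rough sketch of the command/environment overrides you might hand to the operator. It is not the poster's actual setup. Assumptions: the container is called `meltano` in the task definition, the image keeps `meltano` as its ENTRYPOINT (so the command starts at `run`), and the cluster and task definition names in the comments are made up. In recent versions of the Amazon provider the operator is named `EcsRunTaskOperator`; older versions called it `ECSOperator`.

```python
from typing import Dict, List, Optional


def meltano_overrides(
    tap: str,
    target: str,
    container_name: str = "meltano",  # assumption: container name in your task definition
    env: Optional[Dict[str, str]] = None,
) -> dict:
    """Build an ECS `overrides` payload that runs `meltano run <tap> <target>`."""
    # Assumption: the image ENTRYPOINT is `meltano`, so only the subcommand is passed.
    # If your image has no such entrypoint, prepend "meltano" to this list.
    command: List[str] = ["run", tap, target]
    # Only non-secret settings belong here; secrets are better fetched at runtime
    # (for example with chamber from the SSM Parameter Store).
    environment = [{"name": k, "value": v} for k, v in sorted((env or {}).items())]
    return {
        "containerOverrides": [
            {"name": container_name, "command": command, "environment": environment}
        ]
    }


overrides = meltano_overrides(
    "tap-oracle", "target-postgres", env={"MELTANO_ENVIRONMENT": "prod"}
)

# In a DAG file on MWAA (needs apache-airflow-providers-amazon), roughly:
#
# from airflow.providers.amazon.aws.operators.ecs import EcsRunTaskOperator
#
# EcsRunTaskOperator(
#     task_id="meltano_ingest",
#     cluster="my-ecs-cluster",            # hypothetical name
#     task_definition="meltano-ingest",    # hypothetical name
#     launch_type="FARGATE",
#     overrides=overrides,
#     network_configuration={...},         # your subnets / security groups
# )
```

Building the overrides in a plain function keeps the DAG file small and lets you reuse one task definition for every tap / target pair.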
a
Agree with the above. We made the mistake of running some ingestion/transformation in MWAA and it was very expensive, obviously because we had to pump up the CPU and memory. Bad mistake, never again. We ended up using the exact same approach as above! Just be aware that the Airflow logs are not super great by default with the ECSOperator, but it is very nice; you can set up CloudWatch and look at the logs there :)
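On the logging point, a small sketch of how the CloudWatch settings might be wired in. As far as I know the Amazon provider's ECS operator accepts `awslogs_group`, `awslogs_region` and `awslogs_stream_prefix`, and expects the stream prefix to include the container name, but please check this against the provider version you have installed. The log group, prefix, container and region values below are made up.

```python
def cloudwatch_log_kwargs(
    log_group: str, stream_prefix: str, container_name: str, region: str
) -> dict:
    """Keyword arguments that let the ECS operator pull task logs from CloudWatch."""
    return {
        "awslogs_group": log_group,
        "awslogs_region": region,
        # Assumption: the task definition's awslogs driver writes streams named
        # <prefix>/<container>/<task-id>, so the operator gets <prefix>/<container>.
        "awslogs_stream_prefix": f"{stream_prefix}/{container_name}",
    }


log_kwargs = cloudwatch_log_kwargs(
    log_group="/ecs/meltano",   # hypothetical log group
    stream_prefix="ecs",        # hypothetical prefix from the task definition
    container_name="meltano",   # hypothetical container name
    region="eu-west-1",         # hypothetical region
)

# Then in the DAG: EcsRunTaskOperator(..., **log_kwargs)
```

With these set, the container output should show up in the Airflow task log as well as in CloudWatch.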