# infra-deployment
m
Is anyone running into issues where the meltano python wrapper for airflow is injecting logs into the airflow.cfg file and corrupting it, causing the airflow containers to repeatedly crash? This is occurring on a local docker deployment. I ended up having to deploy locally with the vanilla airflow commands instead of the meltano python wrapper. Are there any advantages to running airflow via the meltano python wrapper? If not, I would probably deploy to prod with just vanilla airflow. We're deploying on ECS, for reference.
v
So you're already using airflow and want to run it with meltano? If so, then yeah, the meltano python wrapper will get you where you need to go
m
Sorry, just to clarify: I was using the meltano python wrapper to run the airflow containers, but the wrapper itself was causing issues by injecting logs into the airflow.cfg file, corrupting the file and causing the airflow containers to continuously crash. Given this problem, I wanted to ask whether people typically use the meltano python wrapper to run airflow in prod, or whether they just run vanilla airflow without the wrapper.
v
> I was using the meltano python wrapper to run the airflow containers with meltano, but the wrapper itself was causing issues by injecting logs into the airflow.cfg file, corrupting the file and causing the airflow containers to continuously crash.
I don't think that's what happens to most folks, so sharing the issue you're running into can be helpful
> Given this problem, I wanted to ask if people typically use the meltano python wrapper in prod to run airflow, or do they just run airflow without the meltano python wrapper.
Run it how you want. Meltano is pretty easy to orchestrate yourself, so that's what I do personally. I tried to count the other day how many orchestration systems I've set up Meltano on: Windows Task Scheduler, crontab, SQL Agent Jobs, Docker containers run via you-name-the-place, GitHub Actions, GitLab CI, Prefect, Dagster, Amazon ECS, Google Cloud Run, Airflow, Snowflake containers. Given all that experience, I actually recommend folks just spin up a VM and schedule the stuff themselves with cron at first (sketch below), unless their company already uses an orchestrator. It's so easy to run that it's really not worth an orchestrator, so if your team uses GitLab or GitHub, I'd just use GitLab CI / GitHub Actions. Anyway, "you're allowed to complicate it however you want" is the mode I operate in now!
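To make that concrete, a bare-bones VM + cron setup is just something like this (project path and tap/target names are made up for illustration, not from this thread):
```sh
# crontab -e  (sketch; /opt/my-meltano-project and the tap/target are hypothetical)
# run the EL pipeline hourly and keep a log around for debugging
0 * * * * cd /opt/my-meltano-project && .venv/bin/meltano run tap-postgres target-snowflake >> /var/log/meltano.log 2>&1
```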
👍 1
m
@visch Thanks for your response. The error I'm getting is here:
```
configparser.DuplicateSectionError: While reading from '/project/orchestrate/airflow/airflow.cfg' [line  5]: section '2025-09-23T23:21:54.954+0000' already exists cmd=airflow db init stdio_stream=stderr
```
So you can see that when airflow parses the airflow.cfg file, it treats the log datetime stamp `2025-09-23T23:21:54.954+0000` as a section header in the cfg file. It's a duplicate section error because multiple log lines with the same datetime stamp are being injected into the cfg file.
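For what it's worth, here's a minimal repro of that parse behavior (the bracketed-timestamp format is my guess from the error message; I haven't confirmed the exact injected lines):
```python
import configparser

# Sketch of a corrupted airflow.cfg: configparser treats any "[...]" line as
# a section header, so repeated injected timestamps become duplicate sections.
corrupted_cfg = """\
[2025-09-23T23:21:54.954+0000]
[2025-09-23T23:21:54.954+0000]
[core]
executor = LocalExecutor
"""

parser = configparser.ConfigParser()
try:
    parser.read_string(corrupted_cfg, source="airflow.cfg")
except configparser.DuplicateSectionError as err:
    # While reading from 'airflow.cfg' [line  2]: section
    # '2025-09-23T23:21:54.954+0000' already exists
    print(err)
```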
It's not clear to me why this is happening, as I don't recall ever setting a parameter that directs meltano to write its logs into the config file. I've attached my Dockerfile and docker-compose file, and the relevant sections of my meltano.yml are below. The docker-compose file is currently set to run vanilla airflow without the meltano wrapper to bypass the logs issue, but the original code to run the python wrapper is still there, just commented out. Noted on your other recs!
```yaml
files:
    - name: files-airflow
      variant: meltano
      pip_url: git+https://github.com/meltano/files-airflow.git
utilities:
    - name: airflow
      variant: apache
      pip_url: git+https://github.com/meltano/airflow-ext.git@main apache-airflow==2.10.5
        psycopg2-binary --constraint
        https://raw.githubusercontent.com/apache/airflow/constraints-2.10.5/constraints-no-providers-${MELTANO__PYTHON_VERSION}.txt
      config:
        core:
          dags_are_paused_at_creation: true
          executor: LocalExecutor
        webserver:
          web_server_port: 8080
        database:
          sql_alchemy_conn: postgresql://postgres:postgres@airflow-metadata-db/airflow
```
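And just for context, the two modes I'm toggling between in the compose file boil down to something like this (the exact service wiring is in the attached compose file):
```sh
# via the meltano wrapper (this path was corrupting airflow.cfg for me)
meltano invoke airflow db init
meltano invoke airflow scheduler

# vanilla airflow (what the compose file currently runs)
airflow db init
airflow scheduler
```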
Here's the full traceback:
v
Are you using https://hub.meltano.com/utilities/airflow/ ? Can you share your meltano.yml?
There are literally 158 projects running, with a quarter million executions in the last 3 months, on this one!
I see it now, you did share a snippet of it. Hmm, I can't dive in right now, but there's some reason it's working for so many.
m
@visch Yep, I am using the one from the official meltano git repo. I don't recall ever seeing a setting for configuring where meltano directs airflow to dump its logs, which is why it's so confusing to me that the logs are ending up in the airflow cfg file.