# docker
p
Hello channel! Running airflow (webserver or scheduler) using the docker image (meltano/meltano:latest) gives this error:
[Errno 2] No such file or directory: '/project/.meltano/run/airflow/airflow.cfg'
I tried the following options, but all of them give the same error: 1. Using docker-compose.yml generated by
meltano add files docker-compose
: The meltano-ui service starts okay but both airflow-scheduler and airflow-webserver fail with the above error (missing
airflow.cfg
) 2.
docker pull meltano/meltano:latest
then
docker run -v $(pwd):/project -w /project meltano/meltano invoke airflow webserver
3. Halting the container by using a custom entrypoint (
docker run -it --entrypoint /bin/bash -v $(pwd):/project -w /project meltano/meltano:latest
) and then running
meltano invoke airflow webserver
in the Docker terminal. I tried this with multiple Docker images (e.g. latest-python3.8, v1.67.0, v1.65.0, v1.63.0 etc.), but all of them give me the same error, so I suspect I am doing something wrong, but I don't know what. I would expect the Docker image to read my meltano.yml file and install everything by running
meltano install
, and also to create the necessary airflow.cfg file. (I know that
meltano invoke airflow <param>
should automatically create the airflow.cfg file, but I'm not sure why it isn't doing so.) The Airflow webserver and scheduler start okay if I run them on my machine directly (i.e. not in Docker).
d
@pankaj_saini That's certainly not right 😕 Can you run
meltano invoke airflow webserver
again, but with debug logging and share the result?
meltano --log-level=debug invoke airflow webserver
That should tell us some more about how
airflow.cfg
is (supposed to be) constructed and how it is passed to Airflow
p
Hi @douwe_maan, thanks for responding so quickly. I ran
meltano --log-level=debug invoke airflow webserver
directly on docker terminal (option 3 in my main post above). Here is the output:
```
[2021-02-09 21:27:30,773] [7|MainThread|root] [DEBUG] Creating engine <meltano.core.project.Project object at 0x7f58227c7e10>@sqlite:////project/.meltano/meltano.db
[2021-02-09 21:27:31,049] [7|MainThread|urllib3.connectionpool] [DEBUG] Starting new HTTPS connection (1): www.meltano.com:443
[2021-02-09 21:27:32,016] [7|MainThread|urllib3.connectionpool] [DEBUG] https://www.meltano.com:443 "GET /discovery.yml?project_id=8ff07449-d514-4f96-9110-483eba690a44 HTTP/1.1" 200 83662
[2021-02-09 21:27:33,165] [7|MainThread|root] [DEBUG] Invoking: ['/project/.meltano/orchestrators/airflow/venv/bin/airflow', '--help']
[2021-02-09 21:27:33,167] [7|MainThread|root] [DEBUG] Env: {'TAP_HUBSPOT_HAPIKEY': '***************', 'TAP_HUBSPOT_REDIRECT_URI': '', 'TAP_HUBSPOT_CLIENT_ID': '', 'TAP_HUBSPOT_CLIENT_SECRET': '', 'TAP_HUBSPOT_REFRESH_TOKEN': '', 'TARGET_REDSHIFT_HOST': '*********', 'TARGET_REDSHIFT_PORT': '****', 'TARGET_REDSHIFT_USER': '*****', 'TARGET_REDSHIFT_PASSWORD': '*******', 'TARGET_REDSHIFT_DBNAME': '******', 'TARGET_REDSHIFT_DEFAULT_TARGET_SCHEMA': '****', 'TARGET_REDSHIFT_AWS_ACCESS_KEY_ID': '****', 'TARGET_REDSHIFT_AWS_SECRET_ACCESS_KEY': '*******', 'TARGET_REDSHIFT_S3_BUCKET': '*****', 'HOSTNAME': '****', 'PYTHON_VERSION': '3.6.12', 'PWD': '/project', 'HOME': '/root', 'LANG': 'C.UTF-8', 'GPG_KEY': '*******', 'TERM': 'xterm', 'SHLVL': '1', 'PYTHON_PIP_VERSION': '21.0', 'PYTHON_GET_PIP_SHA256': 'ffb67da2e976f48dd29714fc64812d1ac419eb7d48079737166dd95640d1debd', 'PYTHON_GET_PIP_URL': 'https://github.com/pypa/get-pip/raw/8cc88aca7d9775fce279e8b84ef163cf1d3e8a2e/get-pip.py', 'PATH': '/project/.meltano/orchestrators/airflow/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'NODE_VERSION': '10', '_': '/usr/local/bin/meltano', 'MELTANO_JOB_TRIGGER': 'cli', 'MELTANO_PROJECT_ROOT': '/project', 'MELTANO_SEND_ANONYMOUS_USAGE_STATS': 'true', 'MELTANO_PROJECT_ID': '8ff07449-d514-4f96-9110-483eba690a44', 'MELTANO_DATABASE_URI':
'sqlite:////project/.meltano/meltano.db', 'MELTANO_DATABASE_MAX_RETRIES': '3', 'MELTANO_DATABASE_RETRY_TIMEOUT': '5', 'MELTANO_PROJECT_READONLY': 'false', 'MELTANO_DISCOVERY_URL': 'https://www.meltano.com/discovery.yml', 'MELTANO_ELT_BUFFER_SIZE': '10485760', 'MELTANO_CLI_LOG_LEVEL': 'debug', 'MELTANO_UI_BIND_HOST': '0.0.0.0', 'MELTANO_API_HOSTNAME': '0.0.0.0', 'MELTANO_UI_BIND_PORT': '5000', 'MELTANO_API_PORT': '5000', 'PORT': '5000', 'MELTANO_UI_SESSION_COOKIE_SECURE': 'false', 'MELTANO_UI_SECRET_KEY': 'thisisnotapropersecretkey', 'MELTANO_UI_PASSWORD_SALT': 'b4c124932584ad6e69f2774a0ae5c138', 'MELTANO_UI_WORKERS': '4', 'WORKERS': '4', 'WEB_CONCURRENCY': '4', 'MELTANO_UI_FORWARDED_ALLOW_IPS': '127.0.0.1', 'FORWARDED_ALLOW_IPS': '127.0.0.1', 'MELTANO_UI_READONLY': 'false', 'MELTANO_READONLY': 'false', 'MELTANO_UI_AUTHENTICATION': 'false', 'MELTANO_AUTHENTICATION': 'false', 'MELTANO_UI_ANONYMOUS_READONLY': 'false', 'MELTANO_UI_NOTIFICATION': 'false', 'MELTANO_NOTIFICATION': 'false', 'MELTANO_UI_ANALYSIS': 'true', 'MAIL_SERVER': 'localhost', 'MELTANO_MAIL_SERVER': 'localhost', 'MAIL_PORT': '1025', 'MELTANO_MAIL_PORT': '1025', 'MAIL_DEFAULT_SENDER': '"Meltano" <bot@meltano.com>', 'MELTANO_MAIL_DEFAULT_SENDER': '"Meltano" <bot@meltano.com>', 'MAIL_USE_TLS': 'false', 'MELTANO_MAIL_USE_TLS': 'false', 'MAIL_DEBUG': 'false', 'MELTANO_MAIL_DEBUG': 'false', 'MELTANO_OAUTH_SERVICE_PROVIDERS': 'all', 'MELTANO_TRACKING_IDS_CLI': 'UA-132758957-3', 'MELTANO_CLI_TRACKING_ID': 'UA-132758957-3', 'MELTANO_TRACKING_IDS_UI': 'UA-132758957-2', 'MELTANO_UI_TRACKING_ID': 'UA-132758957-2', 'MELTANO_TRACKING_IDS_UI_EMBED': 'UA-132758957-6', 'MELTANO_EMBED_TRACKING_ID': 'UA-132758957-6', 'MELTANO_ORCHESTRATOR_NAME': 'airflow', 'MELTANO_ORCHESTRATOR_NAMESPACE': 'airflow', 'MELTANO_ORCHESTRATOR_VARIANT': 'original', 'AIRFLOW__CORE__DAGS_FOLDER': '/project/orchestrate/dags', 'AIRFLOW_CORE_DAGS_FOLDER': '/project/orchestrate/dags', 'MELTANO_ORCHESTRAT_CORE…
d
It looks like
No such file or directory: '/project/.meltano/run/airflow/airflow.cfg'
is just a red herring; the real underlying error is buried in those logs:
meltano.core.plugin_invoker.ExecutableNotFoundError: Executable 'airflow' could not be found. Orchestrator 'airflow' may not have been installed yet using `meltano install orchestrator airflow`, or the executable name may be incorrect.
And before that:
FileNotFoundError: [Errno 2] No such file or directory: '/project/.meltano/orchestrators/airflow/venv/bin/airflow': '/project/.meltano/orchestrators/airflow/venv/bin/airflow'
Now you'll probably find that that file actually exists, but the reason it's not working is that if you
cat
that file and look at the first line, which points at the executable to run it with, that path won't exist from the perspective of the Docker container.
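That check can be sketched in a few lines of Python. This is only an illustration, not part of Meltano: the helper names `shebang_interpreter` and `interpreter_is_reachable` are made up here, and it assumes the script starts with a plain `#!/path/to/python` line. Run it once on the host and once inside the container to compare.

```python
from pathlib import Path
from typing import Optional


def shebang_interpreter(script: Path) -> Optional[str]:
    """Return the interpreter path from a script's '#!' line, or None if absent."""
    with open(script, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def interpreter_is_reachable(script: Path) -> bool:
    """True if the shebang's interpreter exists in the current environment."""
    interpreter = shebang_interpreter(script)
    return interpreter is not None and Path(interpreter).exists()


if __name__ == "__main__":
    # Path taken from the error message above, relative to the project root.
    script = Path(".meltano/orchestrators/airflow/venv/bin/airflow")
    if script.exists():
        print(shebang_interpreter(script), interpreter_is_reachable(script))
```

If the printed interpreter is a path on the host machine and the second value is `False` inside the container, the virtualenv was created outside Docker.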
To resolve this, the plugin(s) need to be installed from the same environment as the Docker image, so that all the absolute paths work out. To do that, you can run
meltano install
inside your container
Try
docker-compose exec meltano-ui meltano install
If you do that, you'll find that
meltano invoke
will no longer work outside Docker. You really have to choose either Docker or non-Docker and stick with it 🙂
That's not documented clearly; I'll make a note of that!
Let me know if that works or if my hunch is wrong 😄
p
I see what you mean.
docker-compose exec meltano-ui meltano install
installed all the plugins (for the Docker container), and airflow-webserver and airflow-scheduler are now able to use this
.meltano/orchestrators/airflow/venv/bin/airflow
file to correctly launch Airflow. This worked okay for me, but shouldn't the Docker image entrypoint run
meltano install
first, which can use the project's meltano.yml file (from the mounted volume) and install all the plugins before launching meltano-ui, airflow-webserver or airflow-scheduler? If you think this is something which has to be fixed, I can temporarily pass a custom entrypoint to do a
meltano install
on the
meltano-ui
service and make both
airflow-scheduler
and
airflow-webserver
dependent upon
meltano-ui
service.
d
If you containerize your entire project (https://meltano.com/docs/containerization.html),
meltano install
will be part of the `Dockerfile`: https://gitlab.com/meltano/files-docker/-/blob/master/bundle/Dockerfile#L12. Generally, it's something you'd explicitly run to prepare the environment before running
meltano elt
or
meltano invoke
, not something to be run every time those commands are run, since re-installing all plugins every time would unnecessarily slow things down significantly.
So generally you'd only
meltano install
when you make changes to
meltano.yml
, or when you know no plugins have been installed yet. It's just that when you're using the docker-compose file and a mounted project directory, those installed plugins won't be valid inside the container, so we should document that explicitly or raise a more useful error.
When you don't mount the project directory into a foreign environment, it'll be a non-issue 🙂
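The "only install when meltano.yml changed or nothing is installed yet" rule could be automated along these lines. This is a rough sketch under stated assumptions, not something Meltano provides: the stamp file `.meltano/install.sha256` and the functions `needs_install` and `record_install` are hypothetical names invented for this example.

```python
import hashlib
from pathlib import Path


def meltano_yml_digest(project_dir: Path) -> str:
    """SHA-256 hex digest of the project's meltano.yml."""
    return hashlib.sha256((project_dir / "meltano.yml").read_bytes()).hexdigest()


def needs_install(project_dir: Path, stamp_name: str = "install.sha256") -> bool:
    """True if no stamp exists yet or meltano.yml changed since the last stamp.

    The stamp file is a hypothetical convention, not a Meltano feature.
    """
    stamp = project_dir / ".meltano" / stamp_name
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != meltano_yml_digest(project_dir)


def record_install(project_dir: Path, stamp_name: str = "install.sha256") -> None:
    """Write the current meltano.yml digest after a successful install."""
    stamp = project_dir / ".meltano" / stamp_name
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(meltano_yml_digest(project_dir))
```

A wrapper would call `needs_install(Path("/project"))`, run `meltano install` only when it returns `True`, and then call `record_install`. With a mounted project directory the stamp is shared between host and container, so a different `stamp_name` per environment would be needed to avoid trusting plugins installed on the other side.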
p
Thanks Douwe, this makes sense. Thanks for your help.