matthew_van_zanten
07/13/2023, 5:39 PM
The scheduler does not appear to be running. Last heartbeat was received 1 week ago.
in the web UI, and as you can see in the screenshot, the scheduler is running.
matthew_van_zanten
07/13/2023, 5:42 PM
matthew_van_zanten
07/13/2023, 5:45 PM
edgar_ramirez_mondragon
07/13/2023, 5:47 PM
matthew_van_zanten
07/13/2023, 5:52 PM
compress_serialized_dags = False to the [core] section of airflow.cfg. I'll add that and rebuild my container, then deploy to test. Thank you.
matthew_van_zanten
07/13/2023, 6:16 PM
DAG Serialization feature?
edgar_ramirez_mondragon
07/13/2023, 6:28 PM
matthew_van_zanten
07/13/2023, 8:14 PM
matthew_van_zanten
07/13/2023, 8:23 PM
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN env var, but that's not doing anything; the airflow.cfg seems to just point back to SQLite
user
07/13/2023, 8:39 PM
matthew_van_zanten
07/13/2023, 9:48 PM
meltano config airflow set database sql_alchemy_conn postgresql://airflow_user:airflow_password@10.30.0.5/airflow_db
which updated my meltano.yml file. I then tried to initialize the database with meltano invoke airflow db init
However, it returns the following, which does not give me an error I can use to continue debugging:
root@e081afcf159a:/project# meltano invoke airflow db init
2023-07-13T21:44:26.730369Z [info ] Environment 'dev' is active
Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to
join our friendly Slack community.
'database'
matthew_van_zanten
07/13/2023, 9:49 PM
matthew_van_zanten
07/14/2023, 4:39 PM
AIRFLOW__CORE__SQL_ALCHEMY_CONN: 'postgresql://airflow_user:airflow_password@10.30.0.5/airflow_db'
with the official Airflow container and it works 100%, but it's not working with the Meltano container.
douwe_maan
07/14/2023, 5:05 PM
meltano --log-level=debug [etc]
so we can see exactly where the error originates?
matthew_van_zanten
07/14/2023, 5:06 PM
AIRFLOW__CORE__LOGGING_LEVEL: DEBUG
however, that does not seem to work
matthew_van_zanten
07/14/2023, 5:06 PM
matthew_van_zanten
07/14/2023, 5:12 PM
douwe_maan
07/14/2023, 5:15 PM
matthew_van_zanten
07/14/2023, 5:17 PM
matthew_van_zanten
07/14/2023, 5:17 PM
sql_alchemy_conn is not being set in the yml but is rather being set by the environment variable AIRFLOW__CORE__SQL_ALCHEMY_CONN: 'postgresql://airflow_user:airflow_password@10.30.0.5/airflow_db'
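For reference, Airflow reads configuration overrides from environment variables named AIRFLOW__{SECTION}__{KEY}, with double underscores on both sides of the section. The sql_alchemy_conn option moved from [core] to [database] in Airflow 2.3, so the CORE variant taking effect here would be consistent with an older Airflow inside the Meltano container, although the thread does not state the version. A minimal Python sketch of that naming rule follows; both helper names are made up for illustration.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow checks for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def sql_alchemy_conn_section(airflow_version: str) -> str:
    """Return the config section that holds sql_alchemy_conn for a version.

    Assumes a plain "major.minor[.patch]" version string.
    """
    major, minor = (int(part) for part in airflow_version.split(".")[:2])
    # The option lives under [database] from 2.3 onwards, under [core] before.
    return "database" if (major, minor) >= (2, 3) else "core"


if __name__ == "__main__":
    # Example versions only; substitute the version actually installed.
    for version in ("2.1.2", "2.6.3"):
        section = sql_alchemy_conn_section(version)
        print(version, airflow_env_var(section, "sql_alchemy_conn"))
```

Running airflow version inside the container would tell which of the two names applies.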
douwe_maan
07/14/2023, 5:19 PM
matthew_van_zanten
07/14/2023, 5:20 PM
matthew_van_zanten
07/14/2023, 5:20 PM
config:
  database:
    #sql_alchemy_conn: postgresql://airflow_user:airflow_password@10.30.0.5/airflow_db
  core:
    compress_serialized_dags: False
douwe_maan
07/14/2023, 5:21 PM
matthew_van_zanten
07/14/2023, 5:21 PM
matthew_van_zanten
07/14/2023, 5:24 PM
matthew_van_zanten
07/14/2023, 5:24 PM
douwe_maan
07/14/2023, 5:25 PM
matthew_van_zanten
07/14/2023, 5:26 PM
The scheduler does not appear to be running. Last heartbeat was received 1 hour ago.
matthew_van_zanten
07/14/2023, 5:26 PM
douwe_maan
07/14/2023, 5:27 PM
pat_nadolny
07/14/2023, 5:29 PM
douwe_maan
07/14/2023, 5:29 PM
matthew_van_zanten
07/14/2023, 5:30 PM
matthew_van_zanten
07/14/2023, 5:30 PM
matthew_van_zanten
07/14/2023, 5:31 PM
douwe_maan
07/14/2023, 5:33 PM
douwe_maan
07/14/2023, 5:33 PM
matthew_van_zanten
07/14/2023, 5:34 PM
douwe_maan
07/14/2023, 5:34 PM
matthew_van_zanten
07/14/2023, 5:34 PM
douwe_maan
07/14/2023, 5:35 PM
matthew_van_zanten
07/14/2023, 5:38 PM
- meltano-ui
- airflow-webserver
- airflow-scheduler
So this makes reading the documentation and turning it into containers a bit tricky. I assume all these commands are for the meltano container then?
douwe_maan
07/14/2023, 5:41 PM
matthew_van_zanten
07/14/2023, 5:42 PM
matthew_van_zanten
07/14/2023, 6:14 PM
Executable 'airflow_invoker' could not be found. Utility 'airflow' may not have been installed yet using `meltano install utility airflow`, or the executable name may be incorrect.
I ran the following
docker run -v $(pwd):/projects -w /projects meltano/meltano add utility airflow
docker run -v $(pwd):/projects -w /projects meltano/meltano install utility airflow
Then built the container
docker build . --tag meltano-mine
Then I run the latest build via compose
docker-compose up -d
And then the error appears. It happened back when I was using the airflow orchestrator, but eventually, after enough retries, it works with no noticeable change in my process.
matthew_van_zanten
07/14/2023, 6:15 PM
douwe_maan
07/14/2023, 6:21 PM
matthew_van_zanten
07/14/2023, 6:23 PM
FROM meltano/meltano:latest
matthew_van_zanten
07/14/2023, 6:31 PM
./.meltano\utilities
and reran
docker run -v $(pwd):/projects -w /projects meltano/meltano add utility airflow
docker run -v $(pwd):/projects -w /projects meltano/meltano install utility airflow
Then I rebuilt the container again and it's working now. This is what I mean by having to try again until it magically works.
matthew_van_zanten
07/14/2023, 7:03 PM
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN environment variable, it still defaults to SQLite.
When we put the db config directly into meltano.yml I get the following errors. (Containers continuously restarting)
matthew_van_zanten
07/14/2023, 7:06 PM
ModuleNotFoundError: No module named 'psycopg2' cmd=airflow --help stdio_stream=stderr
Looks like the key error
matthew_van_zanten
07/14/2023, 7:09 PM
meltano install utility airflow command when it's downloading packages in its venv
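A quick way to confirm whether the driver is present in a given environment is to ask that environment's Python directly. The sketch below assumes it is run with the interpreter from the Airflow plugin's venv, not the system Python; module_available is a made-up helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # psycopg2 is the module named in the ModuleNotFoundError above.
    print("psycopg2 importable:", module_available("psycopg2"))
```

If this prints False inside the plugin's venv, the Postgres driver was not installed alongside Airflow there.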
douwe_maan
07/14/2023, 7:27 PM
edgar_ramirez_mondragon
07/14/2023, 7:31 PM
psycopg2-binary to avoid having to build it in the container
matthew_van_zanten
07/14/2023, 7:42 PM
matthew_van_zanten
07/14/2023, 8:40 PM
db init. Now I can test it out! Thank you!
douwe_maan
07/14/2023, 8:41 PM
matthew_van_zanten
07/18/2023, 7:28 PM
MELTANO_DATABASE_URI env variable, airflow database with the AIRFLOW_DATABASE_SQL_ALCHEMY_CONN env variable, and finally AIRFLOW__CORE__COMPRESS_SERIALIZED_DAGS set to false like we discussed above. All systems fire up without errors; however, they do not seem to be registering jobs or schedules when I run the job and schedule creation commands.
This is what I run on the scheduler:
root@4dfc79fff9de:/project# meltano job add tap-stackoverflow-sampledata-to-target-jsonl --tasks "tap-stackoverflow-sampledata target-jsonl"
2023-07-18T19:16:42.481438Z [info ] The default environment 'dev' will be ignored for `meltano job`. To configure a specific environment, please use the option `--environment=<environment name>`.
Added job tap-stackoverflow-sampledata-to-target-jsonl: ['tap-stackoverflow-sampledata target-jsonl']
root@4dfc79fff9de:/project# meltano schedule add tap-stackoverflow-sampledata-to-target-jsonl --extractor tap-stackoverflow-sampledata --loader target-jsonl --transform run --interval "@daily"
2023-07-18T19:16:50.613843Z [info ] The default environment 'dev' will be ignored for `meltano schedule`. To configure a specific environment, please use the option `--environment=<environment name>`.
/venv/lib/python3.9/site-packages/meltano/core/settings_service.py:445: RuntimeWarning: Unknown setting 'start_date' - the default value `None` will be used
value, metadata = self.get_with_metadata(*args, **kwargs)
Scheduled elt 'tap-stackoverflow-sampledata-to-target-jsonl' at @daily
And then when I go to Airflow I do not get any jobs or schedules appearing. What is it that I need to do? (This worked when it was using SQLite; it's just not working with Postgres as the backend database.)
douwe_maan
07/18/2023, 7:29 PM
douwe_maan
07/18/2023, 7:30 PM
matthew_van_zanten
07/18/2023, 7:30 PM
matthew_van_zanten
07/18/2023, 7:31 PM
matthew_van_zanten
07/18/2023, 7:31 PM
matthew_van_zanten
07/18/2023, 7:33 PM
dag bag
and I can confirm that all 3 running containers have the latest version of the meltano.yml; they all use the same image
matthew_van_zanten
07/18/2023, 7:42 PM
douwe_maan
07/18/2023, 8:33 PM
{manager.py:160} INFO - Launched DagFileProcessorManager with pid: 113
in the logs, are you able to find the logs for the dag_file_processor? I believe they may be under .meltano/utilities/airflow/logs
matthew_van_zanten
07/18/2023, 8:44 PM
dag_file_processor log says this
[2023-07-18 19:53:20,599] {manager.py:480} INFO - Processing files using up to 2 processes at a time
[2023-07-18 19:53:20,599] {manager.py:481} INFO - Process each file at most once every 30 seconds
[2023-07-18 19:53:20,599] {manager.py:482} INFO - Checking for new files in /project/orchestrate/airflow/dags every 300 seconds
[2023-07-18 19:53:20,599] {manager.py:690} INFO - Searching for files in /project/orchestrate/airflow/dags
[2023-07-18 19:53:20,600] {manager.py:693} INFO - There are 0 files in /project/orchestrate/airflow/dags
[2023-07-18 19:58:21,213] {manager.py:690} INFO - Searching for files in /project/orchestrate/airflow/dags
[2023-07-18 19:58:21,213] {manager.py:693} INFO - There are 0 files in /project/orchestrate/airflow/dag
And the folder /project/orchestrate/airflow/dags does not exist. I have a folder called /project/orchestrate/dags with the following files:
root@da7e06b353e8:/project/orchestrate/dags# ls
__pycache__ 'meltano (files-airflow).py' meltano.py
So it seems like the project is pointed to the wrong folder?
matthew_van_zanten
07/18/2023, 8:44 PM
douwe_maan
07/18/2023, 8:56 PM
airflow/dags, and keep only one of the 2 .py files (the most recently created one)?
matthew_van_zanten
07/18/2023, 9:02 PM
matthew_van_zanten
07/18/2023, 9:21 PM
matthew_van_zanten
07/18/2023, 10:36 PM
Executable 'airflow_invoker' could not be found. Utility 'airflow' may not have been installed yet using `meltano install utility airflow`, or the executable name may be incorrect.
What I am doing is:
1. Pull the Meltano code from the repo
2. pip install --upgrade pip
3. pip install "meltano"
4. meltano install
Installing 3 plugins...
Installing extractor 'tap-stackoverflow-sampledata'...
Installing loader 'target-jsonl'...
Installed loader 'target-jsonl'
Installing utility 'airflow'...
Installed extractor 'tap-stackoverflow-sampledata'
Installed utility 'airflow'
Installed 3/3 plugins
And then finally I run docker build and upload to ECR.
No matter how many times I run this build, the container gets that error message when it is deployed:
Executable 'airflow_invoker' could not be found. Utility 'airflow' may not have been installed yet using `meltano install utility airflow`, or the executable name may be incorrect.
matthew_van_zanten
07/18/2023, 10:37 PM
matthew_van_zanten
07/18/2023, 11:02 PM
#14 [9/9] RUN ls -la .meltano/utilities/airflow/venv/bin
#14 0.252 total 228
#14 0.252 drwxr-xr-x 3 root root 4096 Jul 18 22:57 .
#14 0.252 drwxr-xr-x 5 root root 4096 Jul 18 22:57 ..
#14 0.252 drwxr-xr-x 2 root root 4096 Jul 18 22:57 __pycache__
#14 0.252 -rw-r--r-- 1 root root 2285 Jul 18 22:56 activate
#14 0.252 -rwxr-xr-x 1 root root 3548 Jul 18 22:56 activate-global-python-argcomplete
#14 0.252 -rw-r--r-- 1 root root 1552 Jul 18 22:56 activate.csh
#14 0.252 -rw-r--r-- 1 root root 3115 Jul 18 22:56 activate.fish
#14 0.252 -rw-r--r-- 1 root root 2800 Jul 18 22:56 activate.nu
#14 0.252 -rw-r--r-- 1 root root 1650 Jul 18 22:56 activate.ps1
#14 0.252 -rw-r--r-- 1 root root 1337 Jul 18 22:56 activate_this.py
#14 0.252 -rwxr-xr-x 1 root root 291 Jul 18 22:56 airflow
#14 0.252 -rwxr-xr-x 1 root root 289 Jul 18 22:56 airflow_extension
#14 0.252 -rwxr-xr-x 1 root root 323 Jul 18 22:56 airflow_invoker
#14 0.252 -rwxr-xr-x 1 root root 289 Jul 18 22:56 alembic
#14 0.252 -rwxr-xr-x 1 root root 291 Jul 18 22:56 cmark
#14 0.252 -rwxr-xr-x 1 root root 288 Jul 18 22:56 connexion
#14 0.252 -rwxr-xr-x 1 root root 292 Jul 18 22:56 docutils
#14 0.252 -rwxr-xr-x 1 root root 290 Jul 18 22:56 email_validator
#14 0.252 -rwxr-xr-x 1 root root 297 Jul 18 22:56 fabmanager
#14 0.252 -rwxr-xr-x 1 root root 284 Jul 18 22:56 flask
#14 0.252 -rwxr-xr-x 1 root root 1727 Jul 18 22:56 get_objgraph
#14 0.252 -rwxr-xr-x 1 root root 293 Jul 18 22:56 gunicorn
#14 0.252 -rwxr-xr-x 1 root root 280 Jul 18 22:56 httpx
#14 0.252 -rwxr-xr-x 1 root root 289 Jul 18 22:56 jsonschema
#14 0.252 -rwxr-xr-x 1 root root 289 Jul 18 22:56 mako-render
#14 0.252 -rwxr-xr-x 1 root root 296 Jul 18 22:56 markdown-it
#14 0.252 -rwxr-xr-x 1 root root 290 Jul 18 22:56 markdown_py
#14 0.252 -rwxr-xr-x 1 root root 320 Jul 18 22:56 normalizer
#14 0.252 -rwxr-xr-x 1 root root 291 Jul 18 22:56 nvd3
#14 0.252 -rwxr-xr-x 1 root root 297 Jul 18 22:56 pip
#14 0.252 -rwxr-xr-x 1 root root 297 Jul 18 22:56 pip3
#14 0.252 -rwxr-xr-x 1 root root 297 Jul 18 22:56 pip3.10
#14 0.252 -rwxr-xr-x 1 root root 298 Jul 18 22:56 pybabel
#14 0.252 -rwxr-xr-x 1 root root 291 Jul 18 22:56 pygmentize
#14 0.252 lrwxrwxrwx 1 root root 16 Jul 18 22:56 python -> /usr/bin/python3
#14 0.252 -rwxr-xr-x 1 root root 2631 Jul 18 22:56 python-argcomplete-check-easy-install-script
#14 0.252 -rwxr-xr-x 1 root root 383 Jul 18 22:56 python-argcomplete-tcsh
#14 0.252 lrwxrwxrwx 1 root root 6 Jul 18 22:56 python3 -> python
#14 0.252 lrwxrwxrwx 1 root root 6 Jul 18 22:56 python3.10 -> python
#14 0.252 -rwxr-xr-x 1 root root 1993 Jul 18 22:56 register-python-argcomplete
#14 0.252 -rwxr-xr-x 1 root root 668 Jul 18 22:56 rst2html.py
#14 0.252 -rwxr-xr-x 1 root root 790 Jul 18 22:56 rst2html4.py
#14 0.252 -rwxr-xr-x 1 root root 1135 Jul 18 22:56 rst2html5.py
#14 0.252 -rwxr-xr-x 1 root root 867 Jul 18 22:56 rst2latex.py
#14 0.252 -rwxr-xr-x 1 root root 690 Jul 18 22:56 rst2man.py
#14 0.252 -rwxr-xr-x 1 root root 856 Jul 18 22:56 rst2odt.py
#14 0.252 -rwxr-xr-x 1 root root 1794 Jul 18 22:56 rst2odt_prepstyles.py
#14 0.252 -rwxr-xr-x 1 root root 675 Jul 18 22:56 rst2pseudoxml.py
#14 0.252 -rwxr-xr-x 1 root root 711 Jul 18 22:56 rst2s5.py
#14 0.252 -rwxr-xr-x 1 root root 947 Jul 18 22:56 rst2xetex.py
#14 0.252 -rwxr-xr-x 1 root root 676 Jul 18 22:56 rst2xml.py
#14 0.252 -rwxr-xr-x 1 root root 744 Jul 18 22:56 rstpep2html.py
#14 0.252 -rwxr-xr-x 1 root root 291 Jul 18 22:56 slugify
#14 0.252 -rwxr-xr-x 1 root root 292 Jul 18 22:56 sqlformat
#14 0.252 -rwxr-xr-x 1 root root 285 Jul 18 22:56 tabulate
#14 0.252 -rwxr-xr-x 1 root root 663 Jul 18 22:56 undill
#14 0.252 -rwxr-xr-x 1 root root 284 Jul 18 22:56 wheel
#14 0.252 -rwxr-xr-x 1 root root 284 Jul 18 22:56 wheel-3.10
#14 0.252 -rwxr-xr-x 1 root root 284 Jul 18 22:56 wheel3
#14 0.252 -rwxr-xr-x 1 root root 284 Jul 18 22:56 wheel3.10
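The listing shows airflow_invoker present and executable, so the file itself is not missing. One further thing worth checking is whether the interpreter named on the script's first line exists inside the image, since a launcher script that points at a missing interpreter can also fail to run. A rough Python sketch of that check follows; the function names are made up, and it assumes an ordinary #! first line.

```python
from pathlib import Path
from typing import Optional


def shebang_interpreter(script: Path) -> Optional[str]:
    """Return the interpreter path from the script's #! line, or None."""
    with script.open("rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].strip().split()
    return parts[0] if parts else None


def interpreter_exists(script: Path) -> bool:
    """Return True if the script's shebang interpreter exists on this system."""
    interpreter = shebang_interpreter(script)
    return interpreter is not None and Path(interpreter).exists()


if __name__ == "__main__":
    # Path taken from the ls command above, relative to the project directory.
    script = Path(".meltano/utilities/airflow/venv/bin/airflow_invoker")
    if script.exists():
        print(shebang_interpreter(script), interpreter_exists(script))
    else:
        print(f"{script} not found")
```

Run from the project directory inside the container, this prints the interpreter path the launcher expects and whether that path exists there.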
douwe_maan
07/18/2023, 11:36 PM
meltano install needs to happen inside the container, because otherwise it’ll link the executables to the Python executable on your host system rather than the one that exists inside the container.
matthew_van_zanten
07/18/2023, 11:37 PM
matthew_van_zanten
07/18/2023, 11:41 PM
matthew_van_zanten
07/18/2023, 11:41 PM
matthew_van_zanten
07/18/2023, 11:47 PM