Is anyone running meltano/airflow on heroku? I’m r...
# troubleshooting
n
Is anyone running meltano/airflow on heroku? I’m ready to deploy my first pipeline and am having some issues getting my procfile/environment right.
a
I haven't tried it on Heroku but I've used their platform in the past. What does your Procfile look like? Is it giving you any errors?
n
i’ve kinda borked my local through troubleshooting heheh so I’m blowing away my virtualenv and starting from scratch.
mostly errors that likely have to do with my poor understanding of where meltano ends and airflow begins
dependencies like psycopg2, airflow config variables, etc
basically i have a heroku postgres, a heroku redis, and am trying to run a procfile that looks like this:
web: cd meltano && meltano upgrade files && meltano install && meltano invoke airflow webserver
scheduler: cd meltano && meltano upgrade files && meltano install && meltano invoke airflow scheduler
a
Ah so `web` expects some sort of web server rather than a series of commands like that (even though the end result is a web server there) so it's probably upset about that. If you get rid of `web` does it still work with the `scheduler` process?
n
trying that now! sorry had to un-bork my environment after blowing it away.
ok back to getting
2022-04-27T20:08:38.288243+00:00 app[scheduler.1]: 2022-04-27T20:08:38.288106Z [error    ] Traceback (most recent call last):
2022-04-27T20:08:38.288250+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/bin/airflow", line 5, in <module>
2022-04-27T20:08:38.288251+00:00 app[scheduler.1]: from airflow.__main__ import main
2022-04-27T20:08:38.288252+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/__init__.py", line 46, in <module>
2022-04-27T20:08:38.288252+00:00 app[scheduler.1]: settings.initialize()
2022-04-27T20:08:38.288254+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/settings.py", line 447, in initialize
2022-04-27T20:08:38.288254+00:00 app[scheduler.1]: configure_orm()
2022-04-27T20:08:38.288255+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/settings.py", line 222, in configure_orm
2022-04-27T20:08:38.288255+00:00 app[scheduler.1]: engine = create_engine(SQL_ALCHEMY_CONN, connect_args=connect_args, **engine_args)
2022-04-27T20:08:38.288256+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/sqlalchemy/engine/__init__.py", line 525, in create_engine
2022-04-27T20:08:38.288256+00:00 app[scheduler.1]: return strategy.create(*args, **kwargs)
2022-04-27T20:08:38.288256+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
2022-04-27T20:08:38.288257+00:00 app[scheduler.1]: dbapi = dialect_cls.dbapi(**dbapi_args)
2022-04-27T20:08:38.288258+00:00 app[scheduler.1]: File "/app/meltano/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/sqlalchemy/dialects/postgresql/psycopg2.py", line 792, in dbapi
2022-04-27T20:08:38.288259+00:00 app[scheduler.1]: import psycopg2
2022-04-27T20:08:38.288259+00:00 app[scheduler.1]: ModuleNotFoundError: No module named 'psycopg2'
2022-04-27T20:08:38.288259+00:00 app[scheduler.1]: 
2022-04-27T20:08:38.289276+00:00 app[scheduler.1]: Command `airflow --help` failed
but this isn’t a heroku issue, i’ll try and sort this out on my local first
a
Yes! That's promising 😄
n
ok i think we’re in business! Turns out heroku didn’t hate me for that long webserver command, i just needed to properly install the airflow plugins like this: https://meltano.slack.com/archives/C01E3MUUB3J/p1635437273021600?thread_ts=1635437074.021500&cid=C01E3MUUB3J
a
Ah nice!
n
i’ve learned that the issue with web commands is that they have to bind to the heroku port within 60 seconds
so if my command is too long, web breaks
a
Yeah it's really finicky and kind of a hack lol
n
do you know of another way to do pre-work to set up the build?
i’m using the heroku python buildpack, which is actually pretty cool and installs your requirements for you. but i’m not an experienced heroku user so I’m struggling to find where I can run a command like `meltano install` that will actually run in my dyno environments.
Woof. So i pulled that `meltano install` step out to a custom buildpack (which i’d be happy to publish properly afterwards for any other heroku people out there), but now the binaries are all kinds of sideways =/ Basically Meltano is installing the virtual environments for its plugins into a temp directory that’s only available at build time. I think what i need to be able to do is specify the python path to meltano. Is this possible?
e
@nick_james I'm not too familiar with heroku deployments, but the installation directory of plugins isn't really temporary, rather the `.meltano/` directory (where plugins are installed) needs to be available after the build step. The python path you pointed to is within each plugin's venv, and those venvs need to stay separate to avoid any dependency conflicts
n
i think something is just being weird with the way heroku is trying to symlink things during the build step. here’s the airflow venv file:
cat /app/meltano/.meltano/orchestrators/airflow/venv/bin/airflow


#!/tmp/build_a52af35b/meltano/.meltano/orchestrators/airflow/venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from airflow.__main__ import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
Ok I have confirmed that if i update that hashbang in the airflow bin file to use the environment’s python path, things work again
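For anyone hitting the same thing, here is a rough sketch of how that shebang fix could be scripted instead of edited by hand. It is not something from this thread or from Meltano itself: `fix_shebangs` is a made-up helper name, and it assumes every stale script in the venv's `bin/` directory starts with a `#!...python` line that should point at the interpreter path you pass in.

```python
from pathlib import Path
from typing import List


def fix_shebangs(bin_dir: Path, new_python: str) -> List[Path]:
    """Point python shebangs in bin_dir at new_python (hypothetical helper).

    Returns the scripts that were rewritten. Symlinks (such as the venv's
    own python link) and files without a python shebang are left alone.
    """
    wanted = b"#!" + new_python.encode() + b"\n"
    changed = []
    for script in sorted(bin_dir.iterdir()):
        if script.is_symlink() or not script.is_file():
            continue
        with script.open("rb") as handle:
            first_line = handle.readline()
            rest = handle.read()
        # Only touch files whose first line is a python shebang.
        if not first_line.startswith(b"#!") or b"python" not in first_line:
            continue
        if first_line == wanted:
            continue  # already points at the right interpreter
        # Rewriting in place keeps the file's executable bit.
        script.write_bytes(wanted + rest)
        changed.append(script)
    return changed
```

Calling it with the bin directory shown above, e.g. `fix_shebangs(Path("/app/meltano/.meltano/orchestrators/airflow/venv/bin"), "/app/meltano/.meltano/orchestrators/airflow/venv/bin/python")`, would rewrite the `airflow` script's first line. The second argument is a guess at what "the environment's python path" means here, so adjust it to whatever interpreter actually works in the dyno. Since each build presumably recreates the venvs under a new `/tmp/build_...` path, this would likely need to run on every deploy rather than once.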
e
nice!
c
@nick_james does this break after you redeploy though? I've got this exact problem