Does anyone know where I can find the airflow.cfg ...
# random
g
Does anyone know where I can find the airflow.cfg file? I am trying to enable password authentication for the Airflow webserver on my Meltano project.
d
Meltano automatically generates
airflow.cfg
based on its own config for Airflow! You can set any Airflow config value directly using
meltano config airflow set
and in
meltano.yml
g
Amazing thank you! I'll try this out
Using the Meltano UI, how would one then set authentication to True, i.e. enable Airflow authentication in Meltano? I know how to do so by directly editing the files, but I want to follow the best practice for Meltano.
d
Airflow cannot currently be configured through Meltano UI
g
Alright, so my best bet seems to be locating the airflow.cfg file and editing it to enable authentication for the webserver
d
So
meltano config
and
meltano.yml
are the way to go. Once you have a value in
meltano.yml
, it will also show up in
meltano config airflow list
and you can override it using an env var
Editing the file directly won't work,
airflow.cfg
will always be regenerated based on Meltano's own understanding of the Airflow config 🙂
g
So what command would set authenticate to True (it is set to False in airflow.cfg)? I understand that airflow.cfg will be regenerated, but how will Airflow know to use authentication if airflow.cfg always has it set to False?
d
The Airflow setting is called
webserver.authenticate
, right? Try this:
meltano config airflow set webserver authenticate true
g
Will do! Thanks for the help
d
That'll add
webserver: authenticate: true
to
meltano.yml
, and next time you run
meltano invoke airflow ...
you'll see
airflow.cfg
automatically updated with the new setting
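As a rough illustration of the regeneration described above, here is a minimal Python sketch that renders a nested settings mapping into INI text of the kind found in `airflow.cfg`. It is not Meltano's actual implementation; the `render_cfg` helper and the `settings` dictionary are hypothetical names used only for this example.

```python
import configparser
import io


def render_cfg(config: dict) -> str:
    """Render a nested {section: {key: value}} mapping as INI text."""
    parser = configparser.ConfigParser()
    for section, values in config.items():
        # INI files hold text, so convert each value to a string.
        parser[section] = {key: str(value) for key, value in values.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical settings mirroring the `webserver` block in meltano.yml.
settings = {"webserver": {"authenticate": True}}
cfg_text = render_cfg(settings)
print(cfg_text)
```

The point of the sketch is that the file is an output of the settings, so hand edits to the file are lost the next time it is rendered.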
g
[2021-04-28 20:15:15,887] {configuration.py:376} WARNING - section/key [webserver/auth_backend] not found in config
Traceback (most recent call last):
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/__init__.py", line 61, in load_login
    auth_backend = conf.get('webserver', 'auth_backend')
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/configuration.py", line 380, in get
    raise AirflowConfigException(
airflow.exceptions.AirflowConfigException: section/key [webserver/auth_backend] not found in config

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/bin/airflow", line 37, in <module>
    args.func(args)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/utils/cli.py", line 81, in wrapper
    return f(*args, **kwargs)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/bin/cli.py", line 1179, in webserver
    app = cached_app_rbac(None) if settings.RBAC else cached_app(None)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/www/app.py", line 244, in cached_app
    app = create_app(config, testing)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/www/app.py", line 84, in create_app
    airflow.load_login()
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/__init__.py", line 62, in load_login
    except conf.AirflowConfigException:
AttributeError: 'AirflowConfigParser' object has no attribute 'AirflowConfigException'
After executing the command, I ran:
meltano invoke airflow webserver -p 5000 -D
And received the error above.
d
Sounds like Airflow wants you to also set
webserver.auth_backend
g
[2021-04-28 20:26:02,787] {__init__.py:76} CRITICAL - Cannot import authentication module airflow.contrib.auth.backends.password_auth. Please correct your authentication backend or disable authentication: No module named 'flask_bcrypt'
Traceback (most recent call last):
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/__init__.py", line 71, in load_login
    login = import_module(auth_backend)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/contrib/auth/backends/password_auth.py", line 29, in <module>
    from flask_bcrypt import generate_password_hash, check_password_hash
ModuleNotFoundError: No module named 'flask_bcrypt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/bin/airflow", line 37, in <module>
    args.func(args)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/utils/cli.py", line 81, in wrapper
    return f(*args, **kwargs)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/bin/cli.py", line 1179, in webserver
    app = cached_app_rbac(None) if settings.RBAC else cached_app(None)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/www/app.py", line 244, in cached_app
    app = create_app(config, testing)
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/www/app.py", line 84, in create_app
    airflow.load_login()
  File "/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/airflow/__init__.py", line 82, in load_login
    raise AirflowException("Failed to import authentication backend")
airflow.exceptions.AirflowException: Failed to import authentication backend
I ran the command:
meltano config airflow set webserver auth_backend airflow.contrib.auth.backends.password_auth
Which was following the link you sent. However it looks like it is failing to import the necessary modules.
d
@gunnar Which Airflow version are you using?
g
1.10.14 Ah, and I may need to update meltano as well. I will upgrade meltano now.
d
I think this answer applies: https://stackoverflow.com/a/63194431. You'll have to modify your
pip_url
for
airflow
in
meltano.yml
to include
[password]
And then
meltano install orchestrator airflow
again
g
I did receive the same error (also airflow seemed to be at the most up to date version already). So do you suggest I follow the second link and run the command:
pip install 'apache-airflow[password]'
d
Just running the command won't work, you'll want to update the pip_url in meltano.yml
g
Ah okay!
Current:
orchestrators:
  - name: airflow
    pip_url: apache-airflow==1.10.14 --constraint https://raw.githubusercontent.com/apache/airflow/constraints-1.10.14/constraints-3.6.txt
    config:
      webserver:
        authenticate: true
        auth_backend: airflow.contrib.auth.backends.password_auth
Updated:
orchestrators:
  - name: airflow
    pip_url: apache-airflow[password]==1.10.14 --constraint https://raw.githubusercontent.com/apache/airflow/constraints-1.10.14/constraints-3.6.txt
    config:
      webserver:
        authenticate: true
        auth_backend: airflow.contrib.auth.backends.password_auth
Does this look correct to you?
d
It does!
g
Only change is on the pip_url line
Awesome, I'll give it a shot!
hm.. I ran into the same error. Would I need to do any reinstallation after that change is made? I made that change and executed the command:
meltano invoke airflow webserver -p 5000 -D
Which resulted in the error
g
Resulted with no errors! However, no authentication after launching the webserver. I assume I may need to add a user again or something
d
Yeah, I suggest looking at the Airflow docs
At least the configuration should be good now
g
Will do! Per usual @douwe_maan saving me tons of time with your amazing help!
d
Happy to help! We already have an issue to document this: https://gitlab.com/meltano/meltano/-/issues/2261
g
Last thing @douwe_maan Do you have any suggestions for creating a user. I am following the airflow documentation and I successfully receive the output:
Admin user Gunnar created.
However when I list the users there are still no users and no authentication on the webserver. One of the outputs from meltano is:
[2021-04-28 21:55:07,925] {manager.py:710} WARNING - No user yet created, use flask fab command to do it.
This warning appears right after I execute Airflow's create user command; after it, I still receive the "Admin user created" output.
d
Hmm, and you created the user using
meltano invoke airflow ...
as well?
g
Yes, I used this command to create the user:
meltano invoke airflow users create ...
and to list users:
meltano invoke airflow users list
d
OK that looks right
And that second command doesn't list your new user?
g
No it just lists an empty table, even after receiving output that the Admin user was created.
d
That's very odd. Airflow should store the user (and everything else) in the SQLite DB at
.meltano/orchestrators/airflow/airflow.db
.
If you look at the generated airflow.cfg you found, does it list that path for the
sql_alchemy_conn
setting?
g
Yes it lists a path for the variable
sql_alchemy_conn
/.meltano/run/airflow/airflow.db
d
Hmm, I'd expect that to say
orchestrators
where it says
run
. Can you run
meltano config airflow
for me and share the result?
g
Here it is:
{
  "core": {
    "dags_folder": "/home/ubuntu/meltano-projects/dev-klaviyo/orchestrate/dags",
    "plugins_folder": "/home/ubuntu/meltano-projects/dev-klaviyo/orchestrate/plugins",
    "sql_alchemy_conn": "sqlite:////home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/airflow.db",
    "load_examples": "False",
    "dags_are_paused_at_creation": "False"
  },
  "webserver": {
    "authenticate": "True",
    "auth_backend": "airflow.contrib.auth.backends.password_auth"
  }
}
d
And what's the path of the
airflow.cfg
you're looking at?
g
/.meltano/run/airflow/airflow.cfg
d
What Meltano version are you on?
meltano --version
will show you
g
meltano, version 1.72.0
d
Hmm ok, that's the latest
Is there a
.meltano/orchestrators/airflow/airflow.db
? Or only
.meltano/run/airflow/airflow.db
?
g
.meltano/run/airflow content:
.meltano/orchestrators/airflow content:
d
OK, so the issue appears to be that for some reason
airflow.cfg
ends up with
/.meltano/run/airflow/airflow.db
instead of the correct
/.meltano/orchestrators/airflow/airflow.db
Can you run
meltano --log-level=debug invoke airflow version
and share the log lines starting with
Updated section
?
There's no line that says
Updated section [core]
?
g
[2021-04-28 22:16:44,596] [1319722|MainThread|root] [DEBUG]     Updated section [core] with {'dags_folder': '/home/ubuntu/meltano-projects/dev-klaviyo/orchestrate/dags', 'plugins_folder': '/home/ubuntu/meltano-projects/dev-klaviyo/orchestrate/plugins', 'sql_alchemy_conn': 'sqlite:////home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/airflow.db', 'load_examples': 'False', 'dags_are_paused_at_creation': 'False'}
Ah, I was sending updated section webserver
d
OK, so that looks right
Do you see the line
Saved '<path to config>'
?
g
[2021-04-28 22:16:44,597] [1319722|MainThread|root] [DEBUG] Saved '/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/run/airflow/airflow.cfg'
d
And yet, if you open that file,
sql_alchemy_conn
does not match
sqlite:////home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/airflow.db
?
Do you see a log line starting with
Deleted configuration at
?
g
Now in
.meltano/run/airflow
There is no airflow.cfg
Yes
d
OK, that makes sense then: it's expected to delete the configuration after a successful run
Can you try
meltano invoke airflow config get-value core sql_alchemy_conn
?
g
Yes I will run that now
d
Ah OK,
get-value
doesn't exist yet in that version of Airflow. Try
meltano invoke airflow config list | grep sql_alchemy_conn
then
g
sql_alchemy_conn = sqlite:////home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/airflow.db
d
Ok so now it's set correctly! Can you try adding the user and then listing users another time?
g
I just added the user and listed the users again, and unfortunately got the exact same output from both commands: "Successfully created user..." followed by an empty table.
d
🤯
Can you connect manually with
/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/airflow.db
and see if there's any data in there?
Any tool capable of opening a SQLite database should do
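For example, Python's built-in `sqlite3` module is one such tool. The following is a minimal sketch that lists every table in a SQLite file together with its row count; `table_row_counts` is a hypothetical helper written for this illustration, not part of Meltano or Airflow.

```python
import sqlite3


def table_row_counts(db_path: str) -> dict:
    """Return {table_name: row_count} for every table in a SQLite file."""
    connection = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = connection.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        connection.close()
```

Calling `table_row_counts` with the path to `airflow.db` and looking at the entry for the `users` table would show whether any user rows were actually written. Note that `sqlite3.connect` creates an empty file if the path does not exist, so double-check the path first.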
g
Sure I will do this right now
These are the tables in the database
d
And I'm guessing
users
is going to be empty?
g
Yes it is, unfortunately
d
That's so weird. I wonder where
meltano invoke airflow users create
is creating the user if not there 😕
g
I just ran the users create command again and this is the output before it asks for a password:
/home/ubuntu/meltano-projects/dev-klaviyo/.meltano/orchestrators/airflow/venv/lib/python3.8/site-packages/flask_sqlalchemy/__init__.py:812: UserWarning: Neither SQLALCHEMY_DATABASE_URI nor SQLALCHEMY_BINDS is set. Defaulting SQLALCHEMY_DATABASE_URI to "sqlite:///:memory:".
  warnings.warn(
[2021-04-29 17:05:12,045] {manager.py:710} WARNING - No user yet created, use flask fab command to do it.
[2021-04-29 17:05:12,708] {__init__.py:50} INFO - Using executor SequentialExecutor
[2021-04-29 17:05:12,709] {dagbag.py:417} INFO - Filling up the DagBag from /home/ubuntu/meltano-projects/dev-klaviyo/orchestrate/dags
d
UserWarning: Neither SQLALCHEMY_DATABASE_URI nor SQLALCHEMY_BINDS is set. Defaulting SQLALCHEMY_DATABASE_URI to "sqlite:///:memory:".
This is potentially relevant..
g
And then, after I input the password when prompted, I still receive the "user created" success message.
Yeah... I think so too
d
Right, because it may just be storing it in memory, not an actual SQLite DB
Super weird
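To illustrate why an in-memory SQLite database would explain the symptom, here is a small standalone Python sketch. It only demonstrates general SQLite behaviour and does not confirm that this is what Airflow does here; the `create_user` and `count_users` helpers and the one-column `users` table are assumptions made up for the example.

```python
import os
import sqlite3
import tempfile


def create_user(uri: str, username: str) -> None:
    """Open a connection, create a minimal users table if needed, add a row."""
    connection = sqlite3.connect(uri)
    connection.execute("CREATE TABLE IF NOT EXISTS users (username TEXT)")
    connection.execute("INSERT INTO users (username) VALUES (?)", (username,))
    connection.commit()
    connection.close()


def count_users(uri: str) -> int:
    """Open a fresh connection and count rows; 0 if the table does not exist."""
    connection = sqlite3.connect(uri)
    try:
        return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    except sqlite3.OperationalError:
        return 0
    finally:
        connection.close()


# In-memory database: each connection gets its own private, empty database.
create_user(":memory:", "Gunnar")
in_memory_count = count_users(":memory:")

# File-backed database: the row survives across connections.
db_file = os.path.join(tempfile.mkdtemp(), "example.db")
create_user(db_file, "Gunnar")
on_disk_count = count_users(db_file)

print(in_memory_count, on_disk_count)
```

A "create" command that reports success against an in-memory database, followed by a separate "list" command that sees nothing, matches this pattern.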
g
Yes agreed
d
Let's see how that relates to Meltano
Let's try
meltano config airflow set webserver rbac true
g
Alright I just executed that command
d
OK, let's try creating the user again
Maybe
airflow users create
doesn't work at all without
rbac
, note that the docs for https://airflow.apache.org/docs/apache-airflow/1.10.12/security.html?highlight=rbac#password say that the user should be created directly using python code
g
Aha! the user now shows up in the DB when running the command:
meltano invoke airflow users list
d
Awesome!
g
hmm.. however even after I relaunched the UI running the command:
meltano invoke airflow webserver -p 5000 -D
The user does not show up in the webserver users list and there is still no authentication.
d
wtf 😞
But the row does show up in the
users
table in that SQLite DB?
This is getting pretty deep into Airflow territory now rather than Meltano, so I'm not sure how to help much more
The only things Meltano "changes" about Airflow are that you have to use Meltano's own config instead of
airflow.cfg
, and that you have to invoke it through
meltano invoke airflow
. Beyond that everything you're seeing is Airflow behavior, and Airflow being confusing/buggy
So I'd suggest researching this some more from the Airflow perspective, just keeping in mind the Meltano config and invoke bits
g
Alright that sounds good. Thanks for the help!
@mohsin Just tagging you in this so you can check out the thread. It's pretty interesting and will give you a good idea of how Airflow authentication works with Meltano. It's still not completely resolved, but we resolved a ton of errors in the process. @edward_ryan Just in case you want to check it out as well.
e
@connor_flynn and @jo_pearson I think this thread gives great context on our Airflow issues