For database taps/targets, should we allow the use...
# singer-tap-development
q
For database taps/targets, should we allow the user to set a sqlalchemy_url? Or should the the plugin assemble a URL from just user/pass/host/port components (or whatever the appropriate equivalent of those components is, e.g. snowflake account). And, directly related Q for the components, what about a driver+dialect? Came up here. Does the Meltano team have an official opinion on this?
e
driver+dialect usually imply installing extra dependencies, so I wouldn’t want to support it as config, at least not without a package extra and a clear error message when the extra deps required for the requested driver+dialect are not installed
h
The best argument I can see for allowing sqlalchemy connection strings as a config is that a lot of config can get tacked on as parameters to the end - and as far as I can tell, might work well with the tap/target blissfully unaware of it. The two examples that come to mind is authenticating to SQL Servers using managed identity (an overwhelmingly underdocumented feature), and connecting to oracle ADW which requires a “wallet” file. In both cases, this authentication is simply a parameter at the end of the connection string. Ideally we would like the tap/target developer to take accomodate this explicitly, but I don’t know if that is always realistic.
c
Some options and their implications: Option A: Don't support
sqlalchemy_url
and make sure that all taps and targets (including future unwritten ones) have first class support of all parameters that the dialect supports. This potentially leads to somewhat exponential growth of the combination of tap, target and dialect. Option B: Keep
sqlalchemy_url
as a supported setting and leave it up to the users to know where to look for parameter names that would be applicable for their chosen dialect. Option C: Agree on a minimum number of 'sane', common parameters that are needed to configure each dialect - i.e. these https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls NOTE: Not everything is configurable via a URL parameter in SQL Alchemy. E.g. to pass additional parameters to the DB-API driver's
connect()
function, SQL Alchemy Engines need to be passed a
connect_args
parameter on creation. Concrete example is Azure Synapse Dedicated SQL Pool (i.e. Microsoft Parallel Data Warehouse in the cloud) https://github.com/BuzzCutNorman/tap-mssql/pull/20 I think that a combination of Option B + C would be a good outcome.