Ref my question about column comments today, I ske...
# singer-target-development
h
Ref my question about column comments today, I sketched out a small PoC for doing this in my target-oracle connector: https://github.com/radbrt/target-oracle/blob/8972c1e96478320ea1126a7eb401aed9ebe66500/target_oracle/sinks.py#L205-L215 Some notes: • Googling was pretty messy, but I concluded that SQL Server doesn’t support column comments in any meaningful way. Hence, I switched to testing on target-oracle. The exact same code should work on target-snowflake, and it doesn’t seem to fail on SQL Server - it just ignores it. • It doesn’t take a lot to add column comments when a table is being created, but adding column comments for new columns is a little more involved because the methods that create the new column only takes the column name and column type as argument, not the entire schema. Tests on the branch are passing, although the check that the column comment has been added is manual so far. https://github.com/radbrt/target-oracle/tree/col_comments
a
It doesn’t take a lot to add column comments when a table is being created, but adding column comments for new columns is a little more involved because the methods that create the new column only takes the column name and column type as argument, not the entire schema.
Valuable info! Thanks for sharing this back!
h
It won’t take an unreasonable amount of coding to get it working when adding columns either, but but there are probably going to be some design choices there and it’s definitely something the sdk should take into consideration. I’ll sketch something out over the weekend.
a
Very cool! And for cases where the database doesn't support column descriptions, does it cause any breakage or is that part just a no-op if not supported?
h
OK, so a little summary. I have added functionality in target-oracle to add column comments on new columns (still the same branch as above). In SQLConnector class, the method
prepare_table
must pass the variable
property_def
into prepare_column which must pass it into
_create_empty_column
Sqlalchemy isn’t very helpful with adding columns, and therefore not with adding column comments either. Developers of each tap will have to figure this out themselves. But it isn’t hard, and the SDK’s logic doesn’t have to bother with column comments. It might be natural to do something similar on alter column statements, which I think would follow the same pattern.
a
Great updates. Thanks! I reviewed the code from your branch link above. It looks like the create_empty_table() implementation is nicely generic. Would the
Column()
constructor similarly accept
comment=...
in the
_create_empty_column()
implementation here by chance? (I'm guessing no, but just want to confirm.)
h
It does indeed, but it does not seem to have any effect. In Oracle, you can’t tack a comment on to an add column statement, so it would have to be its own thing anyways. Snowflake accepts the comment in the same statement, but sqlalchemy still doesn’t seem to render it. Still, nothing fails by adding it.
a
Got it - thanks for explaining! Makes sense that Oracle just doesn't have a syntax for including it inline during "ADD COLUMN", and the maintainers of the Oracle DBAPI/SQLAlchemy driver didn't declare as a separate SQL 'execute' command (for understandable reasons).
Still, nothing fails by adding it.
Nice. In a base SDK implementation then, I think if column comments are enabled in the tap, we might as well send them in the constructor just in case the implementation is able to handle it automatically. And then developers can additionally provide custom implementations if need+interest is there to improve the implementation.
h
I think the small SDK change to pass
property_def
through to
_prepare_empty_column
is a nice first step, but we did mention something about default behaviour and particularly @visch didn’t want to auto-populate column comments from his description field. Not entirely sure what a good pattern for defaults should look like. I often come back to if-else’ing huge blocks, which probably isn’t the most elegant way.
v
Description field probably works for almost everyone as it's not used. I just hesitate using something someone else may be using. An easy fix/hacky thing is to do something like add a field called
_sdc_column_comments
and have it be an
object
it'd be an extension that we'd be defining. Description just fits so well that I can see using it here and telling anyone that doesn't use description properly to go make their own field like I'm describing
OpenAPI spec seems to use the description field as you describe, a real world examples like https://github.com/docusign/OpenAPI-Specifications/blob/master/admin.rest.swagger-v2.json#L32 seem to agree so why not?