Hello again I have a potential use case where I want Meltano Meltano #troubleshooting

joshua_janicas

12/10/2024, 4:51 PM

Hello again, I have a potential use case where I want Meltano (tap-mssql => target-snowflake) to establish the table/column definitions inside Snowflake, but I don't want to extract or load any data, nor update any state files. Is that possible through a command? (Kind of like BACPAC vs DACPAC in MSSQL, or `pg_dump` with `--schema-only` in Postgres.) Would that be something funky like setting the batch size to 0 so no rows are ever extracted?

👀 1

joshua_janicas

12/10/2024, 4:53 PM

I need to create an "empty shell" of our raw data for DBT to create models on with no data in it. Kind of like an EMPTY_DB

joshua_janicas

12/10/2024, 4:58 PM

@BuzzCutNorman pinging you in case you have any knowledge on how to do this with the tap

BuzzCutNorman

12/10/2024, 5:24 PM

I don't know of a way to do this with command or config. Honestly I think this would be a new SDK/tap feature. cc @Edgar Ramírez (Arch.dev)

BuzzCutNorman

12/10/2024, 7:03 PM

I need to create an "empty shell" of our raw data for DBT to create models

I am a novice DBT user, so this might be a "well, duh" question. How would the built-out tables assist in creating models? Is there a discovery command that populates the models for you? Just trying to better understand what the blank tables achieve.

joshua_janicas

12/10/2024, 7:07 PM

it's a weird intersection of our business intelligence software and how our upstream data is handled. we need to template reports with no data to give to our clients to publish. the rub is that each client is on a different version of our database which we import data from using tap-mssql and those versions may have schema differences. we can pin a client to a specific version of our elt pipeline but that results in potentially different tables (a column added, renamed...)

joshua_janicas

12/10/2024, 7:09 PM

the blank tables would allow us to update the templating of the data being ingested, as we can change the column definitions

BuzzCutNorman

12/10/2024, 7:52 PM

Just trying to make sure I have this straight: the BI software needs blank template reports that are based off DBT models generated from a discovery or run against the client database tables. I am guessing that is a gross simplification, but I am just trying to understand the flow.

BuzzCutNorman

12/10/2024, 7:54 PM

I am seeing this as: you create a blank table set based on the client database, then run DBT against it to create the tables the report needs. Those tables are blank, so you can create the template reports.

joshua_janicas

12/10/2024, 7:54 PM

Essentially that's the crux of it yes

BuzzCutNorman

12/10/2024, 8:09 PM

The schema messages that a target uses to create a table are generated when a stream class's `sync` method is run, which is kicked off by the tap class's `sync_all` method. `sync_all` is called when the tap class's `invoke` method is called. I think `invoke` is what `meltano invoke` and `meltano run` both start. I am guessing you are envisioning something like `meltano --environment prod invoke tap-mssql--client --schema-only`. That way you can script out the creation of the template reports, which I am guessing is kind of a half-manual, half-scripted workflow at the moment. Am I still on a correct train of thought?
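The call chain described above can be pictured with a small toy model. This is only a sketch: `ToyTap`, `ToyStream`, and the `schema_only` argument are hypothetical names made up for illustration and are not part of the Singer SDK or tap-mssql. It assumes the usual Singer ordering, where a stream emits a SCHEMA message before any RECORD or STATE messages.

```python
from typing import Iterable, Iterator


class ToyStream:
    """Stand-in for a Singer stream. NOT the singer_sdk Stream class."""

    def __init__(self, name: str, schema: dict, rows: Iterable[dict]) -> None:
        self.name = name
        self.schema = schema
        self.rows = list(rows)

    def sync(self, schema_only: bool = False) -> Iterator[dict]:
        # The SCHEMA message comes first; a target uses it to create the table.
        yield {
            "type": "SCHEMA",
            "stream": self.name,
            "schema": self.schema,
            "key_properties": [],
        }
        if schema_only:
            # Hypothetical switch: stop before any RECORD or STATE message.
            return
        for row in self.rows:
            yield {"type": "RECORD", "stream": self.name, "record": row}
        yield {"type": "STATE", "value": {"bookmarks": {self.name: {}}}}


class ToyTap:
    """Stand-in for a tap. NOT the singer_sdk Tap class."""

    def __init__(self, streams: Iterable[ToyStream]) -> None:
        self.streams = list(streams)

    def sync_all(self, schema_only: bool = False) -> Iterator[dict]:
        # The tap-level sync simply drives each stream's sync in turn.
        for stream in self.streams:
            yield from stream.sync(schema_only=schema_only)
```

With one stream holding two rows, `ToyTap.sync_all()` yields one SCHEMA, two RECORD, and one STATE message, while `sync_all(schema_only=True)` yields the SCHEMA message alone.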

joshua_janicas

12/10/2024, 8:12 PM

100% yeah, a `--schema-only` kind of flag would be what i'm looking for

BuzzCutNorman

12/10/2024, 8:53 PM

Looking at `Stream.sync()` and `Tap.sync_all()`, they are decorated with `@t.final`, so I cannot override them in `tap-mssql`. The next options for override would be `Stream.sync_batches` and `Stream.sync_records`. I will have to play around with this a little and see what comes of it. I am not sure off the top of my head how a target will deal with only schema messages.

❤️ 1

Edgar Ramírez (Arch.dev)

12/11/2024, 12:08 AM

By all means log an ~~issue~~ feature request in the SDK repo!

joshua_janicas

12/11/2024, 11:58 AM

I will do so this week. 😊 Also thanks for taking a deep dive Norman.

BuzzCutNorman

12/11/2024, 5:37 PM

A quick update: as a test I overwrote `Stream.sync()` in tap-mssql's `MSSQLStream` class to yield a blank dictionary `{}`, which allowed tap-mssql to run and only emit SCHEMA and STATE messages. I sent that to a target, and it happily accepted the schema and state messages only and created blank tables.
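A similar message stream can also be approximated without touching the tap, by filtering its output before it reaches the target. The sketch below is not what Norman did and is not a Meltano feature; `keep_schema_messages` and `main` are hypothetical helpers that assume the tap writes one JSON Singer message per line, each with a `type` field. By default only SCHEMA messages pass through, which also avoids forwarding state.

```python
import json
import sys
from typing import Iterable, Iterator

# Message types forwarded by default. Pass {"SCHEMA", "STATE"} to also
# forward bookmarks, as in the test described above.
DEFAULT_KEEP = frozenset({"SCHEMA"})


def keep_schema_messages(
    lines: Iterable[str], keep: Iterable[str] = DEFAULT_KEEP
) -> Iterator[str]:
    """Yield only the Singer messages whose type is in `keep`."""
    wanted = set(keep)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        if message.get("type") in wanted:
            yield json.dumps(message)


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read Singer messages from stdin and write the kept ones to stdout."""
    for out in keep_schema_messages(stdin):
        stdout.write(out + "\n")
```

Calling `main()` from a small script would let the filter sit in a shell pipe between a tap and a target; how to wire that into a Meltano pipeline is left open here.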

💯 1

🪄 1
