My `meltano.yml` is getting very large, is there a...
# best-practices
e
My `meltano.yml` is getting very large, is there a recommended practice for breaking it up into smaller files that can be referenced/imported?
h
You can split your Meltano plugin specification across multiple files. In the main file, use `include_paths` like below:
```yaml
include_paths:
  - "./plugin_definitions1/*.meltano.yml"
  - "./plugin_definitions2/*.meltano.yml"
```
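For context, each file matched by those globs is itself a `meltano.yml` fragment with the usual top-level keys. A minimal sketch of what one such file could contain (the path, plugin name, and columns here are hypothetical):

```yaml
# plugin_definitions1/tap-postgres.meltano.yml (hypothetical path)
plugins:
  extractors:
    - name: tap-postgres
      select:
        - public-users.id
        - public-users.email
```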
e
thanks!
it looks like there's currently no way to use `include_paths` within a plugin?
h
what do you mean?
e
for example, replacing the `select:` within my extractor with a reference to a file
e
Hi @Ellis Valentiner!
> replacing the `select:` within my extractor with a reference to a file

That last bit is not currently possible, but I'd like to explore the option of publishing a pkl module for programmatically building `meltano.yml`, so a user could split their project and reuse as much as they'd need.
h
Hmm. Could you help me understand the use case for looking up the `select` block from a different file?
e
We are replicating dozens of tables, most of which have dozens of columns. We can use the `table.*` syntax for some, but would prefer to declaratively list each column to be included/excluded. We don't want to include new columns automatically, for data privacy reasons, and we've encountered problems replicating when columns are added to the underlying tables. Listing each column leads to a very large `meltano.yml` file. Currently we have only 1 extractor and 1 loader configuration, but we expect that to change as well. We'd prefer to store this information in separate files (with some sort of directory structure) that could be imported and reused in different `select` blocks for different extractor/loader tasks.
h
I see. So would the ergonomic goal be the ability to specify each table in its own file?
e
Yes. I've been thinking about this, and one of the challenges we've had in developing our ELT is that adding new tables requires us to update the `stream_maps`, `select`, and `metadata` blocks of our extractor. I can see use cases where that would be desirable, but we'd rather have a single, centralized place to manage these.
I mean that we would like to model the entity/table we are replicating in a single place, so that the stream maps, selects, and metadata configuration are all stored together.
e
I think at this point your best option for that would be to use a proper programming language to generate your `meltano.yml`. In the long term, either a pkl module or a proper client (in Python or any language) generated from the JSON schema seems like the best solution, rather than adding complexity to Meltano's YAML parsing and resolution capabilities. Do log an issue if you'd like to see something like that, or even something else entirely 🙂
s
@Edgar Ramírez (Arch.dev) - when you say use a programming language to generate `meltano.yml`, do you mean split the file into extractor bases and merge them into one when running?
e
I mean using something like Python to generate the desired objects and serializing them to YAML.
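A minimal sketch of that generation approach, keeping each table's columns and replication settings in one place and deriving the extractor's `select` and `metadata` blocks from them. The table specs, stream names, and `tap-postgres` name are all made up; in a real project each spec would live in its own file (e.g. one YAML file per table) and PyYAML's `yaml.safe_dump` would write `meltano.yml` — `json` is used here only to keep the sketch stdlib-only:

```python
# Sketch: derive one extractor's select/metadata blocks from per-table specs
# so each table is modeled in a single place. Hypothetical names throughout.
import json

# Hypothetical per-table specs; in practice, one small file per table.
TABLES = {
    "public-users": {"columns": ["id", "email", "created_at"],
                     "replication_key": "created_at"},
    "public-orders": {"columns": ["id", "user_id", "total"],
                      "replication_key": None},
}

def build_extractor(name: str, tables: dict) -> dict:
    """Assemble a Meltano-style extractor dict from per-table specs."""
    select, metadata = [], {}
    for stream, spec in tables.items():
        # Explicitly list every column instead of relying on table.*
        select.extend(f"{stream}.{col}" for col in spec["columns"])
        meta = {"replication-method":
                "INCREMENTAL" if spec["replication_key"] else "FULL_TABLE"}
        if spec["replication_key"]:
            meta["replication-key"] = spec["replication_key"]
        metadata[stream] = meta
    return {"name": name, "select": select, "metadata": metadata}

config = {"plugins": {"extractors": [build_extractor("tap-postgres", TABLES)]}}
# In practice: yaml.safe_dump(config) -> meltano.yml
print(json.dumps(config, indent=2))
```

Adding a table then means adding one spec, and the `select`/`metadata` blocks stay in sync automatically.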
👍 1
m
It’s possible to use YAML anchors and aliases in `meltano.yml` files. I’ve used this to reduce duplicated config before (although updates made by Meltano to the YAML file will expand them all, which made it awkward to manage).
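A small sketch of that anchor/alias idea, sharing one `select` list between two extractor definitions (plugin names and columns are made up):

```yaml
plugins:
  extractors:
    - name: tap-postgres
      select: &base_select     # anchor: define the shared list once
        - public-users.id
        - public-users.email
    - name: tap-postgres--backfill
      inherit_from: tap-postgres
      select: *base_select     # alias: reuse the same list
```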
👍 1
e
fwiw I did get around to trying pkl for modularizing `meltano.yml`: https://github.com/edgarrmondragon/meltano-dogfood/tree/main/pkl.
ty 1
s
Exactly what I was trying to figure out.