Heya Is there some way to tell the Meltano Labs `target snow Meltano #getting-started

janis_puris

07/03/2023, 7:05 PM

Heya! Is there some way to tell the Meltano Labs `target-snowflake` to batch, for example, 100k rows instead of 10k? I'm trying to replicate a table with 300M entries (which takes days to replicate with the pipelinewise variant). Is there also some way I can reduce the amount of log output with this variant? It is printing... a ton 😞 Edit: Also, the SDK version seems not to emit state after each batch, which means that if the network goes out mid-run, e.g. at row 150M, it will start from scratch next time?

joshua_janicas

07/04/2023, 12:34 PM

I would also be interested in an answer to this 👀

taylor

07/04/2023, 1:45 PM

Today’s a holiday in the US but I’ll ping @pat_nadolny and @visch on this. I’m surprised that State isn’t emitted regularly. We have https://github.com/meltano/sdk/issues/1626 to track the batch size

visch

07/05/2023, 12:46 AM

For target-postgres, state does get updated as each batch happens (95 percent sure), so I think the SDK handles this, but maybe there's some edge case being hit here

janis_puris

07/05/2023, 7:15 AM

Hmm... I'm using the pipelinewise oracle-tap and the MeltanoLabs (SDK) target-snowflake. Maybe it's the combination of the plugins. Later today I can try to reproduce this with a different tap, maybe an SDK one as well 🤷

user

07/05/2023, 2:36 PM

I'd want @edgar_ramirez_mondragon or @ken_payne to give their input too ~~but it looks like the SDK in fact does not emit state during a sync right now, to me that's also unexpected~~. I'm writing up an issue with my understanding of what's happening so we can discuss. I'll post it here shortly

user

07/05/2023, 3:05 PM

@janis_puris I dug into this a bit and described what I found in this discussion: https://github.com/meltano/sdk/discussions/1808. If anyone has additional thoughts, or if I got anything wrong, please add it to the discussion

user

07/05/2023, 3:07 PM

The TL;DR is that, by default, every 5 minutes the target should attempt to drain records and emit any state messages it has received. Knowing that, it makes me wonder how often the tap is emitting state messages. If the tap doesn't regularly emit state messages, then the target can't do much
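
To make the TL;DR above concrete, here is a rough toy model of that behaviour. It is not the SDK's actual code: `ToyTarget`, `FakeClock`, `run`, and the one-record-per-second pacing are all hypothetical names and assumptions made for illustration. The only thing taken from the thread is the idea that the target checks a roughly 5 minute record age when it receives a STATE message, drains, and then emits the latest state it has seen.

```python
import json
from typing import List, Optional


class FakeClock:
    """Hypothetical manual clock so the sketch runs instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


class ToyTarget:
    """Toy Singer-style target (NOT the SDK implementation)."""

    def __init__(self, clock: FakeClock, max_age_s: float = 300.0) -> None:
        self.clock = clock
        self.max_age_s = max_age_s  # assumed 5 minute default from the thread
        self.buffer: List[dict] = []
        self.latest_state: Optional[dict] = None
        self.emitted_states: List[dict] = []
        self.rows_written = 0
        self.last_drain = clock()

    def handle(self, line: str) -> None:
        msg = json.loads(line)
        if msg["type"] == "RECORD":
            self.buffer.append(msg["record"])
        elif msg["type"] == "STATE":
            # Remember the state; only drain if the buffer is old enough.
            self.latest_state = msg["value"]
            if self.clock() - self.last_drain >= self.max_age_s:
                self.drain()

    def drain(self) -> None:
        # "Write" buffered rows, then emit the state that covers them.
        self.rows_written += len(self.buffer)
        self.buffer.clear()
        if self.latest_state is not None:
            self.emitted_states.append(self.latest_state)
            self.latest_state = None
        self.last_drain = self.clock()


def run(state_every: Optional[int], total: int = 1000) -> ToyTarget:
    """Feed `total` records, with a STATE message every `state_every` rows."""
    clock = FakeClock()
    target = ToyTarget(clock)
    for i in range(1, total + 1):
        clock.now += 1.0  # assumption: one record arrives per second
        target.handle(json.dumps({"type": "RECORD", "record": {"id": i}}))
        if state_every and i % state_every == 0:
            target.handle(json.dumps({"type": "STATE", "value": {"last_id": i}}))
    return target
```

In this toy, `run(100)` emits a bookmark roughly every 300 simulated seconds, while `run(None)` (a tap that never sends STATE) emits nothing at all mid-sync, which is the situation where an interrupted run would have to start from scratch.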
