Heya! Is there some way to tell the Meltano Labs `...
# getting-started
j
Heya! Is there some way to tell the Meltano Labs `target-snowflake` to batch, for example, 100k rows instead of 10k? I'm trying to replicate a table with 300M rows (which takes days to replicate with the pipelinewise variant). Is there also some way I can reduce the amount of log output with this variant? It is printing a ton 😞 Edit: the SDK version also seems not to emit state after each batch, which means that if the network goes out mid-run, e.g. at row 150M, it will start from scratch next time?
j
I would also be interested in an answer to this 👀
t
Today’s a holiday in the US but I’ll ping @pat_nadolny and @visch on this. I’m surprised that State isn’t emitted regularly. We have https://github.com/meltano/sdk/issues/1626 to track the batch size
v
For target-postgres, state does get updated as each batch happens (95 percent sure), so I think the SDK handles this, but maybe there's some edge case being hit here
j
hmm.. I'm using pipelinewise oracle-tap and MeltanoLabs (SDK) target-snowflake. Maybe it's the combination of the plugins.. Later today I can try to reproduce this with a different tap, maybe an SDK one as well 🤷
u
I'd want @edgar_ramirez_mondragon or @ken_payne to give their input too, but it looks like the SDK in fact does not emit state during a sync right now. To me that's also unexpected. I'm writing up an issue with my understanding of what's happening so we can discuss, I'll post it here shortly
u
@janis_puris I dug into this a bit and described what I found in this discussion https://github.com/meltano/sdk/discussions/1808. If anyone has additional thoughts or if I got anything wrong please add it to the discussion
u
The TLDR is that by default, every 5 mins the target should attempt to drain records and emit any state messages it received. Knowing that, it makes me wonder how often the tap is emitting state messages. If the tap doesn't regularly emit state messages then the target can't do much
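
To make that concrete, here's a rough sketch of the drain-then-emit-state behaviour. This is not the SDK's actual code: `SketchTarget`, `max_batch_rows`, `max_batch_age_s` and the injected `clock` are all made-up names for illustration, and the 10k / 300s defaults just mirror the numbers mentioned in this thread.

```python
import json
import time
from typing import Any, Callable, Dict, List, Optional


class SketchTarget:
    """Toy Singer-style target (hypothetical, NOT the Meltano SDK implementation)."""

    def __init__(
        self,
        max_batch_rows: int = 10_000,      # assumed default, per this thread
        max_batch_age_s: float = 300.0,    # "every 5 mins", per this thread
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_batch_rows = max_batch_rows
        self.max_batch_age_s = max_batch_age_s
        self.clock = clock
        self.buffer: List[Dict[str, Any]] = []
        self.loaded: List[Dict[str, Any]] = []      # stand-in for the warehouse
        self.pending_state: Optional[Dict[str, Any]] = None
        self.emitted_states: List[Dict[str, Any]] = []
        self.last_drain = clock()

    def process_line(self, line: str) -> None:
        """Handle one message from the tap, then drain if a threshold is hit."""
        msg = json.loads(line)
        if msg["type"] == "RECORD":
            self.buffer.append(msg["record"])
        elif msg["type"] == "STATE":
            # Only remember the latest state; it is not safe to emit it yet.
            self.pending_state = msg["value"]
        too_big = len(self.buffer) >= self.max_batch_rows
        too_old = self.clock() - self.last_drain >= self.max_batch_age_s
        if too_big or too_old:
            self.drain()

    def drain(self) -> None:
        """Write buffered rows, then emit the latest state the tap sent (if any)."""
        self.loaded.extend(self.buffer)
        self.buffer.clear()
        if self.pending_state is not None:
            self.emitted_states.append(self.pending_state)
            self.pending_state = None
        self.last_drain = self.clock()
```

The thing to notice is that `drain()` can only emit a state the tap has already sent. If the tap sends a single STATE message at the very end of a 300M row sync, there's nothing to resume from after a mid-run failure, no matter how often the target drains.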