# troubleshooting
a
Another question about the MeltanoLabs variant of `target-snowflake` — I recently switched from the transferwise variant, and am getting massively decreased performance. A little about the job — it is an upsert on an `updated_at` key for a table that, for every run of Meltano, adds or updates around 1.5 million rows (it's one of many tables in the config, but is by far the long pole in the tent).
• In the transferwise variant, with `batch_size_rows` set to 500000, I'd typically see three merge statements, each taking about 8 minutes, for a total of 24 minutes.
  ◦ I could look at the query execution and see that the number of rows inserted/updated added up to around 500k each time, reflecting the batch output.
• Before adding some auto-clustering to the table, the meltanolabs variant, with `batch_size_rows` still set to 500000, results in hundreds of merge statements, each of exactly 10,000 rows.
  ◦ Each of these took about 2 minutes, so the total is anywhere from 260-300 minutes, a massive increase in processing time on the server (and in cost, in Snowflake).
  ◦ This makes me think the `batch_size_rows` parameter is somehow being ignored, but I'm not sure how or why.
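A quick sanity check of the batch arithmetic, using only the numbers reported above (1.5M rows per run, a configured batch of 500,000, and the 10,000-row batches actually observed):

```python
# Back-of-envelope check of the merge-statement counts described above.
rows_per_run = 1_500_000
configured_batch = 500_000  # batch_size_rows as configured
observed_batch = 10_000     # batch size actually seen in Snowflake query history

merges_expected = rows_per_run // configured_batch
merges_observed = rows_per_run // observed_batch

print(merges_expected)  # 3 merge statements, as seen with the transferwise variant
print(merges_observed)  # 150 merge statements, i.e. "hundreds" per run
```

So if the configured batch size were honored, three merges per run would be expected; at 10,000 rows per batch, the count balloons by 50x, matching the observed slowdown.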
For environment context, two things to note:
1. I did not change taps; in both cases I was using the MeltanoLabs variant of `tap-postgres`.
2. I'm using the Meltano-hosted Docker image `meltano/meltano:v3.5.4-python3.12`, and my target version is `target-snowflake v0.15.1.post5+20e07d0, Meltano SDK v0.44.3`.
e
Hi Anthony! 10,000 is indeed the default, and `batch_size_rows` is seemingly being ignored. How are you configuring this for the target?
a
Here's the relevant section of the config, private stuff redacted. The command is `meltano run tap-source target-snowflake-trg1`:
  - name: target-snowflake
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/target-snowflake
    config:
      account: XXXXX
      database: XXXXX
      user: XXXXX
      private_key_path: XXXXXX
      warehouse: XXXXX
      role: XXXXX
      add_record_metadata: true
      batch_size_rows: 500000
      load_method: upsert
  - name: target-snowflake-trg1
    inherit_from: target-snowflake
    config:
      default_target_schema: XXXXX
I'm going to try setting it in the child loader that inherits. (Note: that didn't change the effect.)
Hey Edgar! Any other thoughts here? Should I just drop a bug report in the git issues?
b
I found this thread in Meltano #singer-targets in which someone resolved the batch size issue. It is older, but it might still work.
a
Nice, thanks, I’ll take a look at this!
Okay, @BuzzCutNorman — great find! So, I can get meltano to respect that batch size with the following update to the config:
  - name: target-snowflake
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/target-snowflake
    config:
      account: XXXXX
      database: XXXXX
      user: XXXXX
      private_key_path: XXXXXX
      warehouse: XXXXX
      role: XXXXX
      add_record_metadata: true
      batch_size_rows: 500000
      #----# BEGIN NEW #----#
      batch_config:
        encoding:
          format: jsonl
          compression: gzip
        storage:
          root: file://
        batch_size: 500000
      #----# END NEW #----#
      load_method: upsert
  - name: target-snowflake-trg1
    inherit_from: target-snowflake
    config:
      default_target_schema: XXXXX
You do have to explicitly set the encoding and storage, too, but just using the defaults is fine.
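For what it's worth, the behavior in this thread is consistent with a setting precedence where `batch_config.batch_size` takes priority and a bare `batch_size_rows` fell back to the 10,000-row default. A hypothetical sketch of that pre-fix behavior (this is not the actual SDK code, just a model of what was observed):

```python
# Hypothetical model (NOT the real Meltano SDK implementation) of the
# batch-size resolution this thread observed before the fix:
#   - batch_config.batch_size, when present, is honored
#   - batch_size_rows on its own appeared to be ignored
#   - otherwise the 10,000-row default applies
DEFAULT_BATCH_SIZE = 10_000

def effective_batch_size(config: dict) -> int:
    batch_config = config.get("batch_config") or {}
    if "batch_size" in batch_config:
        return batch_config["batch_size"]
    # Pre-fix behavior: batch_size_rows never consulted, default wins.
    return DEFAULT_BATCH_SIZE

print(effective_batch_size({"batch_size_rows": 500_000}))                # 10000
print(effective_batch_size({"batch_config": {"batch_size": 500_000}}))   # 500000
```

This is why adding the `batch_config` block (with its required `encoding` and `storage` keys) made the 500,000-row batches take effect even though `batch_size_rows` was already set.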
b
That is great news that config combo worked 🥳. Thanks for clarifying that you need to set the encoding and storage as well.
e
Thanks for sharing that @BuzzCutNorman! I opened https://github.com/MeltanoLabs/target-snowflake/pull/340 to fix the behavior.
m
Just chiming in to say that this week we updated to the latest target-snowflake and can confirm that `batch_size_rows` is being respected 🙂