# troubleshooting
m
We are using `target-redshift` and it's creating `VARCHAR(10000)` columns. I have a column that contains bigger data than that and want that column to be created as `VARCHAR(MAX)`. The target appears to support this (https://github.com/transferwise/pipelinewise-target-redshift/blob/master/target_redshift/db_sync.py#L17-L19), but I can't figure out how to configure that for a column, or whether it's possible to pass in that configuration from Meltano. Any ideas?
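[Editor's note] The linked `db_sync.py` lines suggest the target sizes each `VARCHAR` column from the JSON Schema `maxLength` of the property. A minimal sketch of that idea, with the caveat that the constants and function name below are illustrative assumptions, not the actual pipelinewise-target-redshift code:

```python
# Hedged sketch (NOT the real pipelinewise-target-redshift implementation):
# how a Redshift target can derive a VARCHAR size from a JSON Schema
# "maxLength" property. Constants here are assumptions for illustration.

DEFAULT_VARCHAR_LENGTH = 10000  # fallback when the schema has no maxLength
MAX_VARCHAR_LENGTH = 65535      # Redshift VARCHAR(MAX) is VARCHAR(65535)

def redshift_varchar(prop_schema: dict) -> str:
    """Pick a Redshift column type for a JSON Schema string property."""
    length = prop_schema.get("maxLength", DEFAULT_VARCHAR_LENGTH)
    if length >= MAX_VARCHAR_LENGTH:
        # Anything at or above the hard ceiling gets the widest column
        return "CHARACTER VARYING(MAX)"
    return f"CHARACTER VARYING({length})"

print(redshift_varchar({"type": ["string", "null"]}))
# → CHARACTER VARYING(10000)
print(redshift_varchar({"type": ["string", "null"], "maxLength": 1000000}))
# → CHARACTER VARYING(MAX)
```

This is why columns default to `VARCHAR(10000)` when no `maxLength` reaches the target: there is nothing in the incoming schema to size them from.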
m
right, but where do I set that?
a
Looking...
m
it seems like it’s expecting the varchar length to be on the Singer `SCHEMA` messages?
a
Yes, specifically via the JSON Schema spec for that column https://json-schema.org/understanding-json-schema/reference/string.html#length
What tap are you using?
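[Editor's note] For reference, a Singer `SCHEMA` message carrying that constraint might look like the following; the stream and property names are placeholders:

```json
{
  "type": "SCHEMA",
  "stream": "some_stream_id",
  "key_properties": [],
  "schema": {
    "type": "object",
    "properties": {
      "my_long_col": {
        "type": ["string", "null"],
        "maxLength": 1000000
      }
    }
  }
}
```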
m
I’m using a bunch of taps but the specific one that’s having this issue is tap-spreadsheets-anywhere
for JSONL data from AWS S3
a
Okay, thanks. So, I think you have two options... Option 1 would be to update tap-spreadsheets-anywhere to send something like a `maxLength` property in the JSON Schema of its discovered catalog (which then also gets sent in the `SCHEMA` message). Option 2 (probably easier, but might take some debugging) would be to override the data type using the `schema` config within your meltano.yml entry for that tap: https://docs.meltano.com/concepts/plugins#how-to-use-3
```yaml
extractors:
- name: tap-spreadsheets-anywhere
  schema:
    some_stream_id:
      my_long_col:
        type: ["string", "null"]
        maxLength: 99999
```
Something like this 👆 UPDATED to fix typos
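[Editor's note] If several streams or columns need the same override, the Meltano docs linked above describe the `schema` extra as also accepting glob-style `*` wildcards in stream and property keys. A hedged sketch with placeholder names:

```yaml
extractors:
- name: tap-spreadsheets-anywhere
  schema:
    "*":            # match every stream
      "*_json":     # match every property ending in _json
        type: ["string", "null"]
        maxLength: 99999
```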
m
ooh, I think I remember reading about that (option 2) but had forgotten about it
thanks, will try that!!
a
👍 Great! Let us know how it goes.
m
update: this works perfectly 🦜
```yaml
schema:
  platform_data:
    mongodocument:
      type:
        - null
        - string
      maxLength: 1000000
```
a
Great to hear! Thanks!