# troubleshooting
m
We are using `target-redshift` and it's creating `VARCHAR(10000)` columns. I have a column that contains bigger data than that and want that column to be created as `VARCHAR(MAX)`. The target appears to support this (https://github.com/transferwise/pipelinewise-target-redshift/blob/master/target_redshift/db_sync.py#L17-L19), but I can't figure out how to configure that for a column, or whether it's possible to pass in that configuration from Meltano. Any ideas?
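[Editor's note] The linked `db_sync.py` lines suggest the target sizes each `VARCHAR` column from the JSON Schema `maxLength` of the property. A minimal sketch of that idea, with the caveat that the constants and function name below are illustrative assumptions, not the actual pipelinewise-target-redshift code:

```python
# Hedged sketch (NOT the real pipelinewise-target-redshift implementation):
# how a Redshift target can derive a VARCHAR size from a JSON Schema
# "maxLength" property. Constants here are assumptions for illustration.

DEFAULT_VARCHAR_LENGTH = 10000  # fallback when the schema has no maxLength
MAX_VARCHAR_LENGTH = 65535      # Redshift VARCHAR(MAX) is VARCHAR(65535)

def redshift_varchar(prop_schema: dict) -> str:
    """Pick a Redshift column type for a JSON Schema string property."""
    length = prop_schema.get("maxLength", DEFAULT_VARCHAR_LENGTH)
    if length >= MAX_VARCHAR_LENGTH:
        # Anything at or above the hard ceiling gets the widest column
        return "CHARACTER VARYING(MAX)"
    return f"CHARACTER VARYING({length})"

print(redshift_varchar({"type": ["string", "null"]}))
# → CHARACTER VARYING(10000)
print(redshift_varchar({"type": ["string", "null"], "maxLength": 1000000}))
# → CHARACTER VARYING(MAX)
```

This is why columns default to `VARCHAR(10000)` when no `maxLength` reaches the target: there is nothing in the incoming schema to size them from.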
m
right, but where do I set that?
a
Looking...
m
it seems like it’s expecting the varchar length to be on the Singer `SCHEMA` messages?
a
Yes, specifically via the JSON Schema spec for that column https://json-schema.org/understanding-json-schema/reference/string.html#length
What tap are you using?
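[Editor's note] For reference, a Singer `SCHEMA` message carrying that constraint might look like the following; the stream and property names are placeholders:

```json
{
  "type": "SCHEMA",
  "stream": "some_stream_id",
  "key_properties": [],
  "schema": {
    "type": "object",
    "properties": {
      "my_long_col": {
        "type": ["string", "null"],
        "maxLength": 1000000
      }
    }
  }
}
```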
m
I’m using a bunch of taps but the specific one that’s having this issue is tap-spreadsheets-anywhere
for JSONL data from AWS S3
a
Okay, thanks. So, I think you have two options... Option 1 would be to update tap-spreadsheets-anywhere to send something like a `maxLength` property in the JSON Schema of its discovered catalog (which then also gets sent in the `SCHEMA` message). Option 2 (probably easier, but might take some debugging) would be to override the data type using the `schema` config within your meltano.yml entry for that tap: https://docs.meltano.com/concepts/plugins#how-to-use-3
```yaml
extractors:
- name: tap-spreadsheets-anywhere
  schema:
    some_stream_id:
      my_long_col:
        type: ["string", "null"]
        maxLength: 99999
```
Something like this 👆 UPDATED to fix typos
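[Editor's note] If several streams or columns need the same override, the Meltano docs linked above describe the `schema` extra as also accepting glob-style `*` wildcards in stream and property keys. A hedged sketch with placeholder names:

```yaml
extractors:
- name: tap-spreadsheets-anywhere
  schema:
    "*":            # match every stream
      "*_json":     # match every property ending in _json
        type: ["string", "null"]
        maxLength: 99999
```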
m
ooh, I think I remember reading about that (option 2) but had forgotten about it
thanks, will try that!!
a
👍 Great! Let us know how it goes.
m
update: this works perfectly 🦜
```yaml
schema:
  platform_data:
    mongodocument:
      type:
        - null
        - string
      maxLength: 1000000
```
a
Great to hear! Thanks!