# troubleshooting
j
Has anyone run into the issue where their source pulls a field out of a column that is too large for the target? I’m trying to set up a pipeline from tap-postgres to target-snowflake. One of the columns in Postgres contains strings that are larger than Snowflake can ingest, as evidenced by this error:
```
snowflake.connector.errors.ProgrammingError: 100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is 113279832
```
I need that data, unfortunately, and fields of that size get created in the source relatively regularly. I could be satisfied with just a truncated version of those values, though. Does anyone know of a remedy short of doing some fancy modifications to the tap?
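To be concrete about what “truncated” could look like: I was picturing a small pass-through filter sitting between the tap and the target that clips oversized string fields in the Singer RECORD messages. Just a rough sketch, the filename and byte limit are placeholders and this isn’t a real tap or target option:

```python
#!/usr/bin/env python3
"""Hypothetical filter: tap-postgres | truncate_fields.py | target-snowflake.

Truncates string fields that would blow past Snowflake's 16 MB LOB limit.
Sketch only -- not an official tap-postgres or target-snowflake setting.
"""
import json
import sys

MAX_BYTES = 16_000_000  # placeholder ceiling, safely under 16,777,216


def truncate_record(record: dict) -> dict:
    """Clip any string value whose UTF-8 encoding exceeds MAX_BYTES."""
    for key, value in record.items():
        if isinstance(value, str) and len(value.encode("utf-8")) > MAX_BYTES:
            # Cut by bytes, then drop any partially split trailing character.
            record[key] = value.encode("utf-8")[:MAX_BYTES].decode("utf-8", "ignore")
    return record


def main() -> None:
    # Singer messages arrive as JSON lines on stdin; pass everything through,
    # rewriting only the RECORD payloads.
    for line in sys.stdin:
        message = json.loads(line)
        if message.get("type") == "RECORD":
            message["record"] = truncate_record(message.get("record", {}))
        sys.stdout.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```

Wired up as a plain Singer pipe it would be something like `tap-postgres --config tap.json | python truncate_fields.py | target-snowflake --config target.json`, though I haven’t tried slotting that into Meltano.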
p
I never tried it, but maybe you can modify the target to set that column to a VARIANT? According to the Snowflake docs, a VARIANT can store up to 16 MB compressed.
Alternatively, if it can be split somehow (e.g. by newlines) you could load it as different rows instead
j
I’ll have to take a look to make sure it’s bringing it in as a VARIANT (I think it is), but even then some of these are as large as 51 MB.
p
e.g. you could have your ETL split it into an array and declare it as an array of strings in your tap; that’ll load it as a new table with a row for each item
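Roughly what I mean, as a sketch of the record-shaping step (the stream and column names are placeholders, and this isn’t actual tap-postgres code):

```python
#!/usr/bin/env python3
"""Sketch: turn one huge text field into an array of lines so it can land
as multiple rows downstream. STREAM and FIELD are made-up placeholders."""
import json
import sys

STREAM = "public-my_table"   # placeholder stream name
FIELD = "big_text_column"    # placeholder column name


def main() -> None:
    for line in sys.stdin:
        message = json.loads(line)
        if message.get("type") == "SCHEMA" and message.get("stream") == STREAM:
            # Declare the field as an array of strings instead of one string.
            props = message["schema"]["properties"]
            if FIELD in props:
                props[FIELD] = {"type": ["array", "null"],
                                "items": {"type": "string"}}
        elif message.get("type") == "RECORD" and message.get("stream") == STREAM:
            value = message["record"].get(FIELD)
            if isinstance(value, str):
                # One array element per line of the original text.
                message["record"][FIELD] = value.splitlines()
        sys.stdout.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```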
j
The newline approach might be something I can look into! Better than all the other ideas I could think of.
p
You'll have to make your own fork of tap-postgres to do it, I think
j
You think it’s better to do it on the tap or the target? I suppose the tap would make sense; it would help me avoid setting `elt.buffer_size` to some absurdly high number.
p
It should be on the tap IMHO