# troubleshooting
j
Has anyone run into the issue where their source pulls a field out of a column that is too large for the target? I’m trying to set up a pipeline from tap-postgres to target-snowflake. One of the columns in Postgres contains strings that are larger than Snowflake can ingest, as evidenced by this error:
```
snowflake.connector.errors.ProgrammingError: 100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is 113279832
```
I need that data, unfortunately, and fields of that size get created in the source relatively regularly. I could be satisfied with just a truncated version of those values, though. Does anyone know of a remedy short of doing some fancy modifications to the tap?
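To be concrete about what “truncated” could look like: I was picturing a small pass-through filter sitting between the tap and the target that clips oversized string fields in the Singer RECORD messages. Just a rough sketch, the filename and byte limit are placeholders and this isn’t a real tap or target option:

```python
#!/usr/bin/env python3
"""Hypothetical filter: tap-postgres | truncate_fields.py | target-snowflake.

Truncates string fields that would blow past Snowflake's 16 MB LOB limit.
Sketch only -- not an official tap-postgres or target-snowflake setting.
"""
import json
import sys

MAX_BYTES = 16_000_000  # placeholder ceiling, safely under 16,777,216


def truncate_record(record: dict) -> dict:
    """Clip any string value whose UTF-8 encoding exceeds MAX_BYTES."""
    for key, value in record.items():
        if isinstance(value, str) and len(value.encode("utf-8")) > MAX_BYTES:
            # Cut by bytes, then drop any partially split trailing character.
            record[key] = value.encode("utf-8")[:MAX_BYTES].decode("utf-8", "ignore")
    return record


def main() -> None:
    # Singer messages arrive as JSON lines on stdin; pass everything through,
    # rewriting only the RECORD payloads.
    for line in sys.stdin:
        message = json.loads(line)
        if message.get("type") == "RECORD":
            message["record"] = truncate_record(message.get("record", {}))
        sys.stdout.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```

Wired up as a plain Singer pipe it would be something like `tap-postgres --config tap.json | python truncate_fields.py | target-snowflake --config target.json`, though I haven’t tried slotting that into Meltano.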
p
I never tried it, but maybe you can modify the target to set that column to a VARIANT? According to the Snowflake docs, a VARIANT can store up to 16 MB compressed.
Alternatively, if it can be split somehow (e.g. by newlines) you could load it as different rows instead
j
I’ll have to take a look to make sure it’s bringing it in as a VARIANT (I think it is), but even then some of these are as large as 51 MB.
p
e.g. you could have your ETL split it into an array and declare it as an array of strings in your tap; that’ll load it as a new table with a row for each item
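Roughly what I mean, as a sketch of the record-shaping step (the stream and column names are placeholders, and this isn’t actual tap-postgres code):

```python
#!/usr/bin/env python3
"""Sketch: turn one huge text field into an array of lines so it can land
as multiple rows downstream. STREAM and FIELD are made-up placeholders."""
import json
import sys

STREAM = "public-my_table"   # placeholder stream name
FIELD = "big_text_column"    # placeholder column name


def main() -> None:
    for line in sys.stdin:
        message = json.loads(line)
        if message.get("type") == "SCHEMA" and message.get("stream") == STREAM:
            # Declare the field as an array of strings instead of one string.
            props = message["schema"]["properties"]
            if FIELD in props:
                props[FIELD] = {"type": ["array", "null"],
                                "items": {"type": "string"}}
        elif message.get("type") == "RECORD" and message.get("stream") == STREAM:
            value = message["record"].get(FIELD)
            if isinstance(value, str):
                # One array element per line of the original text.
                message["record"][FIELD] = value.splitlines()
        sys.stdout.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```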
j
The newline approach might be something I can look into! Better than all the other ideas I could think of.
p
You'll have to make your own fork of tap-postgres to do it, I think
j
You think it’s better to do it on the tap or the target? I suppose the tap would make sense; it would help me avoid setting `elt.buffer_size` to some absurdly high number.
p
It should be on the tap IMHO