# troubleshooting
d
I’m using the transferwise Snowflake target and I’m having issues with really large JSON blobs. I’m bringing data in from a Postgres table with a jsonb column where some entries are > 16MB, and the size limit for both the VARIANT and VARCHAR types in Snowflake is 16MB. Can anyone think of a workaround? There are only a few rows this large, so I’d be happy with truncating them or maybe even excluding those rows entirely. Is this something I could configure in Meltano, or is it something I’d need to figure out with the target?
a
Hi, @dean_morin. No perfect ideas come to mind. Options I can think of:
1. Fork the tap or target to make this fail gracefully.
2. Override the tap schema in the hope that a forced shorter data type length coerces the data to be gracefully truncated (see the sketch after this list). Unclear if this would be fatal; it depends on implementation details in the tap and target.
3. Deselect the column (also sketched below).
4. Use a stream map transformation. (Unfortunately this probably won't work yet, because we don't have the standalone mapper built, but it would work if either the tap or the target were SDK-based.)
5. Create a view in your upstream (along with some type of casting/truncating) and pull from the view instead of the raw table.
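For illustration, options 2 and 3 might look roughly like this in meltano.yml. The stream and column names here are made up, and whether a forced `maxLength` gets you graceful truncation or a hard failure depends on the target:
```yaml
# meltano.yml — sketch only; `public-my_table` and `big_json_col` are hypothetical
plugins:
  extractors:
  - name: tap-postgres
    # Option 2: override the discovered JSON schema for one property.
    # Whether the target truncates or errors on oversized values is
    # implementation-dependent.
    schema:
      public-my_table:
        big_json_col:
          type: ["string", "null"]
          maxLength: 1000000
    # Option 3: deselect the oversized column entirely.
    select:
    - "*.*"
    - "!public-my_table.big_json_col"
```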
d
That last one is a really good thought, probably the easiest way of going about it…
How do I know if a tap/target is SDK-based? Would those have been developed from the get-go using the SDK?
Oh, it’s a really big table though, so a view probably won’t work because I’d need to do a full extraction every time 🤔
Man, those stream maps look really useful. Is making them standalone on the near-term roadmap? Maybe I could wait for that…
a
You should be able to use the same incremental replication key and primary key behaviors on a view as on a table, although both have to be set manually in meltano.yml. Only log-based replication doesn't work on views, as far as I know.
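As a minimal sketch, assuming the view is exposed as a `public-my_view` stream with an `updated_at` timestamp and an `id` column (your names will differ):
```yaml
# meltano.yml — illustrative; a view has no discovered primary key,
# so both the replication key and the key properties are declared by hand
plugins:
  extractors:
  - name: tap-postgres
    metadata:
      public-my_view:
        replication-method: INCREMENTAL
        replication-key: updated_at
        table-key-properties: [id]
```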
And thanks for the feedback on Stream Maps! We're extremely excited to bring these to "regular" taps and targets. This feature is on the 1-3 month horizon. (Being intentionally non-committal so as not to overpromise. 😅) Cc @taylor, as we were just discussing prioritization earlier today.
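To give a feel for it, on an SDK-based tap or target today, stream maps are plain config. A rough sketch, where `tap-mysource` and the stream/column names are all hypothetical:
```yaml
# Illustrative stream_maps config as supported by SDK-based plugins;
# `customers`, `email`, and `is_test` are made-up names
plugins:
  extractors:
  - name: tap-mysource
    config:
      stream_maps:
        customers:
          email: null              # drop a PII column outright
          email_hash: md5(email)   # or replace it with a hash
          __filter__: not is_test  # keep only non-test rows
```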
To your last question, @taylor and @pat_nadolny just announced a new flag in #C01UGBSJNG5 that shows whether a tap or target is built on the SDK. (Although from memory, Postgres and Snowflake are not yet ported.)
d
Thanks! I think I tried incremental loading with a view a few months back, but probably didn’t set the keys as you mentioned. Yeah, the filtering in stream maps looks awesome, but so is the ability to strip out PII!