# troubleshooting
e
Hi everyone! I'm setting Meltano up for the first time to do some Postgres -> Snowflake ELT. The first table is bailing because we have a string that ends in a backslash, which is escaping the field-delimiting comma and misaligning the columns... The line looks like this:
```
2021-04-21 21:01:08.8029+00:00,\\N,Missing videos/pictures. \,1619038896,2021-04-21 21:01:36.6531+00:00,\\N,aderg,584960088,1619034428704,\\N
```
Do I have to manually escape characters?
The error is:
```
target-snowflake | target_snowflake.exceptions.SnowflakeError: ('Exception writing records', 100038 (22018): Numeric value '2021-04-21 21:01:36.6531+00:00' is not recognized
```
I'm kind of shocked to see no escaping at all in target-snowflake's CSVs...
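As an illustration (a minimal Python sketch, not target-snowflake's actual code), quoting every field sidesteps the problem entirely: once a value is enclosed in double quotes, a trailing backslash is just data and can't escape the delimiter.

```python
import csv
import io

# Hypothetical row reproducing the problem: the third value ends in a backslash.
row = [
    "2021-04-21 21:01:08.8029+00:00",
    r"\N",
    "Missing videos/pictures. \\",
    "1619038896",
]

buf = io.StringIO()
# QUOTE_ALL encloses every field in double quotes (doubling any embedded
# quotes), so no escape character is needed and commas stay unambiguous.
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(row)
print(buf.getvalue())
# "2021-04-21 21:01:08.8029+00:00","\N","Missing videos/pictures. \","1619038896"
```

Snowflake can then load such a file by setting FIELD_OPTIONALLY_ENCLOSED_BY = '"' in the COPY file format.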
t
that does seem like a target error. Are you using the default loader? https://meltano.com/plugins/loaders/snowflake.html
e
We've tried all three variants of target-snowflake.
The Meltano one doesn't seem to have this issue, but it uses INSERT, which means there's a max batch size of 16k rows.
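For context, that cap matters because an INSERT-based loader has to chunk its writes. A rough sketch, assuming an open snowflake-connector cursor and a hypothetical table/column layout:

```python
# The 16k-row cap on a multi-row INSERT ... VALUES (mentioned above) means
# an INSERT-based loader must split large batches into chunks.
BATCH_SIZE = 16_384

def insert_in_batches(cursor, rows):
    # Table and column names are hypothetical; snowflake-connector-python
    # uses %s-style placeholders by default.
    sql = "INSERT INTO my_table (id, payload) VALUES (%s, %s)"
    for start in range(0, len(rows), BATCH_SIZE):
        cursor.executemany(sql, rows[start:start + BATCH_SIZE])
```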
t
I actually just got a DM on twitter from Teej about this. We’ll take an MR to update it to a MERGE statement 😄
@tj_murphy 👆
e
@taylor Hi! I work with TJ! Are you referring to the default loader or the meltano loader?
t
The Meltano one
We don't control the DataMill or Transferwise ones, but we can update the Meltano one or add a fork that implements MERGE too.
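For reference, a hedged sketch of the upsert shape a MERGE-based implementation might run; the table, stage, and column names below are made up for illustration:

```python
# Hypothetical MERGE an upsert-capable target could execute, assuming the
# batch has already been landed in a staging table.
MERGE_SQL = """\
MERGE INTO analytics.events AS target
USING analytics.events_staging AS source
  ON target.id = source.id
WHEN MATCHED THEN UPDATE SET
  payload = source.payload,
  updated_at = source.updated_at
WHEN NOT MATCHED THEN INSERT (id, payload, updated_at)
  VALUES (source.id, source.payload, source.updated_at)
"""

def merge_batch(cursor):
    cursor.execute(MERGE_SQL)
```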
e
Gotcha, I think you'll want `COPY INTO` to avoid the 16k row limit issue. I haven't looked at the Meltano loader's code yet... I'm working on fixing the issue I'm running into with the DataMill one right now. Do you have any background info on the differences between these variants and why DataMill is the Meltano default?
t
@douwe_maan might know the differences better (though I think the settings availability gives a rough guide of the differences). The DataMill one is actively used by a bunch of people, which is why I think it was set as the default.
t
I'm open (and would be excited) to stress test and help push fixes upstream to whichever target we end up using, just not totally sure which target makes the most sense for us to use as a base.
d
Yeah, the DataMill and Transferwise variants both predate Meltano and are running in production in various places. The Transferwise variant is used by PipelineWise and is particularly popular for DB-to-DB syncs with taps also maintained by the Transferwise/PipelineWise team, while DataMill became the de facto standard to use with arbitrary Singer taps for APIs. We built the Meltano one at some point before I joined the team, but we'd rather not maintain too many taps/targets ourselves, so we're directing users to the variant that has seen the most usage with arbitrary taps: DataMill's.
I'd suggest contributing to the DataMill variant. Its primary maintainer @andrew_madonna also hangs out here 🙂
e
Cool... I seem to have a fix for the escaping problem that was breaking the DataMill variant for us. I also fixed its buggy detection of whether it's configured to use S3.