target-postgres - can't encode character `\udc81`...
# troubleshooting
c
target-postgres - can't encode character `\udc81`: surrogates not allowed

I'm probably the only one, but sometimes the data I get is less than perfectly clean. There are some random characters that don't encode cleanly from Unicode to UTF-8. I'm using the `datamill-co` variant of `target-postgres`, which uses psycopg2 to talk to Postgres. I'd like to tell the loader to ignore these errors (perhaps replacing the bad character with a specific char I can find later), log the issue, and move on. I've configured the loader with `invalid_record_threshold=10`, but that doesn't seem to help. How do other folks deal with issues like this? Is there a way to configure the loader to ignore them? Do you painstakingly pre-clean the data?
t
I've dealt with some errors similar to this (not exactly the same, but similar) using `meltano-map-transformer` to remove undesirables before they get sent to the target.
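For illustration, here is a minimal sketch in plain Python of what "removing undesirables" could look like for this particular error. It is not `meltano-map-transformer` configuration or API; `clean_value` and `REPLACEMENT` are hypothetical names, and the choice of U+FFFD as the marker is an assumption. The idea is that lone surrogate code points (U+D800 to U+DFFF), such as `\udc81`, are what UTF-8 refuses to encode, so they are swapped for a marker character that can be searched for later.

```python
import re

# Lone surrogate code points (U+D800..U+DFFF) cannot be encoded as UTF-8.
_SURROGATES = re.compile("[\ud800-\udfff]")

# Hypothetical marker choice: the Unicode replacement character.
REPLACEMENT = "\ufffd"


def clean_value(value):
    """Recursively replace lone surrogates in strings, dicts, and lists."""
    if isinstance(value, str):
        return _SURROGATES.sub(REPLACEMENT, value)
    if isinstance(value, dict):
        return {clean_value(k): clean_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [clean_value(v) for v in value]
    # Numbers, booleans, None, etc. pass through unchanged.
    return value


record = {"name": "caf\udc81", "tags": ["ok", "\udc81x"], "n": 3}
cleaned = clean_value(record)
# Every string in the cleaned record now encodes without raising.
cleaned["name"].encode("utf-8")
```

Using a distinctive marker rather than deleting the character keeps affected rows findable after the load.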
c
The transformer looks interesting. Do you know if it's possible to run it in "try/except" mode? Since my bad characters are infrequent, I'd like to avoid having the transformer in the pipeline all the time and only invoke it if there is an error. Is that possible? I'm not seeing a lot of examples.
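To make the "try/except" idea concrete, here is a sketch in plain Python only; it does not show a Meltano or `meltano-map-transformer` feature, and `encode_or_replace` and the logger name are hypothetical. It attempts a strict UTF-8 encode first and, only when that fails, logs the problem and falls back to Python's built-in `errors="replace"` handler, which substitutes `?` for each unencodable character.

```python
import logging

logger = logging.getLogger("surrogate_fallback")  # hypothetical logger name


def encode_or_replace(text: str) -> bytes:
    """Encode strictly; on failure, log and fall back to a lossy encode."""
    try:
        # Fast path: clean data pays no extra cost.
        return text.encode("utf-8")
    except UnicodeEncodeError as exc:
        logger.warning(
            "replacing unencodable characters at positions %d-%d: %s",
            exc.start,
            exc.end,
            exc.reason,
        )
        # The "replace" handler emits b"?" for each character it cannot encode.
        return text.encode("utf-8", errors="replace")
```

Because the fallback runs only when the strict encode raises, clean records take the normal path and the cleanup cost is paid only for the rare bad ones.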