# troubleshooting
t
Dumb(ish) question: newlines in the source data should be encoded to \n by the tap, right? Or they have to be, really, since the Singer format is JSON? Is there then any way for the target to accurately distinguish between the string "\n" and an actual newline? My specific scenario, if it matters, is MySQL to Postgres: varchars containing newlines in MySQL are being stored with literal "\n"s in PG. From testing with meltano invoke tap-mysql, it looks like the newline is being written to the JSON as "\n" though, so I don't think the issue is with the target but with the tap. I actually think it may be the way the data is being read from MySQL and not the conversion to JSON, i.e. dumping the 'row' object to the log shows the \n, but my Python-fu is not strong enough for me to be sure I'm interpreting all this correctly. 😕
v
JSON can handle newlines, it's escaped like
{
  "newlinedata": "FirstLine\nSecond Line"
}
Specific scenario does matter 🙂 If you could post an example SCHEMA and then RECORD message from Singer, then we could all run it against our favorite targets as well and debug with you.
Yes it does work!
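To make the escaping question concrete, here is a minimal sketch using only Python's standard `json` module. It compares a string holding a real newline with one holding a literal backslash followed by `n`. The variable names and the `newlinedata` values are illustrative only; this is not code from tap-mysql or any target.

```python
import json

# One actual newline character between the two parts.
real_newline = "FirstLine\nSecond Line"
# A backslash followed by the letter n (two characters, no newline).
literal_backslash_n = "FirstLine\\nSecond Line"

encoded_real = json.dumps({"newlinedata": real_newline})
encoded_literal = json.dumps({"newlinedata": literal_backslash_n})

print(encoded_real)     # {"newlinedata": "FirstLine\nSecond Line"}
print(encoded_literal)  # {"newlinedata": "FirstLine\\nSecond Line"}

# Decoding restores each original string exactly, so the two cases stay distinct.
assert json.loads(encoded_real)["newlinedata"] == real_newline
assert json.loads(encoded_literal)["newlinedata"] == literal_backslash_n
```

So on the wire, a single backslash before the `n` means a newline and a doubled backslash means the literal two-character string, and a target that parses the JSON normally gets back whatever the tap actually held in memory.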
t
Thanks @visch. Sounds like the issue is with the way tap-mysql (pipelinewise variant, BTW) is reading the data back from the DB... I think PyMySQL is returning a string with the characters "\n" in it so the tap is never actually seeing a newline character... and thus it doesn't properly get escaped in the JSON.
v
I've never seen that happen, but good luck! I'd say look at the actual record being sent to see if that's what's going on.
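One way to look at the actual record is to decode each RECORD message and print the `repr()` of its string fields, which makes a real newline (`'a\nb'`) visibly different from a literal backslash-n (`'a\\nb'`). This is only a sketch: `string_fields_repr` is a hypothetical helper, and the sample message (stream `notes`, field `body`) is made up for illustration.

```python
import json


def string_fields_repr(line):
    """Return {field: repr(value)} for the string fields of one Singer RECORD line.

    Non-RECORD messages (SCHEMA, STATE, ...) yield an empty dict.
    """
    message = json.loads(line)
    if not isinstance(message, dict) or message.get("type") != "RECORD":
        return {}
    return {
        key: repr(value)
        for key, value in message["record"].items()
        if isinstance(value, str)
    }


# Hypothetical RECORD line; the JSON text contains a backslash-n escape,
# which decodes to a real newline.
sample = '{"type": "RECORD", "stream": "notes", "record": {"id": 1, "body": "a\\nb"}}'
fields = string_fields_repr(sample)
print(fields)
```

To run it against real tap output, the same function could be applied to each line read from `sys.stdin`, with the output of `meltano invoke tap-mysql` piped in.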
t
For the curious: this is caused by PyMySQL. It replaces the newline that it reads from the DB with the string "\n". 🤦 A literal "\n" in the string gets replaced with "\\n", so I think it's technically possible to distinguish a newline from the string "\n", but... it's annoying extra processing that I shouldn't have to do. 😕
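For anyone who ends up doing that extra processing, here is a rough sketch of how it could look. It assumes the escaping is exactly as described above (a newline arrives as backslash-n, and a literal backslash arrives doubled), which has not been checked against PyMySQL itself. `unescape` is a hypothetical helper, not part of any library.

```python
import re

# Assumed escape scheme: backslash-n means newline, double backslash means one backslash.
_ESCAPES = {"\\n": "\n", "\\\\": "\\"}


def unescape(value):
    """Undo the assumed escaping in a single left-to-right pass.

    A single pass matters: chaining two str.replace() calls would
    misread an escaped backslash followed by 'n' as a newline.
    """
    return re.sub(r"\\[n\\]", lambda match: _ESCAPES[match.group(0)], value)


# Backslash-n becomes a real newline.
assert unescape("line1\\nline2") == "line1\nline2"
# Double backslash followed by n becomes a literal backslash-n.
assert unescape("line1\\\\nline2") == "line1\\nline2"
```

Other escape sequences (tabs, quotes, and so on) are left untouched here; they would need their own entries if the source data turns out to contain them.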