Hello, I'm getting this error: `Error parsing JSON...
# troubleshooting
i
Hello, I'm getting this error:
Error parsing JSON: UTF-16 single low surrogate: 0xdc81
when trying to run my mssql tap. Has anyone encountered this?
👀 1
h
Could you share which column type is giving this error (in the database, as well as what's being discovered in the JSON schema)?
i
This error didn't point at any particular column - I only got the 'merge' query on the CLI output which I'm assuming is where it failed
Although the column types seem to be consistent between what's in my source db and what's in my target db
After a bunch of troubleshooting I've found the specific column and the datum that was causing the error. After decompressing and translating to UTF-16, a PO number was a bunch of Japanese text for some reason which was too big for UTF-8. So I guess I either need to leave out that column or explicitly make my table UTF-16?
h
which variant of tap-mssql are you using?
i
buzzcutnorman
After looking into the source db I think it's just misinterpreting a hyphen. Character in source db: '-', Decompressed JSON from meltano for target-snowflake: '\u00c2\udc81'
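For anyone reading along, here is a minimal Python sketch of why a value like '\u00c2\udc81' trips a strict JSON parser: U+DC81 is a low surrogate with no preceding high surrogate, so the string is not valid Unicode text and cannot be encoded as UTF-8. The helper name `find_lone_surrogates` is hypothetical, and the cp1252 line at the end is only one possible way such a pair can arise in Python; it has not been confirmed as what this tap does.

```python
import json
import re

# In a Python str, any code point in U+D800..U+DFFF is an unpaired surrogate,
# because valid astral characters are stored as single code points.
LONE_SURROGATE = re.compile(r"[\ud800-\udfff]")


def find_lone_surrogates(text):
    """Return (index, hex code point) for every unpaired surrogate in text."""
    return [(m.start(), hex(ord(m.group()))) for m in LONE_SURROGATE.finditer(text)]


def can_encode_utf8(text):
    """True if text can be encoded as UTF-8 (i.e. contains no lone surrogates)."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# The value reported in the thread.
bad_value = "\u00c2\udc81"
hits = find_lone_surrogates(bad_value)
encodable = can_encode_utf8(bad_value)

# json.dumps escapes the surrogate instead of failing, so the bad value can
# travel downstream until a stricter parser rejects it.
dumped = json.dumps(bad_value)

# Hypothesis only (not confirmed for this tap): decoding the bytes C2 81 as
# cp1252 with the "surrogateescape" error handler yields the same pair,
# because 0x81 is undefined in cp1252.
hypothetical = b"\xc2\x81".decode("cp1252", errors="surrogateescape")
```

Running `find_lone_surrogates` over the values of a suspect column is a quick way to locate the offending rows before filing the bug report.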
h
Have you tried any of the other variants? Maybe the issue impacts only this one and not the others. Guessing from what you've shared so far, it might be an issue with a dependency (e.g. the specific library used to encode/decode JSON, or the library used for compression) and not necessarily a bug in the tap-mssql plugin itself.
> a PO number was a bunch of Japanese text for some reason which was too big for UTF-8
UTF-8 and UTF-16 should both be able to represent the same characters (Unicode). Would you be able to create a bug report documenting this behaviour so the plugin maintainers can reproduce it (sample create table statement with sample data)?
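To illustrate the point about encodings, here is a small Python sketch showing that Japanese text round-trips through both UTF-8 and UTF-16. The sample strings are made up for illustration and are not the actual PO number from the source table.

```python
def roundtrips(text, encoding):
    """True if text survives an encode/decode cycle in the given encoding."""
    return text.encode(encoding).decode(encoding) == text


# Made-up sample values: BMP Japanese text and one character outside the BMP.
japanese = "注文番号"
astral = "𠮷"  # U+20BB7, stored as a surrogate pair in UTF-16

utf8_len = len(japanese.encode("utf-8"))        # 3 bytes per character here
utf16_len = len(japanese.encode("utf-16-le"))   # 2 bytes per character here

both_ok = all(
    roundtrips(sample, enc)
    for sample in (japanese, astral)
    for enc in ("utf-8", "utf-16-le")
)
```

The byte counts differ between the two encodings, but neither is "too small" for the text; only invalid sequences such as a lone surrogate fail to encode.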
i
I'm going to try the other variants and see if those run into the same issue
h
Hey @Ian OLeary, wondering if you found a workaround for this and what that was. Also, if you could create a small error case (e.g. a table with a sample value that causes this error), I'd be happy to take a look this week. Thanks.
i
The workaround for now is to just leave the column causing the error out of the pipeline since we don't really need it. To be honest though, I'm not sure how to create an error case.
h
> To be honest though, I'm not sure how to create an error case.
Happy to walk you through this. Essentially, the bug report should mention the following:
1. source system & version
2. SQL script with a create table statement & an insert statement (see below)
3. tap version
4. minimal meltano.yml config (be careful not to leak any credentials or sensitive info there)
5. sample command that results in the error, e.g. `meltano run tap-mssql target-jsonl`
6. error output (just the traceback should be fine; again, please redact anything sensitive)

Sample SQL script:
create table test_table (id integer, test_column <MY_TEST_TEXT_TYPE_&_ENCODING_CAUSING_ERROR>);
insert into test_table (id, test_column) values (1, 'text value that will error out'), (2, 'text value that is handled properly');
✅ 1
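If typing the script by hand is awkward, a rough Python sketch like the one below can generate it from a few sample values. The function names are hypothetical, and `nvarchar(100)` is only an assumed column type; substitute whatever type the real source column uses. The `N'...'` prefix is the T-SQL syntax for Unicode string literals.

```python
def sql_literal(value):
    """Quote a Python string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def build_repro_sql(table, column_type, rows):
    """Build a create-table statement plus one multi-row insert for a bug report.

    rows is a list of (id, text) pairs; column_type is inserted verbatim.
    """
    create = f"create table {table} (id integer, test_column {column_type});"
    values = ", ".join(f"({row_id}, {sql_literal(text)})" for row_id, text in rows)
    insert = f"insert into {table} (id, test_column) values {values};"
    return create + "\n" + insert


# Assumed column type and made-up values, for illustration only.
script = build_repro_sql(
    "test_table",
    "nvarchar(100)",
    [(1, "value that errors"), (2, "it's handled properly")],
)
```

Printing `script` gives two statements that can be pasted into the issue; check the values for sensitive data before sharing them.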
i
@haleemur_ali should I create this bug report in the tap-mssql/Issues and put the details there?
h
That'd be ideal. If I can find time, I'll definitely look into it, but creating an issue there allows others to have visibility & take this on as well.