# getting-started
k
hey all, i am getting this error when running `meltano elt tap-essensys target-snowflake`:
```
2023-04-24T17:58:01.963480Z [info     ] snowflake.connector.errors.ProgrammingError: 100069 (22P02): Error parsing JSON: unknown keyword "Thu", pos 4 cmd_type=loader name=target-snowflake run_id=a967ed57-d5e1-4fbb-930b-decf4b71045e state_id=2023-04-24T172243--tap-essensys--target-snowflake stdio=stderr
2023-04-24T17:58:02.739440Z [error    ] Loading failed                 code=1 message=snowflake.connector.errors.ProgrammingError: 100069 (22P02): Error parsing JSON: unknown keyword "Thu", pos 4 name=meltano run_id=a967ed57-d5e1-4fbb-930b-decf4b71045e state_id=2023-04-24T172243--tap-essensys--target-snowflake
```
any ideas here?
p
@kevin can you share a bit more about what target variant you're using, and is tap-essensys a custom tap? I don't see it on the hub.
The reference to parsing JSON is interesting to me. That makes me think there's invalid json, maybe unescaped characters. https://community.snowflake.com/s/question/0D50Z00008SN2zqSAD/parsing-json-unknown-keyword-pos-0
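To see why unescaped characters matter: if a string field has a stray unescaped quote in it, everything after that quote gets read as a bare token, which is exactly how a parser ends up complaining about an "unknown keyword" like Thu. A toy illustration in Python (made-up values, not your data):
```python
import json

good = '{"note": "closed Thu afternoon"}'
bad = '{"note": "closed "Thu" afternoon"}'  # inner quotes not escaped

print(json.loads(good))  # parses fine
try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    # the parser stops at the stray quote, e.g. "Expecting ',' delimiter"
    print(exc)
```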
k
hey @pat_nadolny! i am using:
```yaml
- name: target-snowflake
  variant: transferwise
  pip_url: pipelinewise-target-snowflake
```
and yes, tap-essensys is a custom tap hosted on a private repo. I tried to create a file format to combat this, but to no avail:
```sql
CREATE FILE FORMAT RAW_MELTANO_DB.ESSENSYS.CSV
    TYPE = 'CSV'
    ESCAPE = '\\'
    FIELD_OPTIONALLY_ENCLOSED_BY = '\"'  -- also tried '"' and "@"
    ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE;
```
p
Have you tried running `meltano invoke tap-essensys` to verify the output is in fact valid json?
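If you redirect the output to a file (say output.json), a quick script like this (rough sketch) will flag any line that isn't valid json:
```python
import json
import sys

# Check every line of the captured singer output; report the ones
# that fail to parse as JSON.
with open("output.json") as fh:
    for lineno, line in enumerate(fh, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            print(f"line {lineno}: {exc}", file=sys.stderr)
            print(f"  {line[:200]}", file=sys.stderr)
```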
k
getting this (still running, i assume). Doesn't look off though
so, more detailed logs from the generated elt.log file:
```
2023-04-25T19:07:08.006789Z [info     ] snowflake.connector.errors.ProgrammingError: 100069 (22P02): Error parsing JSON: unknown keyword "Thu", pos 4 cmd_type=loader name=target-snowflake run_id=808b6e0e-301c-48fb-8e60-6e4f913b175d state_id=2023-04-25T184956--tap-essensys--target-snowflake stdio=stderr
2023-04-25T19:07:08.143548Z [error    ] Extraction failed              code=1 message=singer_sdk.exceptions.FatalAPIError: 401 Client Error: Access Denied for path: /api/1.0/Calendar name=meltano run_id=808b6e0e-301c-48fb-8e60-6e4f913b175d state_id=2023-04-25T184956--tap-essensys--target-snowflake
2023-04-25T19:07:08.143756Z [error    ] Loading failed                 code=1 message=snowflake.connector.errors.ProgrammingError: 100069 (22P02): Error parsing JSON: unknown keyword "Thu", pos 4 name=meltano run_id=808b6e0e-301c-48fb-8e60-6e4f913b175d state_id=2023-04-25T184956--tap-essensys--target-snowflake
2023-04-25T19:07:08.144243Z [info     ] ELT could not be completed: Extractor and loader failed. cmd_type=elt name=meltano run_id=808b6e0e-301c-48fb-8e60-6e4f913b175d state_id=2023-04-25T184956--tap-essensys--target-snowflake stdio=stderr
```
so the authentication failure came after. not sure if this was triggered because of the JSON parsing.
when i run `meltano invoke tap-essensys --discover` i get back all the tables
(attached: elt.log)
p
interesting, it does seem like it could be a tap problem. I'd probably try to run `meltano invoke tap-essensys > output.json` to save the singer messages to a file. Then once it fails because of the `FatalAPIError: 401 Client Error` you can look at the last couple of records in output.json to see what they look like, maybe invalid? I'd also specifically search output.json for the records containing `Thu` because it might become obvious
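Something like this (sketch, using the same output.json) pulls out just the RECORD messages that mention Thu so you can eyeball them:
```python
import json

# Print every RECORD message whose raw line mentions "Thu".
with open("output.json") as fh:
    for line in fh:
        if "Thu" not in line:
            continue
        msg = json.loads(line)
        if msg.get("type") == "RECORD":
            print(msg["stream"], json.dumps(msg["record"], indent=2))
```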
Also, any hints about the private tap would be helpful. Is it doing anything complex, or just using something like a `RESTStream`?
k
hey @pat_nadolny, actually i don't think it would be a problem to send you the tap source code, it is not really anything complicated.
looks standard, no complicated filters
p
yeah, that looks pretty standard to me. I'd spend some time inspecting the raw singer output to narrow down the issue, and if that doesn't work then maybe target-jsonl might help to detect invalid json 🤷
k
yeah for sure, i ran `meltano invoke tap-essensys > output.json` and it succeeded
interesting, so `meltano elt tap-essensys target-jsonl` succeeded
which leads me to believe perhaps it has to do with the variant used for snowflake
p
I've been using the transferwise variant for a long time and have never run into this - just to say, the target is usually pretty good. When you open up output.json or the target-jsonl files and look for `Thu`, do you find anything weird? If you find a sample that you can share, that might be helpful
k
hey @pat_nadolny nothing weird, when i search for `Thu` it looks like it is part of other strings
p
@kevin I wonder if the commas inside the double quotes are ending up being treated as delimiters when they're being loaded to snowflake. I'm browsing the code really quick, but while I'm doing that - have you gotten any data to load? You might want to try cherry picking a few records and confirm that the `"Thu"` isn't a red herring for something else. Sometimes snowflake gives unhelpful errors
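For what it's worth, here's a toy round trip with Python's csv module showing what I mean - with proper quoting a comma inside a field survives, and if the quoting/escaping is off you'd see the row split into extra columns instead:
```python
import csv
import io

# A field containing a comma, quoted the way
# FIELD_OPTIONALLY_ENCLOSED_BY='"' expects.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerow(
    ["1", "Smith, Thu", "2023-04-27"])
print(buf.getvalue().strip())  # 1,"Smith, Thu",2023-04-27

# Reading it back yields 3 columns, not 4 -- the quoted comma
# is not treated as a delimiter.
row = next(csv.reader(io.StringIO(buf.getvalue())))
print(len(row), row)
```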
k
I am able to get the accounts table to load, but somewhere in the calendars table is where it halts. I'll take a look again
p
That's good - see if you can cherry pick some calendar records and get them to load. You might be able to narrow it down from there
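One way to cherry pick (rough sketch - the stream name calendar is a guess, use whatever name shows up in your output.json): keep the stream's SCHEMA message plus the first few RECORDs, then pipe the file into the target with `meltano invoke target-snowflake < sample.singer`:
```python
import json

KEEP = 5  # how many records to try loading
kept = 0

with open("output.json") as src, open("sample.singer", "w") as dst:
    for line in src:
        msg = json.loads(line)
        if msg.get("stream") != "calendar":  # guessed stream name
            continue
        if msg["type"] == "SCHEMA":
            dst.write(line)  # the target needs the schema first
        elif msg["type"] == "RECORD" and kept < KEEP:
            dst.write(line)
            kept += 1
```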
k
Thu seems to be part of a first name or last name. Nothing really stands out at first glance, will continue to look
p
I'm also confused why it would say `Error parsing JSON` if your file format is CSV. Have you double checked that it's using the file format you think it's using?
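If you have snowflake.connector handy (the target already uses it), a quick way to double check (sketch; the connection details are placeholders) is to describe the format object your target config points at and compare it to what you created:
```python
import snowflake.connector

# Placeholder credentials -- fill these in from your own target config.
conn = snowflake.connector.connect(
    account="your_account",
    user="your_user",
    password="your_password",
)
cur = conn.cursor()
# DESC shows the properties Snowflake will actually apply.
for row in cur.execute("DESC FILE FORMAT RAW_MELTANO_DB.ESSENSYS.CSV"):
    print(row)
```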
k
Ah, I think I set the file format in Snowflake to CSV
k
yeah, i pretty much have the same thing in snowflake:
wonder if i have to change the file format to json
p
No, the target only supports CSV (or Parquet as experimental) https://github.com/transferwise/pipelinewise-target-snowflake#pre-requirements
when I show my format options field, this is what I have, in case there are differences:
```json
{
  "TYPE": "CSV",
  "RECORD_DELIMITER": "\n",
  "FIELD_DELIMITER": ",",
  "FILE_EXTENSION": null,
  "SKIP_HEADER": 0,
  "DATE_FORMAT": "AUTO",
  "TIME_FORMAT": "AUTO",
  "TIMESTAMP_FORMAT": "AUTO",
  "BINARY_FORMAT": "HEX",
  "ESCAPE": "\\",
  "ESCAPE_UNENCLOSED_FIELD": "\\",
  "TRIM_SPACE": false,
  "FIELD_OPTIONALLY_ENCLOSED_BY": "\"",
  "NULL_IF": [
    "\\N"
  ],
  "COMPRESSION": "AUTO",
  "ERROR_ON_COLUMN_COUNT_MISMATCH": true,
  "VALIDATE_UTF8": true,
  "SKIP_BLANK_LINES": false,
  "REPLACE_INVALID_CHARACTERS": false,
  "EMPTY_FIELD_AS_NULL": true,
  "SKIP_BYTE_ORDER_MARK": true,
  "ENCODING": "UTF8"
}
```
k
yeah, seems identical, the only difference is that i have `"ERROR_ON_COLUMN_COUNT_MISMATCH": false`:
```json
{
  "TYPE": "CSV",
  "RECORD_DELIMITER": "\n",
  "FIELD_DELIMITER": ",",
  "FILE_EXTENSION": null,
  "SKIP_HEADER": 0,
  "DATE_FORMAT": "AUTO",
  "TIME_FORMAT": "AUTO",
  "TIMESTAMP_FORMAT": "AUTO",
  "BINARY_FORMAT": "HEX",
  "ESCAPE": "\\",
  "ESCAPE_UNENCLOSED_FIELD": "\\",
  "TRIM_SPACE": false,
  "FIELD_OPTIONALLY_ENCLOSED_BY": "\"",
  "NULL_IF": [
    "\\N"
  ],
  "COMPRESSION": "AUTO",
  "ERROR_ON_COLUMN_COUNT_MISMATCH": false,
  "VALIDATE_UTF8": true,
  "SKIP_BLANK_LINES": false,
  "REPLACE_INVALID_CHARACTERS": false,
  "EMPTY_FIELD_AS_NULL": true,
  "SKIP_BYTE_ORDER_MARK": true,
  "ENCODING": "UTF8"
}
```
p
Have you been able to narrow in on a few of the records that are causing the failure? If it's a bug, it would be good to have a reproducible example
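If nothing jumps out by eye, a brute-force bisect might find it (rough sketch - assumes a single bad record, and reuses the sample.singer idea from before; it feeds batches to the target and halves until one record is left):
```python
import json
import subprocess

with open("sample.singer") as fh:
    lines = fh.read().splitlines()
schema = [l for l in lines if json.loads(l)["type"] == "SCHEMA"]
records = [l for l in lines if json.loads(l)["type"] == "RECORD"]

def loads_ok(batch):
    # Feed schema + batch to the target and see if it loads.
    data = "\n".join(schema + batch) + "\n"
    proc = subprocess.run(
        ["meltano", "invoke", "target-snowflake"],
        input=data, text=True, capture_output=True)
    return proc.returncode == 0

while len(records) > 1:
    half = records[: len(records) // 2]
    # Keep whichever half still fails (assumes exactly one bad record).
    records = half if not loads_ok(half) else records[len(half):]

print(records[0])  # the offending record
```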
k
no luck so far, there are only two instances of Thu, both `"firstname": "Thu"`
first name is a string of any value