# troubleshooting
h
Hey guys, I'm getting an error when testing a meltano tap I'm creating and am getting this error:
requests.exceptions.JSONDecodeError: Extra data: line 1 column 11 (char 10)
The output of the API request is in CSV format, and I can see that this is a JSON error, so I was wondering if there is anything I need to change so the tap works with CSV responses instead of JSON.
a
Can you post the full stack trace? Some more context needed.
Are you using the Meltano SDK? The RESTStream class expects your response to be JSON-formatted, so you might need to override some methods to achieve what you want.
h
Certainly! It's a little long, so I'll attach it as a txt file.
I am using RESTStream, I believe. Do you know what I need to override to pull data from a CSV response? So far I have only changed my parse_response method to look like this:
def parse_response(self, response: requests.Response) -> Iterable[dict]:
    """Parse the response and return an iterator of result records.

    Requires: import csv, io, requests; from typing import Iterable

    Args:
        response: The HTTP ``requests.Response`` object.

    Yields:
        Each record from the source.
    """
    response_content = response.content.decode('utf-8')
    csv_file = io.StringIO(response_content)
    reader = csv.DictReader(csv_file)

    for row in reader:
        yield row
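For reference, csv.DictReader yields one dict per data row, keyed by the names in the header row, which is why the method above can yield rows directly. A minimal standalone sketch (the CSV content here is made up):

```python
import csv
import io

# Simulated CSV response body, as the tap would receive it after decoding.
response_content = "Shift ID,Location\n123,London\n456,Paris\n"

# DictReader uses the first line as field names and yields one dict per row.
reader = csv.DictReader(io.StringIO(response_content))
rows = list(reader)

print(rows[0]["Shift ID"])  # -> 123
```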
Hey, just an update: I realised that when I run the tap with target-jsonl, I get the error in the terminal, but the jsonl file is still created with the correct data inside.
a
So maybe the issue is the key:
MissingKeyPropertiesError
Have you defined some custom streams? What key_properties do they have?
2024-08-30T10:11:20.288260Z [info     ]     raise MissingKeyPropertiesError( cmd_type=elb consumer=True job_name=test:tap-staffwise-to-target-csv name=target-csv producer=False run_id=a602a416-f781-407a-b357-53b5cc832a13 stdio=stderr string_id=target-csv
2024-08-30T10:11:20.288507Z [info     ] singer_sdk.exceptions.MissingKeyPropertiesError: Record is missing one or more key_properties. cmd_type=elb consumer=True job_name=test:tap-staffwise-to-target-csv name=target-csv producer=False run_id=a602a416-f781-407a-b357-53b5cc832a13 stdio=stderr string_id=target-csv
2024-08-30T10:11:20.288972Z [info     ] Key Properties: ['Shift ID'], Record Keys: ['Location', 'Location code', 'Salesforce ID', 'Type', 'Region', 'Date', 'Submission date', 'Staff', 'Parent Question', 'Question', 'Subject', 'Response'] cmd_type=elb consumer=True job_name=test:tap-staffwise-to-target-csv name=target-csv producer=False run_id=a602a416-f781-407a-b357-53b5cc832a13 stdio=stderr string_id=target-csv
Is 'Shift ID' present in the CSV? I would set a breakpoint on the
yield row
line and check the shape of the dict, and whether it matches your schema.
Also, this looks suspect further up. You might have a parsing issue:
2024-08-30T10:11:20.280025Z [info     ] 2024-08-30 11:11:20,277 | WARNING  | tap-staffwise.reporting | Properties ('\ufeff"Shift ID"',) were present in the 'reporting' stream but not found in catalog schema. Ignoring. cmd_type=elb consumer=False job_name=test:tap-staffwise-to-target-csv name=tap-staffwise producer=True run_id=a602a416-f781-407a-b357-53b5cc832a13 stdio=stderr string_id=tap-staffwise
h
I have "Shift ID" in my key properties. In the jsonl file that gets generated, the Shift ID is missing, and I think it's due to the presence of the "\ufeff" prefix before "Shift ID": it doesn't match the column name, so the property is being ignored. I fixed this by changing
.decode('utf-8')
to
.decode('utf-8-sig')
in the parse_response method. I have also resolved the MissingKeyPropertiesError. I am still getting the error
requests.exceptions.JSONDecodeError: Extra data: line 1 column 11 (char 10)
throughout my logs, though. I've included the trace in a file with this message. There might be some other errors in there too.
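For anyone hitting the same issue: a UTF-8 byte order mark at the start of the response body survives a plain utf-8 decode as "\ufeff" glued onto the first header name, while utf-8-sig strips it. A quick sketch (the CSV content is illustrative):

```python
# A CSV body that starts with a UTF-8 BOM (b"\xef\xbb\xbf"), as some APIs emit.
body = b"\xef\xbb\xbfShift ID,Location\n123,London\n"

plain = body.decode("utf-8")        # BOM survives as "\ufeff" before "Shift ID"
stripped = body.decode("utf-8-sig")  # BOM is removed

print(plain.startswith("\ufeff"))      # -> True
print(stripped.startswith("Shift ID"))  # -> True
```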
e
How did you define pagination for your stream?
You may want to remove
next_page_token_jsonpath
(and maybe
records_jsonpath
for good measure),
and then define actual pagination, if any, for your stream.
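Pagination here just means the API returns results one page at a time, and the tap keeps requesting pages until there are none left. A generic sketch of that loop, with a made-up in-memory fetch_page standing in for the real HTTP call (names are illustrative, not SDK API):

```python
# Hypothetical stand-in for an HTTP call: returns (records, next_token),
# where next_token is None on the last page.
PAGES = {
    None: (["rec1", "rec2"], "page2"),
    "page2": (["rec3"], None),
}

def fetch_page(token):
    return PAGES[token]

def iter_records():
    """Yield all records by following next-page tokens until exhausted."""
    token = None
    while True:
        records, token = fetch_page(token)
        yield from records
        if token is None:
            break

print(list(iter_records()))  # -> ['rec1', 'rec2', 'rec3']
```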
h
Amazing, I removed next_page_token_jsonpath and the errors have gone. I don't fully understand what pagination is yet, since I'm pretty much a beginner at this, but I'll look into it. The errors are gone for now and the block runs to completion, so thanks for your help!