# troubleshooting
m
Hi, we have set up a pipeline for snapchat-to-bigquery; it's finally working and records are being copied into BigQuery. However, we are getting issues when trying to send data for the `creatives` and `campaign` schemas. The issue shows up in the log file. Please have a look below and let us know how we can solve this error so we can include these schemas as well:
```
2022-04-12T12:47:26.221664Z [info ] INFO Stream audience_segments, batch processed 6 records cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1c5e-0a68-4da3-bfd2-08c1a8fb8acf stdio=stderr
2022-04-12T12:47:26.221941Z [info ] INFO Synced Stream: audience_segments, page: 1, records: 1 to 6 cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1c5e-0a68-4da3-bfd2-08c1a8fb8acf stdio=stderr
2022-04-12T12:47:26.222196Z [info ] INFO Write state for Stream: audience_segments, ad_account ID: b9770de9-90e7-4f79-b952-31ce66c44570, value: 2022-03-11T17:46:05.784000Z cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snap>
2022-04-12T12:47:26.222451Z [info ] INFO FINISHED Sync for Stream: audience_segments, parent_id: b9770de9-90e7-4f79-b952-31ce66c44570, total_records: 6 cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf>
2022-04-12T12:47:26.222704Z [info ] INFO START Syncing: creatives cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1c5e-0a68-4da3-bfd2-08c1a8fb8acf stdio=stderr
2022-04-12T12:47:26.222977Z [debug ] {"type": "STATE", "value": {"currently_syncing": "organizations", "bookmarks": {"funding_sources": {"updated_at(parent_organization_id:5a044e78-3367-4f4b-91f5-54a603db56b2)": "2022-03-11T19:20:37.>
2022-04-12T12:47:26.223400Z [debug ] {"type": "SCHEMA", "stream": "creatives", "schema": {"properties": {"id": {"type": ["null", "string"]}, "updated_at": {"format": "date-time", "type": ["null", "string"]}, "created_at": {"format": >
2022-04-12T12:47:26.223698Z [info ] INFO START Sync for Stream: creatives, parent_stream: ad_accounts, parent_id: b9770de9-90e7-4f79-b952-31ce66c44570 cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1>
2022-04-12T12:47:26.223868Z [info ] INFO timezone = America/Los_Angeles cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1c5e-0a68-4da3-bfd2-08c1a8fb8acf stdio=stderr
2022-04-12T12:47:26.224060Z [info ] INFO START Sync for Stream: creatives cmd_type=extractor job_id=snapchat-to-bigquery name=tap-snapchat-ads run_id=b5cf1c5e-0a68-4da3-bfd2-08c1a8fb8acf stdio=stderr
2022-04-12T12:47:26.224263Z [info ] INFO Updating state with {'currently_syncing': 'organizations', 'bookmarks': {'funding_sources': {'updated_at(parent_organization_id:5a044e78-3367-4f4b-91f5-54a603db56b2)': '2022-03-11T19:20:37.90>
2022-04-12T12:47:26.224433Z [info ] INFO creatives schema: {'properties': {'id': {'type': ['null', 'string']}, 'updated_at': {'format': 'date-time', 'type': ['null', 'string']}, 'created_at': {'format': 'date-time', 'type': ['null',>
2022-04-12T12:47:26.224623Z [info ] WARNING the pipeline might fail because of undefined fields: an empty object/dictionary indicated as {} cmd_type=loader job_id=snapchat-to-bigquery name=target-bigquery run_id=b5cf1c5e-0a68-4da3-b>
2022-04-12T12:47:26.557134Z [info ] INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.33484792709350586, "tags": {"endpoint": "creatives", "http_status_code": 200, "status": "succeeded"}} cmd_type=extracto>
2022-04-12T12:47:26.558867Z [debug ] {"type": "RECORD", "stream": "creatives", "record": {"id": "0b5196a1-0a21-4c8e-87fe-6ce9b26d0dd6", "updated_at": "2022-03-11T19:32:24.797000Z", "created_at": "2022-03-11T19:20:52.320000Z", "name":>
2022-04-12T12:47:26.564066Z [info ] CRITICAL 'RECORD' cmd_type=loader job_id=snapchat-to-bigquery name=target-bigquery run_id=b5cf1c5e-0a68…
```
p
Hey @mohammad_alam! Are you able to run this Snapchat sync fully without the target? You can try something like
```
meltano --log-level=debug invoke tap-snapchat > output.json
```
to write the data to a file. I'm trying to understand if the issue is related to extracting data from Snapchat or loading data to BQ.
m
Hi @pat_nadolny, thanks for your reply. I checked using the command you shared; it seems the tap extracted the `campaigns` records too, since I searched for them and found them in the `output.json` file. Please have a look.
I would appreciate it if you could come back to this point, @pat_nadolny.
p
@mohammad_alam I'm not totally sure what's going on here; it's definitely something with the target. I'm able to run your output through target-jsonl, so it's not a JSON schema validation issue. I personally haven't used BigQuery, so I'm not able to help test, though. There's nothing obviously wrong with the schema or record when I looked at them. You can try
```
cat output.json | meltano --log-level=debug invoke target-bigquery
```
to confirm that exact output is failing (not sure if that gets you anywhere).
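(Editor's note: inspecting the tap's `output.json` by hand, as suggested above, can be sketched with a small helper. This is purely illustrative and not part of Meltano or the tap; the function name is hypothetical. It pulls the `SCHEMA` message and the first few `RECORD` messages for one stream so they can be eyeballed.)

```python
import json

def inspect_stream(path, stream, max_records=3):
    """Return (schema, records) for one Singer stream from a tap output file.

    Skips non-JSON lines (e.g. stray log output) and messages for other
    streams; keeps at most `max_records` records.
    """
    schema, records = None, []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                continue  # not a Singer message, skip it
            if msg.get("stream") != stream:
                continue
            if msg.get("type") == "SCHEMA":
                schema = msg["schema"]
            elif msg.get("type") == "RECORD" and len(records) < max_records:
                records.append(msg["record"])
    return schema, records
```

For example, `inspect_stream("output.json", "creatives")` would show whether the `creatives` schema the tap emitted actually contains type information for every field.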
Could this be related https://github.com/adswerve/target-bigquery/issues/32? Hey @ruslan_bergenov 👋 I saw you were on the issue - any ideas if this is related?
cc @edgar_ramirez_mondragon @aaronsteers in case you might see something I don't
r
@mohammad_alam, @pat_nadolny, the target gives a warning that the JSON schema is not complete: it contains an instance of empty properties, indicated as {}.
```
2022-04-12T12:47:26.224623Z [info     ] WARNING the pipeline might fail because of undefined fields: an empty object/dictionary indicated as {} cmd_type=loader job_id=snapchat-to-bigquery name=target-bigquery run_id=b5cf1c5e-0a68-4da3-b>
```
Later, the target tries to convert each JSON field to a BigQuery field. To do that, the target needs to know which BigQuery data type to convert each JSON data type to. Because data types are not specified in the JSON schema, the target fails. Our recommendation is to make sure the JSON schema is complete: no field should have an instance of an empty object/dictionary {}; each field should have an actual data type specified. After that, we can pass the JSON schema via the tap-catalog.json file during the sync.
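(Editor's note: the check Ruslan describes can be sketched roughly as follows. This is not target-bigquery's actual code; the function name is hypothetical. It walks a stream's JSON schema and reports any property whose definition is an empty `{}`, i.e. a field with no declared type for the target to map to a BigQuery column type.)

```python
def find_untyped_fields(schema, prefix=""):
    """List dotted paths of properties defined as an empty object {}."""
    untyped = []
    for name, spec in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        if not spec:  # the `{}` case from the warning message
            untyped.append(path)
        elif isinstance(spec, dict) and "properties" in spec:
            # recurse into nested object schemas
            untyped.extend(find_untyped_fields(spec, prefix=f"{path}."))
    return untyped
```

Running something like this over the `SCHEMA` message for `creatives` in `output.json` would pinpoint exactly which fields need types filled in before the catalog is passed to the sync.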
p
@ruslan_bergenov Thanks for the response! Very helpful. Again, I'm not super familiar with BQ or the target. I know that it's a best practice to define the schema in full so data is consistent, but there's also been discussion about having targets include a data type failsafe, i.e. string, where possible. If I understand the issue here, having a failsafe would allow data to be loaded rather than failing, with the downside of the data type being a more generic string vs. the exact data type. What do you think about that? Am I understanding the problem properly? Is that possible with BQ?
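(Editor's note: the failsafe Pat proposes could look something like the sketch below. The mapping table and function name are illustrative, not target-bigquery's real conversion logic: instead of failing on an empty `{}` or unrecognized schema, the field falls back to a generic BigQuery STRING column.)

```python
# Simplified JSON Schema type -> BigQuery column type mapping (illustrative).
_JSON_TO_BQ = {
    "string": "STRING",
    "integer": "INT64",
    "number": "FLOAT64",
    "boolean": "BOOL",
    "object": "STRING",  # real targets may use RECORD; simplified here
    "array": "STRING",
}

def bq_type(field_schema, failsafe="STRING"):
    """Pick a BigQuery type for a field; fall back instead of failing."""
    types = field_schema.get("type", [])
    if isinstance(types, str):
        types = [types]
    for t in types:
        if t != "null" and t in _JSON_TO_BQ:
            return _JSON_TO_BQ[t]
    return failsafe  # empty {} or unknown type: load as a generic string
```

With this behavior, the `creatives` sync above would load the untyped field as a string instead of crashing with `CRITICAL 'RECORD'`, at the cost of losing the exact type.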
r
@pat_nadolny, "What do you think about that? Is that possible with BQ?" Yes, that makes sense, and it should be possible with BQ. We will add this to the wish list/roadmap for target-bigquery. "Am I understanding the problem properly?" Yes, I think you do. 🙂
p
@ruslan_bergenov awesome, that would be ideal! I created an issue in the repo to track it https://github.com/adswerve/target-bigquery/issues/35 cc @edgar_ramirez_mondragon @aaronsteers
r
@pat_nadolny, thank you for submitting an issue! 👍