I'm running into an issue with the `target-bigquery` loader | Meltano #troubleshooting

jacob_mulligan

04/12/2023, 8:09 PM

I'm running into an issue with the `target-bigquery` loader where it hangs after reaching a certain point. I'm creating a new `users` table in BigQuery inside a dataset that already exists. The loader successfully creates the table (with only the generic columns, none specific to the stream) but does not create any rows inside it. Here are the logs from when I run `meltano run tap-healthie target-bigquery` (tap-healthie is a custom extractor I've written):

Environment 'dev' is active
Beginning full_table sync of 'users'... 
Tap has custom mapper. Using 1 provided map(s). 
METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.453015, "tags": {"stream": "users", "endpoint": "", "http_status_code": 200, "status": "succeeded"}} 
METRIC: {"type": "counter", "metric": "http_request_count", "value": 1, "tags": {"stream": "users", "endpoint": ""}} 
METRIC: {"type": "timer", "metric": "sync_duration", "value": 0.5560169219970703, "tags": {"stream": "users", "context": {}, "status": "succeeded"}} 
METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"stream": "users", "context": {}}} 
Using thread-based parallelism 
Target 'target-bigquery' is listening for input from tap. 
Initializing 'target-bigquery' target sink... 
Initializing target sink for stream 'users'... 
Setting up users 
Target 'target-bigquery' completed reading 4 lines of input (2 records, (0 batch manifests, 1 state messages). 
Adding worker 3486bcdbbde445e683c5c1447695f005 
google.api_core.bidi | Thread-ConsumeBidirectionalStream exiting

This hangs for at least 10 minutes on the `google.api_core.bidi | Thread-ConsumeBidirectionalStream exiting` step before I kill the job. Any ideas why the BigQuery target would hang like this?

jacob_mulligan

04/12/2023, 8:10 PM

Here's my config in meltano.yml:

- name: target-bigquery
  variant: z3z1ma
  pip_url: git+https://github.com/z3z1ma/target-bigquery.git
  config:
    credentials_json: ${BIGQUERY_CREDENTAILS_JSON}
    project: analytics-prod-383519
    dataset: source_healthie
    flattening_enabled: True
    flattening_max_depth: 1
    location: US

alexander_butler

04/13/2023, 2:46 AM

Try running the pipeline again like this:

TARGET_BIGQUERY_DEBUG=true meltano --log-level=debug run tap-healthie target-bigquery

We should be able to figure it out pretty quick.

jacob_mulligan

04/13/2023, 8:20 AM

2023-04-13T08:20:11.368555Z [info     ] grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with: cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-04-13T08:20:11.368662Z [info     ]         status = StatusCode.PERMISSION_DENIED cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-04-13T08:20:11.368764Z [info     ]         details = "Permission 'TABLES_UPDATE_DATA' denied on resource 'projects/analytics-prod-383519/datasets/source_healthie/tables/users': Streaming insert is not allowed in free tier." cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery

🎯

jacob_mulligan

04/13/2023, 8:21 AM

I changed the `method` config to `batch_job` and the data is now loading. Thanks Alex!
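
For reference, a sketch of what that change might look like in `meltano.yml`. Only the `method` key and the `batch_job` value come from this thread; placing it under the loader's `config:` block, next to the settings posted earlier, is an assumption.

```yaml
- name: target-bigquery
  variant: z3z1ma
  config:
    # Assumed placement: alongside the other loader settings.
    # In this thread, switching to batch_job resolved the
    # "Streaming insert is not allowed in free tier" error.
    method: batch_job
```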

jacob_mulligan

04/13/2023, 9:38 AM

However, even with `flattening_enabled: True` the source data is not being flattened. Do you know why? Currently all the data loaded into BigQuery is in a `data` column as JSON.

alexander_butler

04/13/2023, 4:24 PM

That is because of the `denormalized` setting (down a bit in this table): https://github.com/z3z1ma/target-bigquery#settings. It is false by default, which means the target wraps everything into a `data` column to support any tap regardless of schema quality or stability. Set it to true to denormalize into independent columns.
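
A sketch of the setting described above. The key name and its default come from the message; placing it under the loader's `config:` block in `meltano.yml` is an assumption.

```yaml
- name: target-bigquery
  variant: z3z1ma
  config:
    # false (the default): each record is loaded as JSON into a single data column.
    # true: records are denormalized into independent columns.
    denormalized: true
```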

jacob_mulligan

04/18/2023, 2:00 PM

i just realized you're a contributor to the target-bigquery package (based on this?), thank you for creating and maintaining this and for the help!! 🙇

jacob_mulligan

04/18/2023, 2:11 PM

I was able to set `denormalized: true` for 2 extractors which create typed columns explicitly, which is great. However, I tried using this target to load data from `pipelinewise-tap-postgres` just now and am running into this error from the BigQuery target:

```
2023-04-18 16:10:24,644 | INFO | target-bigquery | Initializing 'target-bigquery' target sink...
2023-04-18 16:10:24,644 | INFO | target-bigquery | Initializing target sink for stream 'public-Prescription'...
2023-04-18 16:10:25,131 | INFO | root | HERE:
2023-04-18 16:10:25,131 | INFO | root | {'$ref': '#/definitions/sdc_recursive_number_array'}
2023-04-18 16:10:25,131 | INFO | root | None
Traceback (most recent call last):
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/bin/target-bigquery", line 8, in <module>
    sys.exit(TargetBigQuery.cli())
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 578, in cli
    target.listen(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 34, in listen
    self._process_lines(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 278, in _process_lines
    counter = super()._process_lines(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 78, in _process_lines
    self._process_schema_message(line_dict)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 378, in _process_schema_message
    _ = self.get_sink(
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/target.py", line 472, in get_sink
    return self.add_sink(stream_name, schema, key_properties)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 240, in add_sink
    sink = sink_class(
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/batch_job.py", line 101, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/core.py", line 293, in __init__
    self.create_target(key_properties=key_properties)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/jacob/code/miga/analytics/etl/.meltano…
```

jacob_mulligan

04/18/2023, 2:27 PM

Can you also explain what the `flattening_enabled` config does? I assumed flatten did what `denormalized` did prior to this thread.
