jacob_mulligan
04/12/2023, 8:09 PM
I'm having an issue with the `target-bigquery` loader where it hangs after reaching a certain point. I'm creating a new `users` table in BigQuery inside a dataset that already exists. The loader successfully creates the table (though only with generic columns), but it does not create any rows in the table.
Here are the logs from when I run `meltano run tap-healthie target-bigquery` (`tap-healthie` is a custom extractor I've written):
Environment 'dev' is active
Beginning full_table sync of 'users'...
Tap has custom mapper. Using 1 provided map(s).
METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.453015, "tags": {"stream": "users", "endpoint": "", "http_status_code": 200, "status": "succeeded"}}
METRIC: {"type": "counter", "metric": "http_request_count", "value": 1, "tags": {"stream": "users", "endpoint": ""}}
METRIC: {"type": "timer", "metric": "sync_duration", "value": 0.5560169219970703, "tags": {"stream": "users", "context": {}, "status": "succeeded"}}
METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"stream": "users", "context": {}}}
Using thread-based parallelism
Target 'target-bigquery' is listening for input from tap.
Initializing 'target-bigquery' target sink...
Initializing target sink for stream 'users'...
Setting up users
Target 'target-bigquery' completed reading 4 lines of input (2 records, (0 batch manifests, 1 state messages).
Adding worker 3486bcdbbde445e683c5c1447695f005
google.api_core.bidi | Thread-ConsumeBidirectionalStream exiting
This hangs for at least 10 minutes on the `google.api_core.bidi | Thread-ConsumeBidirectionalStream exiting` step before I kill the job. Any ideas why the BigQuery target would hang like this?
jacob_mulligan
04/12/2023, 8:10 PM
- name: target-bigquery
  variant: z3z1ma
  pip_url: git+https://github.com/z3z1ma/target-bigquery.git
  config:
    credentials_json: ${BIGQUERY_CREDENTAILS_JSON}
    project: analytics-prod-383519
    dataset: source_healthie
    flattening_enabled: True
    flattening_max_depth: 1
    location: US
alexander_butler
04/13/2023, 2:46 AM
`TARGET_BIGQUERY_DEBUG=true meltano --log-level=debug run tap-healthie target-bigquery`
We should be able to figure it out pretty quick.
jacob_mulligan
04/13/2023, 8:20 AM
2023-04-13T08:20:11.368555Z [info ] grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with: cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-04-13T08:20:11.368662Z [info ] status = StatusCode.PERMISSION_DENIED cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-04-13T08:20:11.368764Z [info ] details = "Permission 'TABLES_UPDATE_DATA' denied on resource 'projects/analytics-prod-383519/datasets/source_healthie/tables/users': Streaming insert is not allowed in free tier." cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
🎯
jacob_mulligan
04/13/2023, 8:21 AM
I changed the `method` config to `batch_job` and the data is now loading.
Thanks Alex!
jacob_mulligan
04/13/2023, 9:38 AM
Even with `flattening_enabled: True`, the source data is not being flattened. Do you know why?
Currently all the data loaded into BigQuery is in a `data` column as JSON.
alexander_butler
04/13/2023, 4:24 PM
See the `denormalized` setting (down a bit in this table):
https://github.com/z3z1ma/target-bigquery#settings
It is false by default, which means the target wraps everything into a `data` column to support any tap regardless of schema quality or stability.
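As a rough illustration of the two layouts (a sketch only, not the target's actual code; the function names `to_generic_row` and `to_denormalized_row` and the sample schema are hypothetical):

```python
import json


def to_generic_row(record: dict) -> dict:
    """Hypothetical: the whole record lands in one JSON `data` column,
    roughly what a non-denormalized load looks like."""
    return {"data": json.dumps(record, sort_keys=True)}


def to_denormalized_row(record: dict, schema: dict) -> dict:
    """Hypothetical: one column per top-level property in the stream schema,
    roughly what a denormalized load looks like."""
    return {name: record.get(name) for name in schema["properties"]}


# Made-up sample stream schema and record, for illustration only.
schema = {"properties": {"id": {"type": "string"}, "email": {"type": "string"}}}
record = {"id": "1", "email": "a@example.com"}

print(to_generic_row(record))
print(to_denormalized_row(record, schema))
```

The first layout tolerates any schema because nothing is typed up front; the second depends on the tap emitting a schema the target can translate into column types.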
Set it to true to denormalize into independent columns.
jacob_mulligan
04/18/2023, 2:11 PM
I've set `denormalized: true` for 2 extractors and it now creates typed columns explicitly, which is great.
However, I tried using this target to load data from `pipelinewise-tap-postgres` just now and am running into this error from the BigQuery target:
```
2023-04-18 16:10:24,644 | INFO | target-bigquery | Initializing 'target-bigquery' target sink...
2023-04-18 16:10:24,644 | INFO | target-bigquery | Initializing target sink for stream 'public-Prescription'...
2023-04-18 16:10:25,131 | INFO | root | HERE:
2023-04-18 16:10:25,131 | INFO | root | {'$ref': '#/definitions/sdc_recursive_number_array'}
2023-04-18 16:10:25,131 | INFO | root | None
Traceback (most recent call last):
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/bin/target-bigquery", line 8, in <module>
    sys.exit(TargetBigQuery.cli())
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 578, in cli
    target.listen(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 34, in listen
    self._process_lines(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 278, in _process_lines
    counter = super()._process_lines(file_input)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 78, in _process_lines
    self._process_schema_message(line_dict)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 378, in _process_schema_message
    _ = self.get_sink(
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/target.py", line 472, in get_sink
    return self.add_sink(stream_name, schema, key_properties)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 240, in add_sink
    sink = sink_class(
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/batch_job.py", line 101, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/core.py", line 293, in __init__
    self.create_target(key_properties=key_properties)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/Users/jacob/code/miga/analytics/etl/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/jacob/code/miga/analytics/etl/.meltano…
```
jacob_mulligan
04/18/2023, 2:27 PM
Also, can you explain what the `flattening_enabled` config does? I assumed flattening did what `denormalized` does, prior to this thread.
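For context, a minimal sketch of what record flattening with a depth limit generally means in Singer-style pipelines. This is an illustration under assumptions, not the target's implementation: the `flatten` helper, the `__` separator, and the sample record are all hypothetical. In this thread, `flattening_enabled: True` on its own still left everything in the `data` column, so flattening (reshaping nested keys) and `denormalized` (one table column per property) appear to be separate concerns.

```python
def flatten(record: dict, max_depth: int, sep: str = "__",
            _prefix: str = "", _depth: int = 0) -> dict:
    """Hypothetical helper: hoist nested objects into top-level keys,
    descending at most `max_depth` levels; deeper objects stay nested."""
    out = {}
    for key, value in record.items():
        name = f"{_prefix}{sep}{key}" if _prefix else key
        if isinstance(value, dict) and _depth < max_depth:
            # Still within the depth budget: merge the child's keys upward.
            out.update(flatten(value, max_depth, sep, name, _depth + 1))
        else:
            # Scalars, lists, and objects beyond max_depth are kept as-is.
            out[name] = value
    return out


# Made-up nested record, for illustration only.
record = {"id": 1, "address": {"city": "Austin", "geo": {"lat": 30.3}}}

print(flatten(record, max_depth=1))
print(flatten(record, max_depth=2))
```

With `max_depth=1`, only the first level of nesting is hoisted, so `address.geo` stays a nested object under a single key.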