# troubleshooting
k
Hello - I was able to set up Meltano and also started the ELT by running `meltano elt tap-postgres target-bigquery --job_id=postgres-to-bigquery-testing`. I am running it in screen and I can see the output, however I don't see anything in my BigQuery yet.
v
You say you can see the output; could you post it?
re https://meltano.slack.com/archives/C01TCRBBJD7/p1619636124003600: as soon as the run finishes, the data ends up in BigQuery
d
Correct, once `meltano elt` finishes, the data should be in BigQuery. Like Derek mentioned, seeing the command output may help us figure out what's wrong.
Are you looking in the right dataset in BigQuery? As mentioned under https://meltano.com/plugins/loaders/bigquery.html#dataset-id, by default a new dataset is created named after the tap; in your case that'd be `tap_postgres`.
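If you'd rather have the data land in a dataset of your own naming, the loader's `dataset_id` can be set in the project's `meltano.yml`. A minimal sketch, assuming a standard Meltano project layout; the dataset name here is just an example:

```yaml
# meltano.yml (excerpt) -- example values, adjust to your project
plugins:
  loaders:
    - name: target-bigquery
      config:
        dataset_id: my_dataset   # default would be tap_postgres, after the tap
```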
k
This is the output, but I don't see anything in my BigQuery.
On the UI
In the service account usage logs, I can see that there are entries for Insert job, Job completed, and then Delete and Delete table.
d
@krishna_chaitanya Can you show some more of the log output that starts with `target-bigquery`? That may give us an indication of what's happening.
k
Here is the output and last few lines in the log:
```
target-bigquery       | INFO Copy t_public-agency_block_reason_a4a6a379b1364bd496b5dc04e27d3def to public-agency_block_reason
target-bigquery       | INFO Pushing state: {'bookmarks': {'public-agency_block_reason': {'last_replication_method': 'FULL_TABLE', 'version': 1619691790096, 'xmin': None}}, 'currently_syncing': None}
target-bigquery (out) | {"bookmarks": {"public-agency_block_reason": {"last_replication_method": "FULL_TABLE", "version": 1619691790096, "xmin": null}}, "currently_syncing": null}
meltano               | INFO Incremental state has been updated at 2021-04-29 10:23:18.458198.
meltano               | DEBUG Incremental state: {'bookmarks': {'public-agency_block_reason': {'last_replication_method': 'FULL_TABLE', 'version': 1619691790096, 'xmin': None}}, 'currently_syncing': None}
meltano               | DEBUG Deleted configuration at /root/meltano-projects/demo/.meltano/run/elt/2021-04-29T102255--tap-postgres--target-bigquery/5fdabc9c-eace-456d-9f90-54eede94f5b0/target.config.json
meltano               | DEBUG Deleted configuration at /root/meltano-projects/demo/.meltano/run/elt/2021-04-29T102255--tap-postgres--target-bigquery/5fdabc9c-eace-456d-9f90-54eede94f5b0/tap.config.json
meltano               | INFO Extract & load complete!
meltano               | INFO Transformation skipped.
```
d
OK, so `target-bigquery` is claiming the last thing it did was copying from a temporary table to `public-agency_block_reason`, which comes from https://github.com/adswerve/target-bigquery/blob/master/target_bigquery/processhandler.py#L155. After that, it deletes the temporary table: https://github.com/adswerve/target-bigquery/blob/master/target_bigquery/processhandler.py#L172, which matches the "Delete" you mentioned seeing in the usage logs. From what we're seeing in the target-bigquery output, it successfully loaded the data into BigQuery, and this is a pretty popular target, so we can be reasonably confident it actually works correctly. Can you show me the target-bigquery config that's used in your pipeline? You can get it using `meltano elt tap-postgres target-bigquery --job_id=postgres-to-bigquery-testing --dump=loader-config`. You can of course remove anything sensitive.
k
@douwe_maan here it is…
```json
{
 "project_id": "Actual_Project_ID",
 "dataset_id": "meltano_postgres_nh",
 "location": "EU",
 "credentials_path": "/root/meltano-projects/demo/bi-pipeline-meltano-service-account-bigquery-sync.json",
 "validate_records": true,
 "add_metadata_columns": true,
 "replication_method": "append",
 "max_cache": 50
}
```
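One sanity check worth running on a config like this: a Google Cloud service-account key file also records a `project_id`, so you can compare it against the loader config to catch data quietly landing in a different project. A hedged sketch, not part of Meltano itself; the file paths are whatever you dumped the config and key to:

```python
import json

def projects_match(config_path, creds_path):
    """Return True when the dumped loader config and the service-account
    key file agree on project_id (a mismatch can send data to a project
    other than the one you're browsing in the BigQuery UI)."""
    with open(config_path) as f:
        config = json.load(f)
    with open(creds_path) as f:
        creds = json.load(f)
    return config.get("project_id") == creds.get("project_id")
```

For example, `projects_match("loader-config.json", "service-account.json")` returning False would be a strong hint you're looking at the wrong project.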
d
OK. And you're definitely looking in the correct project and dataset?
Just to be sure, are you seeing any `RECORD` messages show up in the logs with a `tap-postgres (out)` prefix? If not, then the tap isn't actually extracting any records, which would also explain why we're not seeing anything in BQ 🙂
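A quick way to check this on a captured log is to grep for those lines. A sketch under assumptions: `elt.log` is a file you saved yourself (e.g. with `meltano elt ... 2>&1 | tee elt.log`), and the here-doc below just stands in for real output:

```shell
# Sample log standing in for real 'meltano elt' output (assumption for the demo)
cat > elt.log <<'EOF'
tap-postgres (out)   | {"type": "RECORD", "stream": "agency_block_reason", "record": {}}
tap-postgres (out)   | {"type": "STATE", "value": {}}
target-bigquery      | INFO Pushing state
EOF

# Count RECORD messages emitted by the tap: zero here means nothing was extracted
grep 'tap-postgres (out)' elt.log | grep -c '"type": "RECORD"'
```

On the sample above this prints 1, since only the first line is a RECORD message from the tap.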
k
```
tap-postgres         |                                      FROM "public"."agency_block_reason"
tap-postgres         |                                     ORDER BY xmin::text ASC with itersize 20000
tap-postgres (out)   | {"type": "RECORD", "stream": data data data}
tap-postgres (out)   | {"type": "RECORD", "stream": data data data}
tap-postgres (out)   | {"type": "RECORD", "stream": data data data}
tap-postgres (out)   | {"type": "RECORD", "stream": data data data}
```
(I replaced the record contents with "data data data".)
d
OK, so that's working.
And yet the `meltano_postgres_nh` dataset is completely empty? In the usage logs, do you see which tables are being created/inserted and deleted?
k
Yes, from the service account logs I can see the tables that are created and deleted, and they match.
d
We figured it out, or at least part of it: the `project_id` was set incorrectly because the BigQuery interface is confusing.
k
@douwe_maan You are simply awesome. Thanks for pointing out the issue so quickly. The issue is fixed.
d
Awesome!