# troubleshooting
j
Hi, I've been stuck with a consistent error running
meltano run tap-postgres target-bigquery
or a similar command. This error isn't too helpful. It looks like the ORM session is lost during streaming. I tried increasing the buffer and a bunch of other tweaks. Any direction is appreciated.
c
All those colourful boxes and squares and code sections are actually not what you need to look at. Basically, you need to scroll up all the way to the beginning of that colourful box output and then look at the text lines preceding the coloured boxes. (A bit unintuitive, I know.) I think you can disable the coloured boxes by setting
NO_COLOR=1
as a shell variable now with meltano 2.10 and higher.
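Something like this should do it for the run above (assuming a POSIX shell; the pipeline names are just the ones from your command):
# disable the coloured log rendering so the plain traceback is easier to read
NO_COLOR=1 meltano run tap-postgres target-bigquery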
p
It also looks like it's a problem with the extractor, so you can try running it with invoke to get more details. You can also set the log level to debug to get more logs.
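Something like this, for example (same extractor name as in your run):
# run just the extractor on its own with verbose logging to surface the underlying error
meltano --log-level=debug invoke tap-postgres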
j
time=2022-11-29 02:21:41 name=singer level=INFO message=METRIC: {"type": "counter", "metric": "record_count", "value": 21186, "tags": {}}
time=2022-11-29 02:21:41 name=tap_postgres level=CRITICAL message=connection already closed
Traceback (most recent call last):
  File "/project/.meltano/extractors/tap-postgres/venv/lib/python3.9/site-packages/tap_postgres/sync_strategies/full_table.py", line 144, in sync_table
    for rec in cur:
  File "/project/.meltano/extractors/tap-postgres/venv/lib/python3.9/site-packages/psycopg2/extras.py", line 120, in __iter__
    yield next(res)
psycopg2.OperationalError: terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
SSL connection has been closed unexpectedly
meltano --log-level=debug invoke tap-postgres
did reveal more of what happened; not sure what it implies though
c
You are reading from a hot standby replica. You'll need to try and avoid query conflicts.
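If you want to confirm that, a quick check against the same connection the tap uses should return t on a standby (psql here; the connection string is a placeholder):
# pg_is_in_recovery() is true on a hot standby / recovering instance
psql "postgresql://user:password@replica-host:5432/dbname" -c "SELECT pg_is_in_recovery();"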
j
Thank you so much Christoph. Definitely learnt something today. I am most likely going to see what my options are. The primary being able to impact the replica is not something I can rely on. This is a GCP CloudSQL snapshot; there might be some other replica options there. Thank you again.
c
This is a GCP CloudSQL snapshot.
Ah. Interesting. I'm not familiar with Google's version. They may have options to configure the Postgres WAL delay settings on their replicas. Alex from Harness uses GCP, but I think mainly BigQuery. He may have some input.
A quick scan of the instance creation options doesn't really show anything where you would configure WAL delays .... https://cloud.google.com/sdk/gcloud/reference/sql/instances/create
j
That's what I'm finding too
And I don't want to hit Primary for ELT regularly
c
I wonder what the difference between
FAILOVER
and
READ
replica-type is, though ... it's the most suspicious-looking option ... https://cloud.google.com/sdk/gcloud/reference/sql/instances/create#--replica-type
Doesn't really provide a lot of details ... 🤷 https://cloud.google.com/sql/docs/postgres/replication/create-replica
Consider adjusting the
max_standby_archive_delay
and
max_standby_streaming_delay
flags for your replica.
Seems like it's possible somehow in Cloud SQL
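Before (and after) touching anything you could check what the replica currently has for those, e.g. (the connection string is a placeholder again):
# show the current standby conflict delay settings on the replica
psql "postgresql://user:password@replica-host:5432/dbname" \
  -c "SHOW max_standby_streaming_delay;" \
  -c "SHOW max_standby_archive_delay;"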
j
That's what it sounds like.
c
I think it's this 'instance flags' thing ... the CLI doesn't seem to expose a dedicated command to adjust the flags. Might just be in the API or GUI Console. https://cloud.google.com/sdk/gcloud/reference/sql/flags/list
Ey. Found it ...
gcloud sql instances patch
https://cloud.google.com/sql/docs/postgres/flags#gcloud
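Untested, but based on that page it should look roughly like this (the instance name and the 15-minute values are just examples, and I believe --database-flags replaces whatever flags are already set, so include any existing ones too):
# raise the standby conflict timeouts on the read replica (values are in milliseconds)
gcloud sql instances patch my-read-replica \
  --database-flags=max_standby_archive_delay=900000,max_standby_streaming_delay=900000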
j
You rock dude. Thank you so much
c
I am now Google Cloud certified. 😂
I did use GCP many, many years ago though, so that helps.
j
I just ran a job and it's working
c
Amazing!
j
After that, the loader was failing and it turned out the container it ran on was OOMKilled; after bumping the memory to double, the manual jobs are finishing properly. Now I'm faced with
[2022-12-01 14:35:14,964] {dagbag.py:496} INFO - Filling up the DagBag from /project/orchestrate/dags/meltano.py
[2022-12-01 14:35:19,042] {meltano.py:193} INFO - Received meltano v2 style schedule export: {'schedules': {'job': [{'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}, {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}], 'elt': []}}
[2022-12-01 14:35:19,043] {meltano.py:146} INFO - Considering task 'tap-postgres target-bigquery' of schedule 'postgres-to-bq-schedule': {'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}
[2022-12-01 14:35:19,045] {meltano.py:165} INFO - Spun off task '<Task(BashOperator): meltano_postgres-to-bq-schedule_tap-postgres-to-bigquery_task0>' of schedule 'postgres-to-bq-schedule': {'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}
[2022-12-01 14:35:19,045] {meltano.py:170} INFO - DAG created for schedule 'postgres-to-bq-schedule', task='tap-postgres target-bigquery'
[2022-12-01 14:35:19,046] {meltano.py:146} INFO - Considering task 'tap-postgres target-ds-bigquery' of schedule 'postgres-to-ds-bq-schedule': {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}
[2022-12-01 14:35:19,046] {meltano.py:165} INFO - Spun off task '<Task(BashOperator): meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0>' of schedule 'postgres-to-ds-bq-schedule': {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}
[2022-12-01 14:35:19,046] {meltano.py:170} INFO - DAG created for schedule 'postgres-to-ds-bq-schedule', task='tap-postgres target-ds-bigquery'
Running <TaskInstance: meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery.meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0 2022-12-01T12:30:00+00:00 [queued]> on host airflow-scheduler-deployment-b955b69fb-7dpjq
[2022-12-01 14:35:26,147] {scheduler_job.py:1218} INFO - Executor reports execution of meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery.meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0 execution_date=2022-12-01 12:30:00+00:00 exited with status success for try_number 2
[2022-12-01 14:35:30,423] {dagrun.py:429} ERROR - Marking run <DagRun meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery @ 2022-12-01 12:30:00+00:00: scheduled__2022-12-01T12:30:00+00:00, externally triggered: False> failed
I wonder if this has to do with spinning up a new worker container to run the job? It's a Helm chart I used to deploy this setup. https://gitlab.com/meltano/infra/helm-meltano/-/tree/master/airflow
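I guess I could check whether the spun-up task pod itself got killed again, something like (pod names are placeholders, apart from the scheduler one from the log above):
# look for OOMKilled / the termination reason on the task pod
kubectl describe pod <task-pod-name> | grep -iE 'oomkilled|reason|exit code'
# and cross-check what the scheduler logged for that task
kubectl logs airflow-scheduler-deployment-b955b69fb-7dpjq | grep -i task0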