jerry_deng
11/29/2022, 1:14 AM
meltano run tap-postgres target-bigquery
or a similar command. This error isn't too helpful. It looks like the ORM session is lost during streaming. I tried increasing the buffer and a bunch of other tweaks. Any direction is appreciated.

christoph
11/29/2022, 1:34 AM
You can set NO_COLOR=1 as a shell variable now, with Meltano 2.10 and higher.

pat_nadolny
11/29/2022, 2:17 AM

jerry_deng
11/29/2022, 2:25 AM
time=2022-11-29 02:21:41 name=singer level=INFO message=METRIC: {"type": "counter", "metric": "record_count", "value": 21186, "tags": {}}
time=2022-11-29 02:21:41 name=tap_postgres level=CRITICAL message=connection already closed
Traceback (most recent call last):
File "/project/.meltano/extractors/tap-postgres/venv/lib/python3.9/site-packages/tap_postgres/sync_strategies/full_table.py", line 144, in sync_table
for rec in cur:
File "/project/.meltano/extractors/tap-postgres/venv/lib/python3.9/site-packages/psycopg2/extras.py", line 120, in __iter__
yield next(res)
psycopg2.OperationalError: terminating connection due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
SSL connection has been closed unexpectedly
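For context: "terminating connection due to conflict with recovery" comes from the hot-standby side. WAL replay on the replica needed to remove row versions that the long full-table scan was still reading, so Postgres cancelled the query. Until the replica is tuned, one workaround is to reconnect and retry the sync; a minimal, hypothetical sketch (not tap-postgres's actual code, and using a stand-in exception class so it runs without psycopg2 installed):

```python
import time

# Stand-in for psycopg2.OperationalError so this sketch is self-contained;
# the real tap raises the psycopg2 class.
class OperationalError(Exception):
    pass

def run_with_retry(query_fn, retries=3, base_delay=1.0):
    """Run query_fn(); on OperationalError (e.g. 'terminating connection
    due to conflict with recovery'), back off and retry a few times."""
    for attempt in range(retries + 1):
        try:
            return query_fn()
        except OperationalError:
            if attempt == retries:
                raise  # out of retries; surface the original error
            time.sleep(base_delay * (attempt + 1))  # linear backoff

# Demo: a query that hits a recovery conflict once, then succeeds.
calls = {"count": 0}

def flaky_query():
    calls["count"] += 1
    if calls["count"] == 1:
        raise OperationalError("terminating connection due to conflict with recovery")
    return 21186  # e.g. the record count from the METRIC line above

print(run_with_retry(flaky_query, base_delay=0.01))  # prints 21186
```

Note this only papers over the symptom; a full-table resync still restarts from scratch, so the real fix is on the replica's recovery settings (discussed below).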
meltano --log-level=debug invoke tap-postgres
did reveal more of what happened; I'm not sure what it implies, though.

christoph
11/29/2022, 3:21 AM

christoph
11/29/2022, 3:22 AM

jerry_deng
11/29/2022, 3:53 AM

christoph
11/29/2022, 3:57 AM
> This is a GCP CloudSQL snapshot.
Ah. Interesting. I'm not familiar with Google's version. They may have options to configure the Postgres WAL delay settings on their replicas. Alex from Harness uses GCP, but I think mainly BigQuery. He may have some input.

christoph
11/29/2022, 4:01 AM

jerry_deng
11/29/2022, 4:01 AM

jerry_deng
11/29/2022, 4:01 AM

christoph
11/29/2022, 4:02 AM
Not sure what the FAILOVER and READ replica-type is though ... it's the most suspicious-looking option ... https://cloud.google.com/sdk/gcloud/reference/sql/instances/create#--replica-type
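For reference, the replica-type option in that linked page is chosen when the replica is created; a hypothetical sketch (instance names are placeholders, not from this thread):

```shell
# Create a Cloud SQL read replica of an existing primary instance.
# READ is the normal read-replica type; FAILOVER is the HA option
# the linked docs describe. Names here are invented for illustration.
gcloud sql instances create my-replica \
  --master-instance-name=my-primary \
  --replica-type=READ
```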
christoph
11/29/2022, 4:04 AM

christoph
11/29/2022, 4:08 AM

christoph
11/29/2022, 4:09 AM
Seems like it's possible somehow in Cloud SQL:
> Consider adjusting the max_standby_archive_delay and max_standby_streaming_delay flags for your replica.
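Those two Postgres settings bound how long the standby will stall WAL replay to let running queries finish before cancelling them (values in milliseconds; -1 means wait indefinitely). Assuming Cloud SQL exposes them as database flags, they could be raised on the replica roughly like this (instance name and values are placeholders):

```shell
# Let standby queries hold off WAL replay for up to 15 minutes
# (900000 ms) before being cancelled. Instance name is a placeholder.
gcloud sql instances patch my-read-replica \
  --database-flags=max_standby_archive_delay=900000,max_standby_streaming_delay=900000
```

One caveat worth checking in the gcloud docs: --database-flags sets the complete flag list for the instance, so any flags already configured need to be repeated in the same command.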
jerry_deng
11/29/2022, 4:09 AM

christoph
11/29/2022, 4:12 AM

christoph
11/29/2022, 4:15 AM
gcloud sql instances patch
https://cloud.google.com/sql/docs/postgres/flags#gcloud

christoph
11/29/2022, 4:16 AM

jerry_deng
11/29/2022, 4:16 AM

christoph
11/29/2022, 4:16 AM

christoph
11/29/2022, 4:16 AM

jerry_deng
11/29/2022, 4:39 AM

christoph
11/29/2022, 4:51 AM

jerry_deng
12/01/2022, 3:00 PM
[2022-12-01 14:35:14,964] {dagbag.py:496} INFO - Filling up the DagBag from /project/orchestrate/dags/meltano.py
[2022-12-01 14:35:19,042] {meltano.py:193} INFO - Received meltano v2 style schedule export: {'schedules': {'job': [{'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}, {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}], 'elt': []}}
[2022-12-01 14:35:19,043] {meltano.py:146} INFO - Considering task 'tap-postgres target-bigquery' of schedule 'postgres-to-bq-schedule': {'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}
[2022-12-01 14:35:19,045] {meltano.py:165} INFO - Spun off task '<Task(BashOperator): meltano_postgres-to-bq-schedule_tap-postgres-to-bigquery_task0>' of schedule 'postgres-to-bq-schedule': {'name': 'postgres-to-bq-schedule', 'interval': '0 0/2 * * *', 'cron_interval': '0 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-bigquery', 'tasks': ['tap-postgres target-bigquery']}}
[2022-12-01 14:35:19,045] {meltano.py:170} INFO - DAG created for schedule 'postgres-to-bq-schedule', task='tap-postgres target-bigquery'
[2022-12-01 14:35:19,046] {meltano.py:146} INFO - Considering task 'tap-postgres target-ds-bigquery' of schedule 'postgres-to-ds-bq-schedule': {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}
[2022-12-01 14:35:19,046] {meltano.py:165} INFO - Spun off task '<Task(BashOperator): meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0>' of schedule 'postgres-to-ds-bq-schedule': {'name': 'postgres-to-ds-bq-schedule', 'interval': '30 0/2 * * *', 'cron_interval': '30 0/2 * * *', 'env': {}, 'job': {'name': 'tap-postgres-to-ds-bigquery', 'tasks': ['tap-postgres target-ds-bigquery']}}
[2022-12-01 14:35:19,046] {meltano.py:170} INFO - DAG created for schedule 'postgres-to-ds-bq-schedule', task='tap-postgres target-ds-bigquery'
Running <TaskInstance: meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery.meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0 2022-12-01T12:30:00+00:00 [queued]> on host airflow-scheduler-deployment-b955b69fb-7dpjq
[2022-12-01 14:35:26,147] {scheduler_job.py:1218} INFO - Executor reports execution of meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery.meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery_task0 execution_date=2022-12-01 12:30:00+00:00 exited with status success for try_number 2
[2022-12-01 14:35:30,423] {dagrun.py:429} ERROR - Marking run <DagRun meltano_postgres-to-ds-bq-schedule_tap-postgres-to-ds-bigquery @ 2022-12-01 12:30:00+00:00: scheduled__2022-12-01T12:30:00+00:00, externally triggered: False> failed
I wonder if this has to do with spinning up a new worker container to run the job? I used a Helm chart to deploy this setup: https://gitlab.com/meltano/infra/helm-meltano/-/tree/master/airflow