# troubleshooting
Hi! We are having some issues with a failing job. Every morning at 04:00 we try to run a --full-refresh, and I have been getting this error for about a week. The error takes about 8 minutes to occur. The next time the job starts, it runs without the full refresh and goes on for about 15 minutes before ending in the same error. After 3-5 retries it finally "catches up" and manages to run the whole job.
2023-01-19 05:09:06.508 CET  Run invocation could not be completed as block failed: Another 'prod:tap-postgres-to-target-bigquery' pipeline is already running which started at 2023-01-19 04:02:29.710681. To ignore this check use the '--force' option.
2023-01-19 05:09:06.597 CET  Client closed local connection on 127.0.0.1:5432
2023-01-19 05:09:06.903 CET  Another 'prod:tap-postgres-to-target-bigquery' pipeline is already running which started at 2023-01-19 04:02:29.710681. To ignore this check use the '--force' option.
2023-01-19 05:09:06.903 CET  Block run completed.
2023-01-19 05:09:07.777 CET  Received TERM signal. Waiting up to 0s before terminating.
So today I went digging in the database and found the "runs" log, excellent. The "fun fact" is that no job starts between the first 4 am job (sorry for the timezone difference) and the moment it dies:
state     started_at                    ended_at                      last_heartbeat_at
SUCCESS,  2023-01-19 12:01:08.914036,   2023-01-19 12:02:13.786125,   2023-01-19 12:02:12.945424
SUCCESS,  2023-01-19 11:45:09.444758,   2023-01-19 11:46:08.446500,   2023-01-19 11:46:08.381997
SUCCESS,  2023-01-19 11:31:07.112609,   2023-01-19 11:32:07.804656,   2023-01-19 11:32:06.933957
SUCCESS,  2023-01-19 11:28:38.761015,   2023-01-19 11:29:40.734920,   2023-01-19 11:29:40.691249
SUCCESS,  2023-01-19 10:16:09.787328,   2023-01-19 11:27:04.277290,   2023-01-19 11:27:04.179192
FAIL,     2023-01-19 08:30:38.460719,   2023-01-19 10:16:09.712441,   2023-01-19 09:58:09.176293
FAIL,     2023-01-19 08:02:44.723517,   2023-01-19 08:30:38.385950,   2023-01-19 08:17:06.586861
FAIL,     2023-01-19 04:02:29.710681,   2023-01-19 08:02:44.642756,   2023-01-19 04:08:56.203939
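One rough way to read the table above is to compare, for each run, how long it kept sending heartbeats with how long it then sat silent before being marked as ended. The sketch below does that for the last four rows. The function names and the 5-minute "stale" threshold are my own assumptions for illustration, not anything the tool defines.

```python
from datetime import datetime, timedelta

# Rows copied from the "runs" table above:
# (state, started_at, ended_at, last_heartbeat_at)
RUNS = [
    ("SUCCESS", "2023-01-19 10:16:09.787328", "2023-01-19 11:27:04.277290", "2023-01-19 11:27:04.179192"),
    ("FAIL", "2023-01-19 08:30:38.460719", "2023-01-19 10:16:09.712441", "2023-01-19 09:58:09.176293"),
    ("FAIL", "2023-01-19 08:02:44.723517", "2023-01-19 08:30:38.385950", "2023-01-19 08:17:06.586861"),
    ("FAIL", "2023-01-19 04:02:29.710681", "2023-01-19 08:02:44.642756", "2023-01-19 04:08:56.203939"),
]

FMT = "%Y-%m-%d %H:%M:%S.%f"


def heartbeat_gaps(rows):
    """Return (state, started_at, alive_for, silent_for) for each run.

    alive_for  = last_heartbeat_at - started_at
    silent_for = ended_at - last_heartbeat_at
    """
    result = []
    for state, started, ended, heartbeat in rows:
        started_dt = datetime.strptime(started, FMT)
        ended_dt = datetime.strptime(ended, FMT)
        heartbeat_dt = datetime.strptime(heartbeat, FMT)
        result.append(
            (state, started_dt, heartbeat_dt - started_dt, ended_dt - heartbeat_dt)
        )
    return result


def stale_runs(rows, threshold=timedelta(minutes=5)):
    """Runs whose last heartbeat is older than `threshold` at ended_at.

    The 5-minute default is an arbitrary value chosen for this example.
    """
    return [run for run in heartbeat_gaps(rows) if run[3] > threshold]


if __name__ == "__main__":
    for state, started_dt, alive, silent in heartbeat_gaps(RUNS):
        print(f"{state:8} started={started_dt} alive_for={alive} silent_for={silent}")
```

Read this way, the 04:02 run stopped sending heartbeats after roughly 6.5 minutes but was not marked as ended until 08:02:44, which is almost exactly when the next run started. The other two FAIL rows show the same pattern: each ended_at sits within a fraction of a second of the following started_at.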
Any suggestions on how to find where the mysterious state comes from?