Running into an issue where a `meltano run` comman...
# troubleshooting
m
Running into an issue where a
meltano run
command is failing after seemingly successfully performing the requested extraction and DBT step. The error below is shown without any context. We are using a postgres cluster for both the meltano state and extracted/dbt tables.
Copy code
... normal output above ...
2023-05-01T23:56:10.389373Z [info     ] 23:56:10  Finished running 21 view models, 1 table model, 4 incremental models in 0 hours 8 minutes and 22.61 seconds (502.61s). cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.424722Z [info     ] 23:56:10                   cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.426245Z [info     ] 23:56:10  Completed successfully cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.427779Z [info     ] 23:56:10                   cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.429591Z [info     ] 23:56:10  Done. PASS=26 WARN=0 ERROR=0 SKIP=0 TOTAL=26 cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.644030Z [info     ]                                cmd_type=command name=dbt-postgres stdio=stderr
2023-05-01T23:56:10.644438Z [info     ]                                cmd_type=command name=dbt-postgres stdio=stderr
Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

(psycopg2.OperationalError) SSL connection has been closed unexpectedly

(Background on this error at: <https://sqlalche.me/e/14/e3q8>)
It looks like our postgres connection was dropped for whatever reason. I don’t see our postgres cluster as having any relevant timeouts that I think could be causing this, however (I only see a transaction idle timeout of 60s). Any ideas on how to debug what’s going on here?
a
Looks like the Meltano system db, trying to write the job history perhaps? You could use a local system db to narrow down the problem. If the error goes away then you know it’s the job history write. Assuming it is the job history and the connection being closed by Postgres- you could prove this by running a job with a utility that just sleeps for 8 minutes. (that appears to be how long your dbt transforms are running)
m
thanks for the suggestion. It appears that it actually was the idle transaction timeout triggering the issue. I did a few more test runs and could see a connection from meltano that was stuck in the “idle in transaction” state until disappearing. After removing that timeout it seems things are working. That said, this seems like something that meltano shouldn’t be doing.
a
Interesting. I’m confident a bug report would be very welcome. We use a few elephant sql instances, where there’s no idle timeout. Was yours a self hosted postgres?
m
It’s hosted on AWS RDS as an Aurora PostgreSQL cluster. I think the setting I started with were the default settings on AWS.
a
Good to know!
m
Just looked through our configs, it looks like we explicitly set the timeout and it’s not the default
a
Oops. Short transactions, and releasing resources asap, is a good idea for OLTP. Can’t think of much benefit for OLAP.