# troubleshooting
m
We're running into an issue where meltano is failing because of
SSL SYSCALL error: Connection timed out
and then
During handling of the above exception, another exception occurred:
FileNotFoundError: [Errno 2] No such file or directory:
/project/.meltano/run/elt/...
Because of this, the run isn't marked as failed. So when we try to run it again it fails because meltano thinks there is another run of the same job still in progress. Has anyone dealt with this before? Any suggestions?
v
Need more context: full logs, what you're running when this happens, etc.
m
Sure thing:
• we're running meltano elt --state-id=some-id tap-postgres target-snowflake
  ◦ variants are transferwise
• it's running on ECS using Fargate containers
Logs are attached. Thanks!
v
I can only guess with that info; the logs look like the server can't access the postgres database. More information we'd need: how is Fargate actually getting called to run this? The postgres access issue is potentially networking / IAM roles, but it could be a number of things. Where I'd start is verifying those containers can access the database.
e
Yeah, I'd verify that the Meltano container can reach the db. I've logged https://github.com/meltano/meltano/issues/8733 too.
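If it helps, something like this run from inside the container is a quick way to check reachability. It's only a sketch: the host, database name, and credentials are placeholders, not your actual settings.
```python
# Quick reachability check to run from inside the Meltano container.
# All connection values below are placeholders; psycopg2 is assumed to be
# available in the container alongside Meltano's postgres support.
import sys

import psycopg2

try:
    conn = psycopg2.connect(
        host="your-db-endpoint.example.com",
        port=5432,
        dbname="your_db",
        user="your_user",
        password="...",
        connect_timeout=10,  # fail fast instead of hanging on a dead route
    )
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print("connected:", cur.fetchone())
    conn.close()
except psycopg2.OperationalError as exc:
    print("could not reach the database:", exc)
    sys.exit(1)
```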
m
Fair enough. I guess I didn't mention that this is intermittent; it seems to happen randomly, which points to a networking issue that causes a failure AND then won't allow that failure to be logged because of the same networking issue. It looks like it's losing the connection to the meta-db between when it creates the run (the id is logged in the run table) and actually starting the replication.
👀 1
d
I was working with Matt on this and we confirmed that there was an issue with the RDS instance the metadata db lives on. RDS auto-recovery was kicked off (probably due to a hardware failure), which took about 5 minutes to complete. However, the connection in meltano took around 2 hours to time out, at which point we saw this error. I'm guessing it didn't try to re-connect to mark the run as failed because of the nature of the error?
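Side note for anyone who hits this later: 2 hours is suspiciously close to the Linux default tcp_keepalive_time of 7200s, which points at a half-open connection that only gets noticed once keepalives finally fire. Enabling keepalives plus a connect timeout on the metadata-DB connection should cut that down a lot. A rough sketch of the parameters involved; this is not Meltano's actual engine setup, the endpoint and values are illustrative, and in Meltano itself the equivalent would presumably be query parameters on the database_uri setting.
```python
# Sketch: shorten the hang after an RDS failover with TCP keepalives and a
# connect timeout on the metadata-DB engine. Not Meltano's own code; it only
# shows which psycopg2/libpq parameters are involved.
from sqlalchemy import create_engine, text

engine = create_engine(
    "postgresql+psycopg2://meltano:password@metadata-db.example.com:5432/meltano",
    pool_pre_ping=True,           # detect dead pooled connections before reuse
    connect_args={
        "connect_timeout": 10,    # give up on connecting after 10 seconds
        "keepalives": 1,          # libpq TCP keepalive parameters
        "keepalives_idle": 60,
        "keepalives_interval": 10,
        "keepalives_count": 5,
    },
)

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
```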
e
What version of Meltano is this? (I'm trying to figure out where exactly it's failing and come up with an MRE.)
d
v2.20.0, we still haven't migrated to v3
e
Oh gotcha
Ok, so that probably means Meltano is on SQLAlchemy 1.4, but the error seems to be coming from the more stable parts of their API, so I don't think bumping versions would make things better here... The error is ultimately a psycopg2.OperationalError crashing things here: https://github.com/meltano/meltano/blob/4bda2aa5ae8d260f5d031c00cce77fb0b478af2f/src/meltano/core/settings_store.py#L933-L947
So I wonder if:
1. We should try catching more exceptions there, but it's not clear to me which ones (rough sketch of what I mean below)
2. The psycopg3 adapter would handle things better, but that does require a bump to v3 🤔
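Something like this is roughly the shape I have in mind for (1). It's only a sketch: the helper name, the SQL, and the retry policy are made up for illustration, not the actual settings_store.py code.
```python
# Sketch of option 1: retry the metadata-DB write when the connection drops,
# instead of letting the error escape and leave the run looking "in progress".
# _mark_run_failed, the SQL, and the retry policy are hypothetical.
import time

from sqlalchemy import text
from sqlalchemy.exc import OperationalError  # wraps psycopg2.OperationalError


def _mark_run_failed(session_factory, run_id, attempts=3, delay=5.0):
    for attempt in range(1, attempts + 1):
        try:
            with session_factory() as session:
                session.execute(
                    text("UPDATE runs SET state = 'FAILED' WHERE id = :id"),
                    {"id": run_id},
                )
                session.commit()
            return
        except OperationalError:
            # Connection-level failure (e.g. the SSL SYSCALL timeout above);
            # back off and retry on a fresh connection from the pool.
            if attempt == attempts:
                raise
            time.sleep(delay)
```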
🙌 1
d
I'm finally going to have time to work on the v3 update in the next few weeks, so I'll get that much done at least
👌 1
To follow up here, we're now running Meltano 3.4.2 and are still seeing this issue
v
can you post the additional information now that you've upgraded? logs, debug logs, meltano.yml, etc.
e
also @dean_morin, are you using psycopg3?
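(For context on why I'm asking: on the SQLAlchemy side the v3 switch is just a different driver name in the URL. Sketch below with placeholder credentials; whether your Meltano install actually bundles the psycopg v3 driver is something to verify.)
```python
# Sketch: psycopg2 vs psycopg (v3) differ only in the SQLAlchemy driver name.
# Placeholders throughout; the psycopg (v3) dialect requires SQLAlchemy 2.0+.
PSYCOPG2_URI = "postgresql+psycopg2://meltano:password@metadata-db.example.com:5432/meltano"
PSYCOPG3_URI = "postgresql+psycopg://meltano:password@metadata-db.example.com:5432/meltano"
```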
d
We're using psycopg2-binary 2.9.9. Added with:
poetry add meltano@3.4.2 --extras psycopg2
I'll get back to you on those other details
👍 1
Hey sorry for the delay, I'll DM you the files @visch
v
Can you just put them here? The odds of me having the time to go do this for you aren't high, and we try to have the community help.
d
Definitely, here they are