Hello all I was hoping for some insight on how people are ru Meltano #troubleshooting

alexander_shea

10/26/2023, 11:41 PM

Hello all. I was hoping for some insight on how people are running Meltano as part of a Python application without creating a subprocess. Today we are executing OSS Meltano as part of our Prefect flow. Prefect runs as the primary process, which is managed by containerd and Kubernetes in production. I am working to get rid of running subprocesses in Docker containers, but failing to find a good solution to do so with Meltano. The reason for this is containers are designed to run and maintain a single process. If something happens in a subprocess that causes a critical error, the container manager doesn't usually have an understanding as to why it failed. I would like to get rid of the subprocess rather than duplicate in Python the process management that already exists in the container runtime. Thanks in advance for any thoughts, ideas, guidance, or questions presented. All of it will be helpful in some way.

christoph

10/27/2023, 1:13 AM

"The reason for this is containers are designed to run and maintain a single process."

Can you elaborate on this? I understand that PID 1 is different in container runtimes than on a normal Linux system. But I don't think there are any constraints, beyond the differences in PID 1, that influence whether or not your PID 1 should have child processes inside a container runtime. Are you asking if the meltano CLI is designed to be seamlessly run as PID 1 in a container runtime?

alexander_shea

10/30/2023, 2:59 PM

Thanks @christoph for the clarifying questions. Meltano does supply an image that runs the CLI tool as the standard process, so there is no limitation to running Meltano in the container. There is also no technical limitation to running subprocesses inside of a container. The problem is the ownership and management of the process. When you start a container, the command argument becomes the first process, as you suggest. Anything to do with that process will be managed by the container runtime. If a subprocess is started within a container, the container runtime does not manage that process. If that process has a critical error which crashes the container, the container runtime will not know why. Attached is an example within a Prefect runtime. The Prefect worker is listening to events regarding the pod which it deployed to execute the flow. There was a failure in the subprocess while Meltano was running that was not captured by the container runtime.

edgar_ramirez_mondragon

10/30/2023, 5:32 PM

I'm not familiar with how Prefect runs subprocesses, but there might be a way to communicate errors in a subprocess to the main process, right? That said, there's an issue to expose the core of Meltano as a public Python API and publish it as a library, but no progress has been made in that direction. Discussion and PRs are welcome though!
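
For illustration only, here is a minimal sketch of what "communicating errors in a subprocess to the main process" could look like in plain Python, using only the standard library. The names `run_child` and `SubprocessFailed` are hypothetical helpers invented for this example; they are not part of Meltano or Prefect. The idea is to capture the child's exit status and the tail of its stderr, then raise in the parent so the failure is visible to whatever is watching the main process.

```python
import subprocess
from typing import Sequence


class SubprocessFailed(RuntimeError):
    """Hypothetical error type: the child exited with a non-zero status."""

    def __init__(self, cmd: Sequence[str], returncode: int, stderr_tail: str):
        super().__init__(
            f"{cmd[0]} exited with status {returncode}:\n{stderr_tail}"
        )
        self.returncode = returncode
        self.stderr_tail = stderr_tail


def run_child(cmd: Sequence[str], tail_lines: int = 20) -> str:
    """Run cmd, return its stdout, and raise SubprocessFailed on failure."""
    # capture_output=True collects stdout and stderr; text=True decodes them.
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode != 0:
        # Keep only the last few stderr lines so the error stays readable.
        tail = "\n".join(result.stderr.splitlines()[-tail_lines:])
        raise SubprocessFailed(cmd, result.returncode, tail)
    return result.stdout
```

Under these assumptions, a flow step could call something like `run_child(["meltano", "run", "tap-example", "target-example"])` (the plugin names are placeholders) and let the exception propagate. If the main process then exits with a non-zero status, the container runtime sees the failure as well. This does not remove the subprocess; it only makes its failure visible to the parent.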
