# docker
a
<!channel> Hi, Docker users. We were talking today in #C01QS0RV78D about potentially dockerizing taps and targets. One issue that came up was how/if these could then be orchestrated in something like ECS or ECS Fargate. Does anyone have experience in this area (Docker-in-Docker or "Dind") on AWS containerization tools or those from another cloud?
@florian.hines - By chance do you know if this would also apply in Kubernetes-based offerings like EKS?
f
I’m not sure about EKS but I think it was doable with GKE at one point. I think GKE deprecated docker, so might be you end up having to spin up a dockerd pod yourself or something along those lines.
j
I’ve never tried to run a “Dind” process, but I am using Meltano inside Dagster instances which are running on ECS Fargate. I know for sure that Fargate is no longer using Docker; they are using whatever Docker is built on (can’t remember the name) to manage the Fargate service. The way Dagster handles this (instead of “Dind”) is that its ECS run launcher grabs the code from one long-running container and injects that code into a separately defined task definition. It then issues the ECS commands to spin up that job. Granted, this EcsRunLauncher in Dagster is still experimental, so it could change. But I’ve been using it in production for almost a month now and haven’t had major beefs with it. I’m not sure if this answers your question … ?
a
@josh_lloyd - I think the challenge is, if the container you are running for a pipeline wants to invoke more containers, for example if Meltano wants to containerize the tap and target separately, we don't know which services (ECS, etc.) actually support that capability.
The way Dagster handles this is (instead of “Dind”) with their ECS run launcher they actually grab the code from one long-running container and inject that code into a separately defined task definition.
This might actually do the trick. Perhaps we should look into this... @florian.hines
I think GKE deprecated docker...
I think that's true, but (tangent warning : ⚠️ ) Dockerization is actually often a misnomer (like "Kleenex" in the US for paper tissues) and I'm very guilty of saying "dockerization" when the more accurate term is "containerization". Since Docker builds OCI-compliant images, which CRI-compliant runtimes such as containerd can run, the fact that a runtime "doesn't support docker" may or may not actually be an issue - and this created similar confusion when Kubernetes reportedly "dropped docker container runtime support". Unless we mean to say there's no ability to execute "docker run" or similar from within the container - which I guess is also possible and something we need to check into.
f
yea i think mine was more along the lines of “i don’t know if you can still mount a docker socket within the running container”
has a section on migration that might be worth a skim.
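A quick way to answer the "can you still mount a docker socket" question from inside a running container is to check whether a socket is actually present. This is only a minimal sketch: the path below is the conventional default and is an assumption about how the host or task is configured, and the helper name is made up for illustration.

```python
import os
import stat

# Conventional default location of the Docker daemon socket.
# Assumption: the platform would mount it at this path if it mounts it at all.
DOCKER_SOCKET = "/var/run/docker.sock"


def docker_socket_available(path: str = DOCKER_SOCKET) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it: treat as unavailable.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    print("docker socket mounted:", docker_socket_available())
```

A True result only says the socket file is there; it does not prove the daemon behind it will accept requests from the container.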
a
Rather than dind, it feels like the outer execution really belongs in an Orchestrator
f
+1, outside some specific scenarios (like CI setups) or local dev images I generally try to avoid docker in docker shenanigans.
a
Yep. I think it's probably useful to maybe think in terms of dandd (yes i just made that up)... but basically two entirely separate docker invocations, probably sandwiching any number of orchestration layers. In the context of Meltano, maybe something like:
• dockerized Meltano service
• Orchestrator
• dockerized sub-processes
So then whatever the docker service might be (fargate, batch, k8s, etc), that is just a matter of specification for the Orchestrator; so iow various Airflow operators, etc. Just my 2 cents.
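To make the "two separate docker invocations with an orchestrator in between" idea concrete, here is a rough sketch where the orchestrator layer is just a Python function that starts a tap container and a target container side by side and pipes one into the other. The image names in the usage note are hypothetical, and a real orchestrator (Airflow operators, ECS tasks, etc.) would replace the local `docker run` calls with its own launch mechanism.

```python
import subprocess
from typing import List, Sequence


def docker_run_cmd(image: str, args: Sequence[str] = ()) -> List[str]:
    """Build a `docker run` argv for one containerized sub-process."""
    # --rm removes the container on exit; -i keeps stdin open so a
    # target can read records piped in from the tap.
    return ["docker", "run", "--rm", "-i", image, *args]


def run_piped(producer: Sequence[str], consumer: Sequence[str]) -> int:
    """Run `producer | consumer` as sibling processes; return the consumer's exit code."""
    p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(consumer, stdin=p1.stdout)
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    p1.stdout.close()
    rc = p2.wait()
    p1.wait()
    return rc
```

Usage would look something like `run_piped(docker_run_cmd("example/tap-foo"), docker_run_cmd("example/target-bar"))`, with both containers started by the outer layer rather than by each other.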
k
I have been trying to avoid a Dind situation unless I really have to. I am quite happy with an image in ECR which I can just pull and use to spin up different ECS Fargate tasks in parallel by just overriding the command. Not sure what use case or problem dockerizing taps and targets is trying to solve, however.
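For reference, the "one image, different commands" pattern can be sketched as a small helper that builds the arguments for the ECS `RunTask` API with a per-run command override. The cluster, task definition, container name, and subnet values are placeholders, not anything from this thread, and the helper itself is hypothetical.

```python
from typing import Any, Dict, Sequence


def build_run_task_request(
    cluster: str,
    task_definition: str,
    container_name: str,
    command: Sequence[str],
    subnets: Sequence[str],
) -> Dict[str, Any]:
    """Build keyword arguments for ECS RunTask that override the container command."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        # Fargate tasks use awsvpc networking, so subnets must be supplied.
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "assignPublicIp": "DISABLED",
            }
        },
        # Same image and task definition; only the command changes per run.
        "overrides": {
            "containerOverrides": [
                {"name": container_name, "command": list(command)}
            ]
        },
    }


# With boto3 installed and AWS credentials configured, the request could be
# sent like this (not executed here):
#   import boto3
#   boto3.client("ecs").run_task(**build_run_task_request(...))
```

Calling the helper several times with different `command` values gives one request per parallel task, all sharing the same image.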