# infra-deployment
m
Is there any more documentation on deploying to AWS, specifically ECS? It seems like what's in the official Meltano docs regarding prod deployment is relatively generic and doesn't specify best practices for AWS service configuration. Google / ChatGPT aren't giving good results either, given how esoteric all of this is (not enough training data lol). There also don't seem to be many examples in this Slack, as I can't search for messages older than 90 days... I did see that there's an unfinished blog post floating around somewhere on this topic? https://gitlab.com/meltano/meltano/-/issues/590 Looks like the blog post / docs were never written...
a
Can't answer your AWS question but you can search full chat history at https://www.linen.dev/s/meltano
🙏 1
m
I've deployed Meltano using containers, but I use my own EC2 instance with Docker (and Airflow for orchestration) as opposed to having ECS manage running the containers. I can answer questions on the containerization side, but AWS has so many ways to do things it's hard to cover them all 😅
v
At the end of the day there's an infinite number of ways to schedule a process to run. Meltano is a process. You need to run it, capture stdout/stderr, track the exit code somewhere, and you're good. That's the mental model; just pick something your team already uses.
If your team doesn't have something, then use GitLab CI or GitHub Actions.
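A minimal sketch of that mental model in Python, assuming nothing beyond the standard library. `run_and_capture` and `RunResult` are names made up for this illustration, and the plugin names in `main` (`tap-example`, `target-example`) are placeholders, not real plugins from this thread:

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class RunResult:
    exit_code: int
    stdout: str
    stderr: str


def run_and_capture(cmd: list[str]) -> RunResult:
    """Run a command to completion, capturing stdout, stderr, and the exit code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return RunResult(proc.returncode, proc.stdout, proc.stderr)


def main() -> int:
    # Placeholder plugin names; substitute whatever your Meltano project defines.
    result = run_and_capture(["meltano", "run", "tap-example", "target-example"])
    # Forward the captured streams so the scheduler's log collection can see them.
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    # Return the exit code so the caller (cron, CI, Airflow, ...) can mark failure.
    return result.exit_code
```

Whatever scheduler you pick would call something like `sys.exit(main())`, so a non-zero exit code from Meltano shows up as a failed job.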
m
@mark_estey So based on the chat history, it seems like no one runs Airflow in ECS and almost everyone uses MWAA to trigger ECS containers with operators. Is the reason for this that offloading the infra to AWS is worth the cost and limited configurability of MWAA?
m
A lot of infrastructure choices depend on the stack you use and the scale you need, so there's no one-size-fits-all answer. I simply wanted Airflow since it fit into my stack neatly and was open-source and portable, so it wouldn't lock us into AWS: because it was containerized, along with the rest of our workflows, I could stand it up anywhere else super easily. Also, when I looked at ECS it seemed better for workloads where there are many short-lived containers with varying hardware needs (CPU vs memory), or to get cost savings for workloads that could be delayed to run on spot prices. But for long-lived services where containers run 24/7/365, like Airflow, GitLab, or other services, ECS ended up being more expensive than just getting a dedicated EC2 host running Docker.
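For the short-lived-container case, here is a rough sketch of what a one-off Meltano run on ECS could look like. It only builds the request for boto3's ECS `run_task` call. Everything passed in (cluster, task definition, container name, subnets, security groups) is a hypothetical placeholder, Fargate is an assumption rather than something anyone here said they use, and it assumes the image's entrypoint is already `meltano`, so `command` carries only the arguments:

```python
def build_run_task_request(
    cluster: str,
    task_definition: str,
    container_name: str,
    meltano_args: list[str],
    subnets: list[str],
    security_groups: list[str],
) -> dict:
    """Build keyword arguments for boto3's ECS client `run_task` call."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",  # assumption; use "EC2" for EC2-backed clusters
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "DISABLED",
            }
        },
        "overrides": {
            "containerOverrides": [
                # Assumes the image entrypoint is `meltano`, so only args go here.
                {"name": container_name, "command": list(meltano_args)}
            ]
        },
    }


# Usage (not executed here; needs boto3 and AWS credentials):
#   import boto3
#   request = build_run_task_request(...)
#   boto3.client("ecs").run_task(**request)
```

The task then runs once and exits, so you only pay while the pipeline is actually running; the exit code and logs would come from the stopped task and whatever log driver the task definition configures.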
🙏 1
m
Really appreciate the info--thank you!
d
@Michael Bi we have a blog on ECS + self-hosted AWS and Terraform for deployment here