# infra-deployment
f
Any thoughts on running Meltano ELT pipelines in an AWS Lambda function?
t
I'm sure @ken_payne will have some when he's back on. The big thing is that since Lambdas are stateless and have a fixed max runtime, you'll have to account for that in the source. But there's no reason you couldn't use them 👍
e
The biggest bummer for me is the hard 15-minute timeout. You'd have to make sure the process is cleanly terminated before the timeout hits so that state is saved, and that can be very tricky.
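A rough sketch of that "terminate cleanly before the timeout" idea, in case it helps. It assumes Meltano is invoked as a subprocess and that it persists state when it receives SIGTERM, which is worth verifying for your taps and targets. The `tap-example` / `target-example` names and the `example-job` ID are placeholders, and the `--job_id` option matches the job_id usage in this thread (newer Meltano releases may name it differently). `context.get_remaining_time_in_millis()` is the standard Lambda context method.

```python
import subprocess


def run_with_deadline(cmd, remaining_ms, safety_margin_ms=30_000, grace_s=10):
    """Run cmd, asking it to stop before the Lambda deadline is reached.

    Returns (exit_code, timed_out).
    """
    # Leave a safety margin so the child has time to flush state.
    budget_s = max((remaining_ms - safety_margin_ms) / 1000.0, 0.0)
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=budget_s), False
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX: a polite request to shut down
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # last resort; state may not be saved in this case
            proc.wait()
        return proc.returncode, True


def handler(event, context):
    # Placeholder pipeline; swap in your own extractor, loader, and job ID.
    cmd = ["meltano", "elt", "tap-example", "target-example",
           "--job_id", "example-job"]
    code, timed_out = run_with_deadline(
        cmd, context.get_remaining_time_in_millis()
    )
    return {"exit_code": code, "timed_out": timed_out}
```

The safety margin and grace period are guesses; tune them to how long your loader needs to finish its final batch.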
f
We plan on running these particular jobs every 5 minutes. I don't think we will be extracting and loading more data than can be produced in 5 minutes, so for this use-case I'm thinking we are OK, at least for a POC until a more elaborate solution is designed and deployed.
h
I'm happily running them in AWS Batch, with Fargate underneath and an RDS Postgres database.
With a five-minute interval, the only concern I can think of is making sure you don't trigger one Lambda while the previous one is still running.
f
I believe that might be handled by the database: I force-cancelled a job before, and on the next run it warned me that the previous one was not complete, so I had to `--force` the new job (same job_id). Can anyone confirm that?
c
Yep, you just need to set up a persistent RDS database for Meltano to use, and it will prevent concurrent runs of the same job.
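To illustrate why a shared, persistent database can stop overlapping runs, here is a toy sketch of the general idea. It is not Meltano's actual implementation: the `job_lock` table and the function names are made up for illustration, and `sqlite3` stands in for an RDS Postgres database so the example is self-contained.

```python
import sqlite3
import time


def init(conn):
    # One row per running job; the primary key enforces "at most one".
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_lock ("
        "job_id TEXT PRIMARY KEY, started_at REAL NOT NULL)"
    )
    conn.commit()


def try_acquire(conn, job_id, force=False):
    """Return True if this run may start, False if the job looks active."""
    if force:
        # Discard a stale lock, e.g. one left behind by a killed run.
        conn.execute("DELETE FROM job_lock WHERE job_id = ?", (job_id,))
    try:
        conn.execute(
            "INSERT INTO job_lock (job_id, started_at) VALUES (?, ?)",
            (job_id, time.time()),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def release(conn, job_id):
    conn.execute("DELETE FROM job_lock WHERE job_id = ?", (job_id,))
    conn.commit()


conn = sqlite3.connect(":memory:")
init(conn)
first = try_acquire(conn, "example-job")               # first run starts
second = try_acquire(conn, "example-job")              # overlapping run is refused
forced = try_acquire(conn, "example-job", force=True)  # override a stale lock
```

A run that dies without calling `release` leaves its row behind, which mirrors the situation described above where a cancelled job had to be overridden with a force option.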
v
AWS Batch works really well if you have a container you can use. I've used it for a number of things and it tends to just "work".
It's just as simple as Lambda, arguably simpler imo, as long as you set aside the need for a container.