Anup N
08/13/2024, 2:58 PM
Edgar Ramírez (Arch.dev)
08/13/2024, 4:38 PM
> Alloy dynamically (the columns aren't constant) type casts the columns' data types before inserting them into BigQuery. I couldn't figure out a way to do this using Meltano.
Do you mean auto-detecting the column type based on the source metadata? If the gsheets extractor doesn't currently do that (I'm not aware that it does), it shouldn't be too hard to implement, actually. I'd take a look at https://github.com/Matatika/tap-google-sheets/ to confirm.
> Also, I am not sure how the infra is going to be set up. I am doing it as a pet project with the motive of making our connectors list much better and would love to hear your opinions on building a solid infra which scales
One advantage of Meltano is that you can start rather small (e.g. on GitHub Actions) and incrementally migrate to a cloud/containerized environment using an S3 state backend, for example. The latter is how Arch.dev runs Meltano, so it definitely works. I'm gonna let others comment on the particulars of the connectors in case they've used them.
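On the column-typing point above, here is a minimal sketch of what "auto-detecting the column type" could look like, assuming the sheet arrives as a header row plus rows of raw string cells of equal length. The function names `infer_json_type` and `infer_schema` are hypothetical and are not part of tap-google-sheets or the Singer SDK; the output is just a JSON Schema-style dict of the kind a tap emits in its schema messages.

```python
from datetime import datetime


def infer_json_type(values):
    """Guess a JSON Schema fragment for one column of raw string cells."""
    # Ignore empty cells; every inferred type is nullable anyway.
    non_empty = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not non_empty:
        return {"type": ["string", "null"]}

    def all_parse(parse):
        # True only if every non-empty cell parses without a ValueError.
        for v in non_empty:
            try:
                parse(v)
            except ValueError:
                return False
        return True

    if all(v.lower() in ("true", "false") for v in non_empty):
        return {"type": ["boolean", "null"]}
    if all_parse(int):
        return {"type": ["integer", "null"]}
    if all_parse(float):
        return {"type": ["number", "null"]}
    if all_parse(datetime.fromisoformat):
        return {"type": ["string", "null"], "format": "date-time"}
    return {"type": ["string", "null"]}


def infer_schema(header, rows):
    """Build an object schema from a header row and equally long data rows."""
    columns = list(zip(*rows)) if rows else [[] for _ in header]
    return {
        "type": "object",
        "properties": {
            name: infer_json_type(col) for name, col in zip(header, columns)
        },
    }


if __name__ == "__main__":
    header = ["id", "price", "active", "created", "note"]
    rows = [
        ["1", "9.5", "true", "2024-08-13", "a"],
        ["2", "10", "FALSE", "2024-08-14T10:00:00", ""],
    ]
    print(infer_schema(header, rows))
```

Checks run from most to least specific, so a column only falls back to string when nothing stricter fits every non-empty cell. A real extractor would probably sample rows rather than scan the whole sheet.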
Anup N
08/17/2024, 10:51 AM
Edgar Ramírez (Arch.dev)
08/18/2024, 6:58 PM
> I see the way to handle this is by writing a custom transformation
Hey, is this a recommendation you saw in the docs or somewhere else? You could always override the extractor schema.
> We run on Google Cloud Platform
This might be helpful then: https://medium.com/data-manypets/how-to-run-meltano-in-a-container-on-google-cloud-composer-860783d0575c
Anup N
08/21/2024, 3:10 AM
Anup N
08/22/2024, 6:29 AM
Edgar Ramírez (Arch.dev)
08/22/2024, 11:01 PM
> Question 1: How to run a meltano pipeline for multiple accounts/credentials (with different configurations) without changing anything in the source configuration (as once meltano project is hosted it should be ideally immutable). The configuration should be dynamic both for extractors as well as loaders
You could use environment variables: https://docs.meltano.com/guide/configuration/#configuring-settings
For example, some people create an ECS task definition with a Meltano container, then pass environment variables at runtime.
> Question 2: Can I run concurrent pipelines?
This is easy to do in the ECS example I mentioned above: a single task definition can be run concurrently any number of times.
> Question 3: How can I configure meltano to run on a trigger basis (say from an endpoint or a webhook)
This isn't natively supported by Meltano. The latter two of these requirements especially are probably outside the scope of Meltano itself, since it's not a long-running service but rather a command-line application that runs to completion and then exits. So, on its own, it can't fan out over multiple configurations, react to external events via webhooks, or run multiple pipelines in parallel. Is that helpful?
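To make the environment-variable approach from Questions 1 and 2 concrete, here is a rough sketch of a small external launcher that starts one `meltano run` per account, each with its own settings injected as environment variables. Several things here are assumptions: the plugin names `tap-google-sheets` and `target-bigquery`, the setting names behind `TAP_GOOGLE_SHEETS_SHEET_ID` and `TARGET_BIGQUERY_DATASET` (Meltano derives variable names as `<PLUGIN>_<SETTING>`, but the exact settings depend on the plugin variant), and the account data, which is made up. The helpers `build_run`, `run_account` and `run_all` are hypothetical. The `--state-id-suffix` option is used so each account gets its own state. Check it against the Meltano version you run.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-account overrides; in practice these would come from a
# secrets manager or the orchestrator, not from source code.
ACCOUNTS = {
    "acme": {
        "TAP_GOOGLE_SHEETS_SHEET_ID": "sheet-id-for-acme",
        "TARGET_BIGQUERY_DATASET": "acme_raw",
    },
    "globex": {
        "TAP_GOOGLE_SHEETS_SHEET_ID": "sheet-id-for-globex",
        "TARGET_BIGQUERY_DATASET": "globex_raw",
    },
}


def build_run(account, overrides, base_env=None):
    """Return the command and environment for one account's pipeline run."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(overrides)  # the project files themselves stay untouched
    cmd = [
        "meltano",
        "run",
        f"--state-id-suffix={account}",  # keep incremental state per account
        "tap-google-sheets",
        "target-bigquery",
    ]
    return cmd, env


def run_account(account, overrides):
    """Run one pipeline to completion and report its exit code."""
    cmd, env = build_run(account, overrides)
    result = subprocess.run(cmd, env=env, check=False)
    return account, result.returncode


def run_all(accounts, max_workers=4):
    """Fan out over accounts; each run is a separate Meltano process."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_account, name, cfg) for name, cfg in accounts.items()
        ]
        return dict(f.result() for f in futures)


if __name__ == "__main__":
    print(run_all(ACCOUNTS))
```

The fan-out lives outside Meltano, which matches the point that Meltano itself just runs to completion and exits. With containers (the ECS example), each run would be its own task instead of a local subprocess. Concurrent local runs sharing one project directory also share its system database, so a container per run is likely the safer setup.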
Anup N
08/24/2024, 11:26 AM