pat_nadolny
10/28/2022, 6:21 PM
pat_nadolny
10/28/2022, 6:21 PM
pat_nadolny
10/28/2022, 6:22 PM
pat_nadolny
10/28/2022, 6:26 PM
andy_crowe
10/31/2022, 9:07 PM
parquet format (and pattern for future formats) here, still lots of tweaking needed, but I think this will be functional for me at the moment, feedback welcome 🙂
andy_crowe
11/01/2022, 3:29 AM
smart_open, I think we could use this as a foundation and build a multi-cloud, multi-format target with this pattern. I didn’t quickly see how to save parquet with the smart_open library, would be happy to refactor it.
aaronsteers
11/01/2022, 6:40 AM
parquet file format. DuckDB or arrow could in theory be used to generate the Parquet dataset, then smart_open or PyFilesystem could be used to upload/write the file bytes to the respective cloud.
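A minimal sketch of that two-step idea, assuming `pyarrow` and `smart_open` are installed (with the relevant cloud extra, e.g. `smart_open[s3]`). The function names `records_to_parquet_bytes` and `upload_bytes` and the bucket URI are hypothetical, made up for illustration. This is not the code of any existing target.

```python
import io
from typing import Any, Callable, Iterable, Mapping, Optional


def records_to_parquet_bytes(records: Iterable[Mapping[str, Any]]) -> bytes:
    """Serialize a batch of dict records to Parquet bytes in memory."""
    # Imported lazily so the upload helper below works without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pylist(list(records))
    buffer = io.BytesIO()
    pq.write_table(table, buffer)  # write_table accepts a file-like object
    return buffer.getvalue()


def upload_bytes(payload: bytes, uri: str, opener: Optional[Callable] = None) -> int:
    """Write raw bytes to a URI and return the number of bytes written.

    By default this uses smart_open.open, which resolves schemes such as
    s3://, gs:// and azure:// as well as local paths. The `opener`
    parameter is a hypothetical hook for swapping in another writer.
    """
    if opener is None:
        from smart_open import open as opener  # lazy import, assumption: installed
    with opener(uri, "wb") as sink:
        sink.write(payload)
    return len(payload)


# Example usage (hypothetical bucket and key):
# payload = records_to_parquet_bytes([{"id": 1, "name": "a"}])
# upload_bytes(payload, "s3://my-bucket/output/data.parquet")
```

Keeping serialization and upload separate means the format side (Parquet, CSV, ...) and the storage side (S3, GCS, Azure, local) can vary independently, which is the multi-cloud, multi-format pattern discussed here.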
I've added a comment to this effect to my new issue proposal for a generic multi-cloud target. To be clear, there may be other challenges I've not foreseen. In total, the dev effort could be significant... but I do feel that it would be a valuable addition to our currently available targets if there's a path forward here.
pat_nadolny
11/01/2022, 1:28 PM
"what repo should this live in? personal repo to start?"
@andy_crowe awesome to hear you've made progress! It's up to you really. Many people leave them in their own personal or organization's GitHub repo. But if you don't want it in your personal repo, we've created MeltanoLabs for that exact reason (see this blog post for what we view as the connector ownership models): you have the option to have the repo live there, but you'd still be the primary maintainer. Or you could always wait and migrate it out of your personal namespace down the line. It's up to you.
pat_nadolny
04/03/2023, 5:52 PM
andy_crowe
04/03/2023, 6:21 PM
andy_crowe
04/03/2023, 6:54 PM
pat_nadolny
04/03/2023, 7:33 PM