# random
h
Hey, I am wondering if I can use Meltano for multi-cloud data transfer. Will it save cost? Use case: transferring a data table from BigQuery to AWS S3; otherwise I'd use the AWS Glue plugin. Has someone done this before using Meltano? I bet a few of you must have 😉
a
https://cloud.google.com/bigquery/docs/omni-aws-export-results-to-s3
EXPORT DATA WITH CONNECTION CONNECTION_REGION.CONNECTION_NAME
   OPTIONS(uri="<s3://BUCKET_NAME/PATH>", format="FORMAT", ...)
   AS QUERY
Is this useful? It saves you quite a bit of work if it works for you.
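For concreteness, a filled-in version of that statement might look like the sketch below. The connection, bucket, dataset, and table names are all made up, and this assumes a BigQuery Omni connection to the target AWS region is already set up:

```sql
-- Hypothetical example: all identifiers are placeholders, not from this thread.
EXPORT DATA WITH CONNECTION `aws-us-east-1.my_s3_connection`
  OPTIONS (
    uri = "s3://my-export-bucket/exports/orders/*",  -- wildcard required for sharded output
    format = "PARQUET"
  )
AS SELECT * FROM my_dataset.orders;
```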
a
Was going to suggest something similar... if you can export data from GCP to S3 natively, you can save on egress/ingress costs. With the Meltano Singer SDK's batch message support, you could (in theory) add S3 batch support to tap-bigquery. To my knowledge, tap-bigquery doesn't yet support native S3 BATCH messaging, but it's something that could be added for sure. @alexander_butler may know more about pricing of export to S3, but this does depend on how/if Google charges for the EXPORT DATA syntax that Alex describes above.
(Even without this, you could, in theory, run your own export process and then pick up the files from S3.)
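To illustrate what "BATCH messaging" means here: instead of emitting one RECORD message per row, the tap emits a single message pointing at files that already hold the data. A sketch of that message shape is below; the stream name and S3 keys are invented for illustration, so check the Singer SDK docs for the exact fields your SDK version expects:

```python
import json

# Hedged sketch of a Singer BATCH message referencing files already
# exported to S3 (stream name and object keys are hypothetical).
batch_message = {
    "type": "BATCH",
    "stream": "my_bigquery_table",
    "encoding": {"format": "jsonl", "compression": "gzip"},
    "manifest": [
        "s3://my-export-bucket/exports/part-000.jsonl.gz",
        "s3://my-export-bucket/exports/part-001.jsonl.gz",
    ],
}

# The target reads the manifest and loads the files directly,
# skipping per-row serialization entirely.
print(json.dumps(batch_message))
```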
h
I am thinking of setting up tap-bigquery with target-s3
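The wiring for that in a Meltano project would be roughly the sketch below. The plugin names are the common ones on MeltanoHub, but the variants and config keys aren't verified here, so check the hub pages for your chosen variants:

```yaml
# Hypothetical meltano.yml fragment; variants/config omitted on purpose.
plugins:
  extractors:
    - name: tap-bigquery
  loaders:
    - name: target-s3
```

You would then run the pipeline with `meltano run tap-bigquery target-s3`. Note that this path streams rows through the tap on whatever machine runs Meltano, so it doesn't avoid egress the way the native EXPORT DATA route does.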
j
@aj_steers how would I go about installing S3 batch support to `tap-postgres`? Based on this thread and the conversation in #1087 I understand that the `s3` extra has to be added to the `singer-sdk` tap dependency. I've tried that with `singer-sdk = { version="~0.33.0", extras = ["s3"] }` but I'm still getting the same error: `fs.opener.errors.UnsupportedProtocol: protocol 's3' is not supported`. There's probably a step I'm missing here but I'm not sure how to proceed. Thanks in advance!
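For context on that error: the `s3` extra is meant to pull in `fs-s3fs`, which is what registers the `s3://` opener with PyFilesystem, so the `UnsupportedProtocol` error usually means `fs-s3fs` didn't end up in the tap's installed environment. One thing to try (a sketch, assuming Poetry syntax; the version pins are illustrative) is depending on it directly alongside the extra and reinstalling the plugin:

```toml
# Hypothetical pyproject.toml fragment, not a verified fix.
[tool.poetry.dependencies]
singer-sdk = { version = "~0.33.0", extras = ["s3"] }
fs-s3fs = "^1.1.1"
```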