Question for those using Postgres Tap and Target ...
# best-practices
e
Question for those using the Postgres tap and target: I have an area I'll call the "Data Downloader", running Meltano. It taps a data source and pushes it to a PostgreSQL DB that runs next to Meltano in a container. I have another area called the "Production Env", running PostgreSQL, which stores the prod data. My question is: how do I best append the data from one PostgreSQL DB to the other, i.e. from the PostgreSQL DB in the Data Downloader to the PostgreSQL DB in the Production Env? New data from the downloader should not erase what is already in production. I cannot connect the two together locally over the LAN, so I was thinking I need to take .sql snapshots of the data and save them to a parquet or .sql file, then walk the data over via USB drive (the two environments are air-gapped for security reasons).
And yes, I am aware that USB drives are a risk, but with sanitization they are a more controllable way to manage the two environments than explicitly connecting them over the network.
Thanks for any ideas or suggestions... maybe I just .zip something up and push it to S3, then have Meltano in the Production Env pull it down periodically?
A picture of the network topology:
So as you can see, it's not TOTALLY isolated, but it sits behind LXD and multiple NATs.
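One low-tech way to get the append-only transfer described above is a data-only dump walked over on the USB drive. A minimal sketch, assuming the production schema already has the tables and using placeholder connection strings and file paths:

```bash
# Data Downloader side: dump only the rows (no DROP/CREATE), custom format for pg_restore
pg_dump --data-only --format=custom \
  --dbname=postgresql://meltano:secret@localhost:5432/warehouse \
  --file=/media/usb/warehouse_$(date +%Y%m%d).dump

# Production side: restore appends rows into the existing tables; nothing is dropped
pg_restore --data-only \
  --dbname=postgresql://prod_user:secret@localhost:5432/prod \
  /media/usb/warehouse_20240101.dump
```

A plain data-only restore will error on primary-key conflicts if overlapping rows are ever re-imported, so you'd either dump only new rows or de-duplicate on the production side.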
t
Yeah, it seems like writing to a generic blob store might be a good solution. You can then run another ELT for that data. I'm not personally a fan of walking data over, but if you must… 😕
FYI @edgar_ramirez_mondragon since you were working on a generic blob target 😄
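A rough sketch of that blob-store staging pattern; the plugin names, bucket, and output folder here are assumptions, so swap in whatever your Meltano projects actually use:

```bash
# Data Downloader side: land the extract as local JSONL files, then push them to a staging bucket
meltano run tap-postgres target-jsonl
aws s3 cp output/ "s3://my-staging-bucket/$(date +%Y%m%d)/" --recursive   # "output/" is target-jsonl's default folder

# Production side: pull the batch down and run a second ELT step against the files
aws s3 cp "s3://my-staging-bucket/20240101/" ./incoming/ --recursive
# ...then load ./incoming/ into the production Postgres with the second pipeline
```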
e
Yup, it's at https://github.com/MeltanoLabs/target-jsonl-blob/. It should be useful for pushing blobs of Singer messages to a staging area in object storage, then downloading and piping them into a Singer target.
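The production side of that flow could look roughly like this once the staged blobs have been brought across; the paths and config file name are placeholders, and any Singer target reads its messages from stdin:

```bash
# Production side: replay the staged Singer messages straight into the production target
cat /media/usb/staged/*.jsonl | target-postgres --config target_postgres_config.json
```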
e
@edgar_ramirez_mondragon, may I give this a spin with S3? This sounds perfect for my use case. I was going to write something myself, but if there's already a target, I'm happy to give it a spin, provided it supports the security model I'm trying to hold to. Edit: nice, I see the S3 support. Yeah, I'm all over this target then. If there are any gotchas I should know about before using it, let me know, and maybe I can give it a try as well and PR something.
e
@emcp the biggest gotcha may be the lack of rollback support atm: if the pipeline fails, you might still end up with files in S3.
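A possible mitigation for that gotcha, sketched under the assumption that the target can be pointed at a run-scoped prefix (the `prefix` setting name below is made up, so check the target's actual config), is to scope each run and clean up on failure:

```bash
# Give every run its own prefix so partial uploads never mix with finished batches
RUN_PREFIX="staging/$(date +%Y%m%dT%H%M%S)"
# "prefix" is an assumed setting name; point the target at the run-scoped location
meltano config target-jsonl-blob set prefix "$RUN_PREFIX"

if meltano run tap-postgres target-jsonl-blob; then
  # Only a successful run gets the marker the production side checks for before pulling
  aws s3api put-object --bucket my-staging-bucket --key "$RUN_PREFIX/_SUCCESS"
else
  # Failed run: delete whatever partial files made it up so nothing half-written is ever pulled
  aws s3 rm "s3://my-staging-bucket/$RUN_PREFIX/" --recursive
fi
```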