aaronsteers
05/19/2021, 12:06 AM
1. No support for “raw artifacts”:
I typically write taps against metered/paid APIs and send them to an S3/GCS target. I want these to be in as raw a format as possible (ELT) so if there are any issues with the code, I don’t have to call the APIs again and incur a cost.
For example, the raw artifact could be an API response, or it could be a ZIP file from an FTP server.
My thoughts: I think the Singer spec can be extended to support API -> Artifact -> Records, and I have a design in mind.
This relates to something @ken_payne was working on, where the interim artifact was a Tableau workbook and we wanted to reuse the artifact for multiple child streams. In that case we did not cache the file, but it sparked some of the work we are doing on parent-child streams, which lets intermediate ‘parent streams with artifacts’ seed their data into child streams. Does that sound close to what you are looking for?
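To make the API -> Artifact -> Records idea concrete, here is a rough sketch in plain Python. It is not the Singer spec or any SDK's actual API: `get_artifact`, `records_from_artifact`, `fake_api`, the `.raw` file suffix, and the JSON layout are all hypothetical names and assumptions made up for illustration. The point is only that the paid call happens once, the raw bytes are kept, and several child streams read from the same stored artifact.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def get_artifact(cache_dir: Path, key: str, fetch: Callable[[], bytes]) -> Path:
    """Return the stored raw artifact, calling the (metered) API only on a cache miss."""
    path = cache_dir / f"{key}.raw"  # hypothetical naming scheme
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())  # store the response exactly as received
    return path


def records_from_artifact(path: Path, stream: str) -> Iterator[dict]:
    """Child-stream step: parse one stream's records out of the raw artifact."""
    payload = json.loads(path.read_bytes())  # assumes the artifact is JSON
    yield from payload.get(stream, [])


# Stand-in for a paid API; counts how often it is actually called.
calls = []


def fake_api() -> bytes:
    calls.append(1)
    body = {"sheets": [{"id": 1}], "users": [{"id": 7}, {"id": 8}]}
    return json.dumps(body).encode()


cache = Path(tempfile.mkdtemp())

# Two child streams are seeded from the same parent artifact.
sheets = list(records_from_artifact(get_artifact(cache, "workbook", fake_api), "sheets"))
users = list(records_from_artifact(get_artifact(cache, "workbook", fake_api), "users"))
```

If the parsing code turns out to be wrong, only `records_from_artifact` needs to be rerun against the stored file, so the API is not billed a second time. A real target would presumably write the artifact to S3/GCS rather than a temp directory.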
aaronsteers
05/19/2021, 12:07 AM