# singer-target-development
c
I hope everyone is enjoying their long weekend in the US. I am wondering if anyone has ever had a situation where they needed to capture (and store) the response values from an HTTP REST API target? I.e. I want to talk to a REST API with Meltano to create new records and update existing records in the destination system. The destination system assigns its own new primary key ID on create and provides it in the response to the POST request. I want to capture that primary key ID from the destination system and store it in my own internal "tracking" system to link it up with my own internal (original) copy of the record. My internal tracking system is basically the source of truth where new records are born, but I need to keep track of the primary key ID of the "synced" records from the external HTTP REST API destination. Has anybody ever thought of doing something like this with Meltano? (The closest thing I could think of was Derek's AutoIDM solution.)
Right now, my starting point is this `_after_process_record()` hook: https://github.com/meltano/sdk/blob/7962ebc9a9a0ff14e77e6aeb1b76d8e5cda2bc73/singer_sdk/sinks/core.py#L341-L347 I'd just need to figure out how much data will already be in the `context` and how much I would need to push onto the `context` in addition.
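Roughly what I have in mind, as a sketch only: the endpoint URL and the `_track_assigned_id` helper below are made up, and I'm assuming the private `_after_process_record()` hook receives the same `context` dict that `process_record()` filled in.

```python
"""Sketch (not production code) of capturing a destination-assigned ID."""
import requests
from singer_sdk.sinks import RecordSink


class MySaasSink(RecordSink):
    """Posts each record and remembers the ID the destination assigned."""

    def process_record(self, record: dict, context: dict) -> None:
        # Hypothetical REST endpoint; the real URL/auth would come from config.
        resp = requests.post(
            "https://api.example.com/records",
            json=record,
            timeout=30,
        )
        resp.raise_for_status()
        # Stash the destination's primary key on the context so the
        # post-processing hook can see it alongside the original record.
        context["assigned_id"] = resp.json().get("id")
        context["source_record"] = record

    def _after_process_record(self, context: dict) -> None:
        # Hand the (source record, destination key) pair to whatever tracking
        # store you use -- `_track_assigned_id` is a made-up helper.
        if "assigned_id" in context:
            self._track_assigned_id(
                source=context["source_record"],
                destination_id=context["assigned_id"],
            )
```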
a
@christoph - We haven't defined an official method of adding artifacts like these in the sync operation. Taps can freely put anything they want into `STATE`, but there's no similar option for targets. I've opened a new discussion on this topic here: SaaS targets: strategy to maintain state and/or other artifacts created · Discussion #1229 · meltano/sdk (github.com)
cc @visch, @edgar_ramirez_mondragon 👆
c
Thanks AJ! Makes perfect sense to me. My current use case actually falls squarely into the `surrogate_key_lookup_table` bucket, and that's what I have put together for now using the Target SDK.
Since my target is not a SQL target, I just use a Redis list as the storage (since I already have Redis in my tech stack), and then I have another pipeline that picks up all those JSON strings from the Redis list and puts them back into the data warehouse staging area, so I can use those lookup tables in my models for future "sync" runs. It's not the most elegant solution, but that's what I was able to cobble together for now without much hassle.
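For illustration, the Redis hand-off is roughly this; the list name and field names are just placeholders.

```python
"""Sketch of pushing one JSON string per synced record onto a Redis list.

A separate pipeline later drains the list into the warehouse staging area
to build the surrogate-key lookup table.
"""
import json

import redis

r = redis.Redis(host="localhost", port=6379, db=0)


def record_key_mapping(internal_id: str, external_id: str, stream: str) -> None:
    """Append an (internal id, destination id) pair to the lookup queue."""
    payload = json.dumps(
        {
            "stream": stream,
            "internal_id": internal_id,   # our source-of-truth key
            "external_id": external_id,   # key assigned by the REST API
        }
    )
    # "surrogate_key_lookup" is an assumed list name.
    r.rpush("surrogate_key_lookup", payload)
```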
v
My thought with this is that if you think about the target like a "mapper", in the sense that it's both a tap and a target, you kind of get all of this for "free" (minus a guarantee that something is consuming the stdout data). Today I just log the parts of the JSON response I want, if there are any. There are some very valid use cases for tracking things like: what did the target change? For SaaS use cases this can be very nice. Right now I kind of fake it, as I keep track of what was sent to the target, not what the target actually changed.
I'm not doing it but that's the extent of my thoughts this far on it 🤷
> But I need to keep track of the primary key ID of the "synced" records from the external HTTP REST API destination.
I have needed this a few times, but luckily it's always within the same system, e.g. create an account in AzureAD, then I need to add a manager to the account with that ID, and add that ID to groups in AzureAD. So it's pretty simple. For other integrations I tend to say that they should use tap-azuread (in this case) to pull the data themselves.
My conclusion thus far is that I don't need it to get the job done, and it keeps things pretty simple. It would definitely help simplify some of my transformation logic if I followed a standard like "SaaS targets must output the record sent to the target as a record in stdout".
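Something like this is what I have in mind for that standard. It's just a sketch of writing a Singer RECORD message to stdout after a successful POST, not an existing SDK feature; the function name and fields are made up.

```python
"""Sketch of the "target as mapper" idea: after a successful POST, the
target re-emits the record (now including the destination-assigned ID)
as a Singer RECORD message on stdout for a downstream consumer.
"""
import json
import sys
from datetime import datetime, timezone


def emit_synced_record(stream: str, record: dict, assigned_id: str) -> None:
    message = {
        "type": "RECORD",
        "stream": stream,
        "record": {**record, "id": assigned_id},
        "time_extracted": datetime.now(timezone.utc).isoformat(),
    }
    # Anything consuming this target's stdout (if anything is) sees what
    # the destination actually stored, not just what we sent.
    sys.stdout.write(json.dumps(message) + "\n")
    sys.stdout.flush()
```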
Ok, one new thought that seems interesting: we want SaaS targets to define a schema anyway for what they accept. The schema we define for the output from the SaaS target mapper would be the same schema (maybe?)
Maybe you'd even call them something other than a target, since we'd enforce the behavior? Anyway, there are some ideas 😄
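As a rough illustration of the schema-reuse idea (again not an existing SDK feature; `sink` here stands for a generic SDK Sink instance, and the output convention is hypothetical):

```python
"""Sketch: the target's own declared input schema doubles as the SCHEMA
message for whatever it re-emits, so no second schema definition is needed.
"""
import json
import sys


def emit_output_schema(sink) -> None:
    # The same JSON Schema the target validates incoming records against
    # describes the records it writes back out.
    message = {
        "type": "SCHEMA",
        "stream": sink.stream_name,
        "schema": sink.schema,
        "key_properties": sink.key_properties,
    }
    sys.stdout.write(json.dumps(message) + "\n")
    sys.stdout.flush()
```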