# singer-tap-development
d
I'm taking a stab at targets and throwing my hat into the ring with a target-salesforce 🤠 I'm debating how to handle update/insert/upsert/delete/hard deletes. Each type is a different API call with simple-salesforce.
• Does it make sense to have the streams segmented by type?
  ◦ Example: `Account-delete`, `Account` (default to update), `Account-upsert`, ...
• Or do I keep all the records in the `Account` stream and allow for an optional `_type` field to dictate the behavior? This seems cleaner (see the sketch after this list).
  ◦ This is easy if I write each record individually, but I'm not sure about sub-segmenting a batch sink. Any ideas? Is there a target that has already addressed something similar?
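For illustration, here is what the two options could look like as Singer RECORD messages, written out as Python dicts. A minimal sketch: the stream names, IDs, and the `_type` field are all hypothetical.

```python
# Option 1: the action is encoded in the stream name.
option_1 = [
    {"type": "RECORD", "stream": "Account-delete", "record": {"Id": "001AAA"}},
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001BBB", "Name": "Acme"}},
]

# Option 2: a single Account stream, with an optional per-record action field.
option_2 = [
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001BBB", "Name": "Acme", "_type": "update"}},
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001AAA", "_type": "delete"}},
]
```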
v
I don't know what works best, but what I use, which works from your example, is an `Account` stream with an `action` column (probably a better name would be `_sdc_action`) with values of `insert`, `update`, and `delete`.
The thing that would be very helpful to have in these types of targets is a schema that the stream should conform to. It wouldn't be that crazy, I just think the record should have to conform to the schema `Account` accepts. Right now I just do `record["name"]`, `record["id"]`, etc., and you can run into typing issues with that.
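A minimal sketch of that action dispatch, assuming simple-salesforce and the `_sdc_action` field suggested above; the function and the default-to-update behavior are illustrative, not from an existing target:

```python
from simple_salesforce import Salesforce

def process_record(sf: Salesforce, record: dict) -> None:
    """Route a single Account record to the matching simple-salesforce
    call based on its _sdc_action value (defaulting to update)."""
    action = record.pop("_sdc_action", "update")
    if action == "insert":
        sf.Account.create(record)
    elif action == "update":
        sf.Account.update(record.pop("Id"), record)
    elif action == "delete":
        sf.Account.delete(record.pop("Id"))
    else:
        raise ValueError(f"Unknown _sdc_action: {action}")
```

Popping `_sdc_action` (and `Id` where the API takes it separately) before the call keeps the routing metadata out of the payload Salesforce actually sees.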
The actual magic comes in at the transform layer, which we'll end up wanting to write some abstractions for, but after you get there let me know and we can chat! 😄
Beauty of this approach, though, is that targets stay extremely simple and dumb. And it forces you to think of idempotent ways to handle sending data to your target. There are some things that are a bit tricky and would be nice to have, like knowing what the target is actually attempting to send; it's almost like the target should behave like a mapper, taking data in and outputting data back to your wh for logging.
Highly recommend just going for the simple approach you're after; it works wonders, and the other hard things can wait if you have a decent transformation design!
d
Yea, I'm doing validation that ensures all fields in the stream exist in the SF object. It just pops and logs invalid fields for now, but it would be nice if it could do more proactively and log the expected schema back to the wh. I haven't touched type validation yet, though.
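A rough sketch of that pop-and-log validation step, assuming simple-salesforce's `describe()` metadata; the function name and logging are illustrative:

```python
import logging

from simple_salesforce import Salesforce

logger = logging.getLogger(__name__)

def drop_unknown_fields(sf: Salesforce, record: dict) -> dict:
    """Pop and log any record fields that don't exist on the Account object."""
    # In practice you'd cache this set rather than calling describe() per record.
    valid_fields = {f["name"] for f in sf.Account.describe()["fields"]}
    for field in list(record):
        if field not in valid_fields:
            logger.warning("Dropping unknown Account field: %s", field)
            record.pop(field)
    return record
```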
I think I'll probably go the `_sdc_action` route and batch them in their own `Account-delete.jsonl`, `Account-update.jsonl`, ... files, and then write the files out when reaching 10k Account records.
v
Everything makes sense to me! Except I'm confused about why you'd write to a `.jsonl` file and then reparse the file, but what do I know
d
What should I be batching it to? Just keep it in memory?
v
That's what I do, but I have no idea what anyone "should" do for much of anything lol
I'm just curious what benefit a `.jsonl` file has; if sfdc has some native way to import jsonl files, then that makes sense to me!
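For reference, a minimal sketch of the in-memory version: per-action buffers flushed through simple-salesforce's Bulk API at 10k records. The class and its names are hypothetical.

```python
from collections import defaultdict

from simple_salesforce import Salesforce

MAX_BATCH_SIZE = 10_000

class AccountBatcher:
    """Buffer records per action in memory and flush via the Bulk API."""

    def __init__(self, sf: Salesforce) -> None:
        self.sf = sf
        self.buffers: dict[str, list[dict]] = defaultdict(list)

    def add(self, record: dict) -> None:
        action = record.pop("_sdc_action", "update")
        self.buffers[action].append(record)
        if len(self.buffers[action]) >= MAX_BATCH_SIZE:
            self.flush(action)

    def flush(self, action: str) -> None:
        records, self.buffers[action] = self.buffers[action], []
        if records:
            # simple-salesforce exposes bulk insert/update/delete per object,
            # so the action name maps straight onto the bulk method.
            getattr(self.sf.bulk.Account, action)(records)
```

Remember to flush all remaining buffers at end of sync, since the last partial batch won't hit the 10k threshold.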
d
None! I'll reconsider that haha
v
😄 Good luck, hope it goes well! I have some examples but I can't share most quite yet. https://github.com/AutoIDM/target-apprise/blob/main/target_apprise/sinks.py is pretty darn close to what I do for all of them, minus the action field.
a
I'm late to this thread, but I wanted to mention there's a convention of a `_sdc_deleted_at` property which signifies a record was, or should be, deleted.
This is mostly used in things like LOG_BASED replication, where there's a clear, observable record of the deletion. You could use the same convention, and that could trigger a hard delete if you want. 🤷
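A tiny sketch of how that convention could plug into the action handling above; `_sdc_deleted_at` is the Singer convention, the rest is illustrative (simple-salesforce's Bulk API does expose a `hard_delete` operation):

```python
def action_for(record: dict) -> str:
    """Map the _sdc_deleted_at convention onto an action: a populated
    deletion timestamp triggers a hard delete, otherwise defer to the
    explicit _sdc_action field (defaulting to update)."""
    if record.get("_sdc_deleted_at"):
        return "hard_delete"
    return record.get("_sdc_action", "update")
```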
v
That convention implies a certain type of modeling requirement that is interesting, but I think it falls apart when you take into consideration that the number of records going to a SaaS target is relatively low, that you have much less control over a SaaS target than a DW target, and, extending the previous point, that things can change from lots and lots of places in SaaS targets