# singer-tap-development
d
I'm taking a stab at targets and throwing my hat into the ring with a target-salesforce 🤠 I'm debating how to handle update/insert/upsert/delete/hard deletes. Each type is a different API call with simple-salesforce.
• Does it make sense to have the streams segmented by type?
  ◦ Example: `Account-delete`, `Account` (default to update), `Account-upsert`, ...
• Or do I keep all the records in the `Account` stream and allow for an optional `_type` field to dictate the behavior? This seems cleaner (see the sketch after this list).
  ◦ This is easy if I write each record individually, but I'm not sure about sub-segmenting a batch sink. Any ideas? Is there a target that has already addressed something similar?
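For illustration, here is what the two options could look like as Singer RECORD messages, written out as Python dicts. A minimal sketch: the stream names, IDs, and the `_type` field are all hypothetical.

```python
# Option 1: the action is encoded in the stream name.
option_1 = [
    {"type": "RECORD", "stream": "Account-delete", "record": {"Id": "001AAA"}},
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001BBB", "Name": "Acme"}},
]

# Option 2: a single Account stream, with an optional per-record action field.
option_2 = [
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001BBB", "Name": "Acme", "_type": "update"}},
    {"type": "RECORD", "stream": "Account", "record": {"Id": "001AAA", "_type": "delete"}},
]
```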
v
I don't know what works best, but what I use, which works from your example, is an `Account` stream with an `action` column (probably a better name would be `_sdc_action`) with values of `insert`, `update`, and `delete`.
The thing that would be very helpful to have in these types of targets is a schema that the stream should conform to. It wouldn't be that crazy, I just think the record should have to conform to the schema `Account` accepts. Right now I just do `record["name"]`, `record["id"]`, etc., and you can run into typing issues with that.
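A minimal sketch of that action dispatch, assuming simple-salesforce and the `_sdc_action` field suggested above; the function and the default-to-update behavior are illustrative, not from an existing target:

```python
from simple_salesforce import Salesforce

def process_record(sf: Salesforce, record: dict) -> None:
    """Route a single Account record to the matching simple-salesforce
    call based on its _sdc_action value (defaulting to update)."""
    action = record.pop("_sdc_action", "update")
    if action == "insert":
        sf.Account.create(record)
    elif action == "update":
        sf.Account.update(record.pop("Id"), record)
    elif action == "delete":
        sf.Account.delete(record.pop("Id"))
    else:
        raise ValueError(f"Unknown _sdc_action: {action}")
```

Popping `_sdc_action` (and `Id` where the API takes it separately) before the call keeps the routing metadata out of the payload Salesforce actually sees.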
The actual magic comes in at the transform layer, which we'll end up wanting to write some abstractions for, but after you get there let me know and we can chat! 😄
Beauty of this approach, though, is that targets stay extremely simple and dumb. And it forces you to think of idempotent ways to handle sending data to your target. There are some things that are a bit tricky and would be nice to have, like knowing what the target is actually attempting to send; it's almost like the target should behave like a mapper, taking data in and outputting data back to your wh for logging.
Highly recommend just going for the simple approach you're after; it works wonders, and the other hard things can wait if you have a decent transformation design!
d
Yea, I'm doing validation that ensures all fields in the stream exist in the SF object. It just pops and logs invalid fields for now, but it would be nice if it could do more proactively and log the expected schema back to the wh. I haven't touched type validation yet, though.
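A rough sketch of that pop-and-log validation step, assuming simple-salesforce's `describe()` metadata; the function name and logging are illustrative:

```python
import logging

from simple_salesforce import Salesforce

logger = logging.getLogger(__name__)

def drop_unknown_fields(sf: Salesforce, record: dict) -> dict:
    """Pop and log any record fields that don't exist on the Account object."""
    # In practice you'd cache this set rather than calling describe() per record.
    valid_fields = {f["name"] for f in sf.Account.describe()["fields"]}
    for field in list(record):
        if field not in valid_fields:
            logger.warning("Dropping unknown Account field: %s", field)
            record.pop(field)
    return record
```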
I think I'll probably go the `_sdc_action` route and batch them in their own `Account-delete.jsonl`, `Account-update.jsonl`, ... files, and then write the files out when reaching 10k Account records.
v
Everything makes sense to me! Except I'm confused about why you'd write to a `.jsonl` file and then reparse the file, but what do I know
d
What should I be batching it to? Just keep it in memory?
v
That's what I do, but I have no idea what anyone "should" do for much of anything lol
I'm just curious what benefit a `.jsonl` file has; if sfdc has some native way to import jsonl files, then that makes sense to me!
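For reference, a minimal sketch of the in-memory version: per-action buffers flushed through simple-salesforce's Bulk API at 10k records. The class and its names are hypothetical.

```python
from collections import defaultdict

from simple_salesforce import Salesforce

MAX_BATCH_SIZE = 10_000

class AccountBatcher:
    """Buffer records per action in memory and flush via the Bulk API."""

    def __init__(self, sf: Salesforce) -> None:
        self.sf = sf
        self.buffers: dict[str, list[dict]] = defaultdict(list)

    def add(self, record: dict) -> None:
        action = record.pop("_sdc_action", "update")
        self.buffers[action].append(record)
        if len(self.buffers[action]) >= MAX_BATCH_SIZE:
            self.flush(action)

    def flush(self, action: str) -> None:
        records, self.buffers[action] = self.buffers[action], []
        if records:
            # simple-salesforce exposes bulk insert/update/delete per object,
            # so the action name maps straight onto the bulk method.
            getattr(self.sf.bulk.Account, action)(records)
```

Remember to flush all remaining buffers at end of sync, since the last partial batch won't hit the 10k threshold.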
d
None! I'll reconsider that haha
v
😄 Good luck, hope it goes well! I have some examples but I can't share most quite yet. https://github.com/AutoIDM/target-apprise/blob/main/target_apprise/sinks.py is pretty darn close to what I do for all of them, minus the action field.
a
I'm late to this thread, but I wanted to mention there's a convention of a `_sdc_deleted_at` property which signifies a record was, or should be, deleted.
This is mostly used in things like LOG_BASED replication, where there's a clear, observable record of the deletion. You could use the same convention, and that could trigger a hard delete if you want. 🤷
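A tiny sketch of how that convention could plug into the action handling above; `_sdc_deleted_at` is the Singer convention, the rest is illustrative (simple-salesforce's Bulk API does expose a `hard_delete` operation):

```python
def action_for(record: dict) -> str:
    """Map the _sdc_deleted_at convention onto an action: a populated
    deletion timestamp triggers a hard delete, otherwise defer to the
    explicit _sdc_action field (defaulting to update)."""
    if record.get("_sdc_deleted_at"):
        return "hard_delete"
    return record.get("_sdc_action", "update")
```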
v
That convention implies a certain type of modeling requirement that is interesting, but I think it falls apart when you take into consideration that the number of records going to a SaaS target is relatively low, that you have much less control over a SaaS target than a DW target, and, extending the previous point, that things can change from lots and lots of places in SaaS targets