# singer-tap-development
v
(singer-sdk question) What if I have a replication_key that isn't sortable because it's a GUID, but I don't want to restart from scratch if the stream is interrupted, since there could be a lot of data? I dove into the code and didn't have any immediate ideas for how to do this quickly, so I thought I'd shoot over a question.
In this case it's a cursor that we "should" be able to trust
a
You can override this function, either via monkey patch or by overriding this method on the Stream class to call your implementation. In your updated function you can drop the `>=` check here in the short-circuited comparison. That way state updates on each record, as a presumption for your specific stream class.
v
Yeah, I don't know why I was hunting so hard for a different solution. Thanks @alexander_butler
e
I won’t encourage overriding the guts of the SDK state management 😅, so wouldn’t setting the `check_sorted` attribute of the stream to `False` work? A GitHub issue would be appreciated if it doesn’t.
v
@edgar_ramirez_mondragon My understanding from reading through the function is that `check_sorted = False` makes it so the replication_key_value isn't incremented along.
a
That was my impression too, but I could've read it wrong? Ternaries like that can be a bit confusing lol
v
TIL there's `is_sorted` and `check_sorted`. Oh my, I thought they were the same thing.
Still trying to see if it's true or not ha, @edgar_ramirez_mondragon has me questioning everything
e
`check_sorted = False` makes this expression true:
old_rk_value is None or not check_sorted or new_rk_value >= old_rk_value:
so it’ll increment the replication key
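To make the short-circuit concrete, here is a minimal standalone sketch of that condition. The function name `should_advance_bookmark` and the cursor values are made up for illustration; this is a simplified stand-in, not the SDK's actual state-handling code.

```python
from typing import Any, Optional


def should_advance_bookmark(
    old_rk_value: Optional[Any],
    new_rk_value: Any,
    check_sorted: bool,
) -> bool:
    """Hypothetical helper mirroring the condition quoted above."""
    # `not check_sorted` short-circuits the `or` chain, so the `>=`
    # comparison is never evaluated when check_sorted is False.
    return old_rk_value is None or not check_sorted or new_rk_value >= old_rk_value


# GUID-like cursors that do not arrive in sortable order (made-up values).
old_cursor = "f47ac10b"
new_cursor = "0a1b2c3d"

print(should_advance_bookmark(old_cursor, new_cursor, check_sorted=True))   # False
print(should_advance_bookmark(old_cursor, new_cursor, check_sorted=False))  # True
print(should_advance_bookmark(None, new_cursor, check_sorted=True))         # True
```

With `check_sorted=True` the out-of-order cursor fails the `>=` test, while `check_sorted=False` lets the bookmark advance regardless of ordering.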
v
Reading through it again, `is_sorted = True` and `check_sorted = False` should give me what I'm after. I was wrong! Somehow I didn't see the is / check distinction on my multiple reads through this!!
Thank you @edgar_ramirez_mondragon even better, no action needed
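For anyone landing here later, a rough configuration sketch of a stream with the two flags set as discussed. The stream name `events`, the `cursor` field, and the empty `get_records` body are placeholders, and this assumes the SDK version in use exposes `is_sorted` and `check_sorted` as overridable attributes on `Stream`, so check it against your installed version.

```python
from singer_sdk import Stream
from singer_sdk import typing as th


class EventsStream(Stream):
    """Hypothetical stream whose replication key is an opaque GUID cursor."""

    name = "events"              # placeholder stream name
    replication_key = "cursor"   # placeholder field holding the GUID cursor

    # Treat the stream as ordered so the bookmark is usable for resuming...
    is_sorted = True
    # ...but skip the `>=` ordering check, since GUIDs do not compare meaningfully.
    check_sorted = False

    schema = th.PropertiesList(
        th.Property("cursor", th.StringType),
    ).to_dict()

    def get_records(self, context):
        # Placeholder: a real tap would page through the source API here.
        yield from ()
```

This only makes sense when the source's cursor can be trusted to represent a resumable position, as in the original question.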
a
Thanks for the clarification @edgar_ramirez_mondragon
v
I'll write up something as an issue so we can get it documented.
e
Thanks @visch!
v
So my gut feeling was right, just need to get the words out for why lol. Override = bad if possible, but for some reason I was so stuck in a different mindset for this one. Thanks again @alexander_butler and @edgar_ramirez_mondragon really appreciate it
d
I had something similar, but in my case I had to go backwards from the most recent entries until it reaches the previously saved state key. `is_sorted=True` / `check_sorted=False` didn’t help though. Is there a way to preserve the last successful state’s key to run from it again after the stream is interrupted?