# troubleshooting
c
Hello, how can I create a state object in a tap manually? For example, I'm backfilling data in chunks with a 1-day interval.
Chunk1
from date : 2023-01-01
end date : 2023-01-02

Chunk2
from date : 2023-01-02
end date : 2023-01-03

Chunk3
from date : 2023-01-03
end date : 2023-01-04
I want to send a state message with the updated replication key value (2023-01-02, 2023-01-03, 2023-01-04) at the end of each chunk. Is there a simple way to do this from the tap's client.py?
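For illustration, the chunking and per-chunk state described above can be sketched in plain Python, without the SDK. This is only a sketch under assumptions: the stream name my_stream and the replication key updated_at are hypothetical, and the bookmark layout follows the shape the SDK commonly writes (bookmarks, then stream name, then replication_key and replication_key_value). It does not show how to hook this into client.py.

```python
import json
from datetime import date, timedelta


def day_chunks(start: date, end: date):
    """Yield (from_date, end_date) pairs covering [start, end) in 1-day steps."""
    current = start
    while current < end:
        nxt = min(current + timedelta(days=1), end)
        yield current, nxt
        current = nxt


def state_message(stream: str, replication_key: str, value: date) -> str:
    """Build a Singer STATE message as a single JSON line."""
    return json.dumps(
        {
            "type": "STATE",
            "value": {
                "bookmarks": {
                    stream: {
                        "replication_key": replication_key,
                        "replication_key_value": value.isoformat(),
                    }
                }
            },
        }
    )


messages = []
for from_date, end_date in day_chunks(date(2023, 1, 1), date(2023, 1, 4)):
    # ... fetch and emit the RECORD messages for this chunk here ...
    # "my_stream" and "updated_at" are hypothetical names for illustration.
    messages.append(state_message("my_stream", "updated_at", end_date))

for line in messages:
    print(line)
```

This produces one STATE line per chunk, carrying 2023-01-02, 2023-01-03 and 2023-01-04 in turn. Whether it is safe to emit them depends on every record up to each end date having been loaded first.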
u
Some taps implement an end_date setting, but I don't think there's an easy way to do this if it's not implemented. Can you share more about why you need to export in chunks like this?
c
In some taps the data isn't sorted, so I set is_sorted to false, and the tap then sends state messages with a progress_marker key. With this, if the backfill gets interrupted, it restarts the whole backfill process, because progress_marker never gets purged.
Is there any way to achieve this? I want to set the state value manually from the tap and send it to the loader.
u
@chintan_patel I believe this is a safety feature in the SDK. If the stream is unsorted, it won't write out state until it's confident that all data up to that date has been loaded, and for an unsorted stream the entire stream needs to be synced before it can be confident of that. If you choose to manually write out state bookmarks before the sync is completed, it could cause data loss. Is there a way to design the tap to sort your streams?
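To make the data-loss risk concrete, here is a small simulation in plain Python (not SDK code). It assumes a hypothetical unsorted stream of four records and a sync that saves the highest updated_at seen so far as its bookmark, then gets interrupted partway through.

```python
def sync(records, bookmark=None, fail_after=None):
    """Load records newer than `bookmark`; optionally stop after N records.

    Returns (loaded_ids, high_water), where high_water is the largest
    replication key value seen, i.e. what an early state write would save.
    """
    pending = [r for r in records if bookmark is None or r["updated_at"] > bookmark]
    loaded = []
    high_water = bookmark
    for count, rec in enumerate(pending):
        if fail_after is not None and count >= fail_after:
            break  # simulated interruption
        loaded.append(rec["id"])
        if high_water is None or rec["updated_at"] > high_water:
            high_water = rec["updated_at"]
    return loaded, high_water


# Hypothetical source data, deliberately out of order.
unsorted_records = [
    {"id": 1, "updated_at": "2023-01-01"},
    {"id": 2, "updated_at": "2023-01-03"},
    {"id": 3, "updated_at": "2023-01-02"},
    {"id": 4, "updated_at": "2023-01-04"},
]

# First run is interrupted after two records, with the bookmark already saved.
first_run, saved_bookmark = sync(unsorted_records, fail_after=2)
# Second run resumes from the saved bookmark.
second_run, _ = sync(unsorted_records, bookmark=saved_bookmark)

all_ids = {r["id"] for r in unsorted_records}
missing = sorted(all_ids - set(first_run) - set(second_run))
print(first_run, saved_bookmark, second_run, missing)
```

The first run loads records 1 and 2 and saves 2023-01-03 as the bookmark. The resumed run only picks up record 4, so record 3 (dated 2023-01-02) is never loaded.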
u
This thread might be helpful for ideas around batching to force sorted streams: https://meltano.slack.com/archives/C01PKLU5D1R/p1682012587563729?thread_ts=1681998358.444049&cid=C01PKLU5D1R
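One way the batching idea might look, sketched in plain Python: request the data in small, ordered, non-overlapping windows and sort each window in memory before yielding it. This is an assumption about the general approach, not a summary of the linked thread. fetch_window is a hypothetical stand-in for the real API call, and it assumes the source can filter by the replication key.

```python
from datetime import date, timedelta


def fetch_window(source, start, end):
    """Hypothetical stand-in for an API call: rows with start <= updated_at < end."""
    return [row for row in source if start <= row["updated_at"] < end]


def sorted_records(source, start, end, step=timedelta(days=1)):
    """Yield rows in ascending `updated_at` order, one window at a time."""
    window_start = start
    while window_start < end:
        window_end = min(window_start + step, end)
        batch = fetch_window(source, window_start, window_end)
        # Sorting each small batch in memory is enough: windows are disjoint
        # and visited in order, so the overall output is sorted too.
        yield from sorted(batch, key=lambda row: row["updated_at"])
        window_start = window_end


# Hypothetical unsorted source rows.
rows = [
    {"id": "a", "updated_at": date(2023, 1, 3)},
    {"id": "b", "updated_at": date(2023, 1, 1)},
    {"id": "c", "updated_at": date(2023, 1, 2)},
    {"id": "d", "updated_at": date(2023, 1, 1)},
]
ordered = [row["id"] for row in sorted_records(rows, date(2023, 1, 1), date(2023, 1, 4))]
print(ordered)
```

If a stream can yield records this way, it could plausibly be declared sorted (is_sorted set to true), which would let state be written as the sync progresses instead of only at the end.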