andrey_tatarinov
02/24/2023, 7:42 AM
event_date to each event and rely on built-in functionality?
• How do I ensure that bookmark is emitted at the end of the batch?
andrey_tatarinov
02/24/2023, 8:34 AM
event_date if my Stream is based on RESTStream.
My intuition is that I should do it in post_process, but I do not have a reliable event_date inside the response. I would like to explicitly set the value that I provided in get_url_params.
Currently I do not understand how to pass state from get_url_params to post_process.
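One way to approach both questions above, sketched in plain Python without importing singer_sdk: loop over the dates in a custom record generator, stamp each row with the date that was requested, and emit a bookmark once that day's batch is complete. Everything here (DailyReportStream, the fetch callable, the "date" query parameter, the emitted_state list and the sample rows) is hypothetical and only mirrors the shape of a RESTStream subclass; it is not the SDK's API.

```python
from datetime import date, timedelta
from typing import Callable, Iterable, Iterator


class DailyReportStream:
    """Hypothetical stand-in for a RESTStream subclass (not the SDK API)."""

    def __init__(self, fetch: Callable[[dict], Iterable[dict]]) -> None:
        self.fetch = fetch  # assumed helper: performs one API call per day
        self.emitted_state: list = []  # stands in for emitted STATE messages

    def get_url_params(self, day: date) -> dict:
        # The requested date is known here, at query-building time.
        return {"date": day.isoformat()}

    def post_process(self, row: dict, day: date) -> dict:
        # Stamp the record with the date we asked for, not one from the response.
        return {**row, "event_date": day.isoformat()}

    def request_records(self, start: date, end: date) -> Iterator[dict]:
        day = start
        while day <= end:
            for row in self.fetch(self.get_url_params(day)):
                yield self.post_process(row, day)
            # The batch for this day is complete, so record a bookmark.
            self.emitted_state.append({"replication_key_value": day.isoformat()})
            day += timedelta(days=1)


def fake_fetch(params: dict) -> list:
    # Sample data only; a real stream would call the API with these params.
    return [{"clicks": 1}, {"clicks": 2}]


stream = DailyReportStream(fake_fetch)
records = list(stream.request_records(date(2023, 2, 22), date(2023, 2, 23)))
```

Because the date is passed explicitly alongside each row, nothing has to be stored on the stream object between building the query and processing the response.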
andrey_tatarinov
02/24/2023, 12:16 PM
request_records with my custom logic
andrey_tatarinov
02/24/2023, 12:17 PM
self.finalize_state_progress_markers()
self._write_state_message()
To send proper state change messages after each chunk
visch
02/24/2023, 1:08 PM
visch
02/24/2023, 1:09 PM
andrey_tatarinov
02/24/2023, 3:09 PM
andrey_tatarinov
02/24/2023, 3:09 PM
andrey_tatarinov
02/24/2023, 3:11 PM
visch
02/24/2023, 3:12 PM
tate file contains duplicate entries for partition: {state_partition_context}. cmd_type=extractor name=tap-indeed-retractedd run_id=e08ff2e0-9117-4b20-a115-a4c9820840e4 state_id=extract-tap-indeed-retractedd stdio=stderr"
24 February 2023,12:09:37 MST,prefect.extract/load: indeed master account c,INFO,"2023-02-24T07:09:37.048874Z [info] Matching state values were: [{'context': {'_sdc_employer_id': 'retracted'}, 'replication_key': '_sdc_start_date', 'replication_key_value': '2023-02-22'}, {'context': {'_sdc_employer_id': 'retracted'}, 'replication_key': '_sdc_start_date', 'replication_key_value': '2023-02-23'}] cmd_type=extractor name=tap-indeed-retractedd run_id=e08ff2e0-9117-4b20-a115-a4c9820840e4 state_id=extract-tap-indeed-retractedd
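The log above reports two state entries that share the same partition context but carry different replication_key_value values. A minimal sketch of how such duplicates could be spotted in a state payload before a run follows. The find_duplicate_partitions helper is hypothetical, and the partitions list is copied from the log line rather than read through any SDK call.

```python
import json
from collections import defaultdict


def find_duplicate_partitions(partitions: list) -> dict:
    """Group partition state entries by context; keep contexts seen more than once."""
    by_context = defaultdict(list)
    for entry in partitions:
        # Serialise the context with sorted keys so equal dicts produce equal keys.
        key = json.dumps(entry.get("context", {}), sort_keys=True)
        by_context[key].append(entry)
    return {key: entries for key, entries in by_context.items() if len(entries) > 1}


# Entries as reported in the "Matching state values were" log line.
partitions = [
    {
        "context": {"_sdc_employer_id": "retracted"},
        "replication_key": "_sdc_start_date",
        "replication_key_value": "2023-02-22",
    },
    {
        "context": {"_sdc_employer_id": "retracted"},
        "replication_key": "_sdc_start_date",
        "replication_key_value": "2023-02-23",
    },
]
duplicates = find_duplicate_partitions(partitions)
```

Here duplicates has a single key for the shared context, mapped to both entries, which matches what the error message describes.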
visch
02/24/2023, 3:12 PM
visch
02/24/2023, 3:13 PM
andrey_tatarinov
02/24/2023, 3:14 PM
request_records and found a field in data that correlates with my queries: https://github.com/epoch8/tap-appmetrica/blob/master/tap_appmetrica/client.py#L66
aaronsteers
02/26/2023, 2:16 AM
andrey_tatarinov
02/26/2023, 11:52 AM
aaronsteers
02/26/2023, 8:28 PM
aaronsteers
02/26/2023, 8:41 PM
replication_key exists on the records themselves. I am curious if this is throwing an error for you, or if you have perhaps worked around that successfully.
2. The context param dict could in theory have something like "server_capture_window_end_date" injected into it (which could then be added to records via post_process()), but modifying context is not a previously tested pattern and might have implications I'm not thinking of.
3. The combination of simultaneously tracking finalized and non-finalized state markers in the same stream is another pattern not tested within the SDK. It might work totally fine, but I cannot say for sure without diving deeper into the code.
aaronsteers
02/26/2023, 8:45 PM
visch
03/01/2023, 2:50 AM
context to inject things which breaks how state is handled today in some ways
andrey_tatarinov
03/02/2023, 3:49 PM
visch
03/02/2023, 3:49 PM
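The concern raised earlier in the thread about injecting values into context can be illustrated with a small plain-Python sketch. It assumes, as the state entries in the log suggest, that state is looked up by the context dict, so the sketch derives an enriched copy for post-processing and leaves the partition context untouched. The with_window_end helper and the sample row are hypothetical; only the "server_capture_window_end_date" key comes from the discussion.

```python
from datetime import date


def with_window_end(context: dict, window_end: date) -> dict:
    """Return an enriched copy of context; the original dict is not modified."""
    return {**context, "server_capture_window_end_date": window_end.isoformat()}


def post_process(row: dict, context: dict) -> dict:
    # Copy the injected value from the (enriched) context onto the record.
    return {
        **row,
        "server_capture_window_end_date": context["server_capture_window_end_date"],
    }


# The partition context keeps only the keys that identify the partition.
partition_context = {"_sdc_employer_id": "retracted"}
enriched = with_window_end(partition_context, date(2023, 2, 23))
record = post_process({"clicks": 3}, enriched)  # sample row, not real data
```

Keeping the two dicts separate means whatever identifies the partition stays stable across runs, while records still carry the requested window date.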