adam_roderick
12/02/2021, 3:00 PM
adam_roderick
12/02/2021, 3:07 PM
`is_sorted` does not work for me, and it looks like `progress_tracking` will not work either, because "This is used to track the max value seen for the replication_key during the current sync."
adam_roderick
12/02/2021, 3:17 PM
adam_roderick
12/02/2021, 3:17 PM
taylor
12/02/2021, 4:39 PM
aaronsteers
12/02/2021, 4:49 PM
adam_roderick
12/03/2021, 3:31 PM
adam_roderick
12/03/2021, 3:32 PM
aaronsteers
12/03/2021, 6:07 PM
aaronsteers
12/03/2021, 6:12 PM
adam_roderick
12/03/2021, 7:06 PM
adam_roderick
12/03/2021, 7:06 PM
adam_roderick
12/03/2021, 7:06 PM
aaronsteers
12/03/2021, 7:29 PM
`dict` that defines the pagination token and its end point:
`next_page_token = { "stop_point": self.get_starting_timestamp(), "current_page": 1 }`
If the result of the request gives data past the stop point, return `None` from `get_next_page_token()` and that'll signal to stop the loop.
aaronsteers
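A minimal sketch of the stop-point idea described above, written as a standalone function so it runs without the SDK. The function name, its parameters, and the string comparison of timestamps are assumptions for illustration, not the SDK's actual `get_next_page_token()` signature. It assumes the API returns records sorted newest first and that `updated_at` values share one ISO-8601 UTC format, so they compare correctly as plain strings.

```python
from typing import Any, Dict, List, Optional

Token = Dict[str, Any]


def next_page_token(
    previous_token: Optional[Token],
    page_records: List[Dict[str, Any]],
    stop_point: Optional[str],
    replication_key: str = "updated_at",
) -> Optional[Token]:
    """Return the token for the next page, or None to stop paginating."""
    if not page_records:
        # An empty page means there is nothing left to fetch.
        return None
    if stop_point is not None and any(
        record[replication_key] <= stop_point for record in page_records
    ):
        # This page already reaches data at or before the stop point,
        # so later pages (older records) are not needed.
        return None
    current_page = previous_token["current_page"] if previous_token else 1
    return {"stop_point": stop_point, "current_page": current_page + 1}
```

With no stop point (for example on a first run with no saved state), the loop only ends when a page comes back empty.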
12/03/2021, 7:30 PM
aaronsteers
12/03/2021, 7:30 PM
adam_roderick
12/03/2021, 8:46 PM
adam_roderick
12/03/2021, 8:46 PM
aaronsteers
12/03/2021, 11:43 PM
adam_roderick
12/04/2021, 2:45 PM
adam_roderick
12/04/2021, 2:50 PM
`None` from `get_next_page_token` and the next thing that happens is an info message, then the error
```
--------- end of 'get_next_page_token'. Returning next_page_token: {'stop_point': None, 'current_page': 11}
--------- end of 'get_url_params', returning params: {'page[size]': 2, 'sort': '-updated_at', 'page[number]': 11}
time=2021-12-04 074710 name=tap-krow level=INFO message=INFO METRIC: {'type': 'timer', 'metric': 'http_request_duration', 'value': 0.10503, 'tags': {'endpoint': '/organizations', 'http_status_code': 200, 'status': 'succeeded'}}
{"type": "RECORD", "stream": "organizations", "record": {"id": "2434b5b1-dcb3-400a-a874-bbdb789af2f0", "name": "the Emoji Movie Cafe Gift Shop The Game", "created_at": "2021-10-13T220340.256Z", "updated_at": "2021-11-11T224250.181Z"}, "time_extracted": "2021-12-04T144710.584584Z"}
{"type": "RECORD", "stream": "organizations", "record": {"id": "128cb7f0-afbc-46fc-8459-5d22a32ab614", "name": "Treebeard's Tap House", "created_at": "2021-10-12T190326.888Z", "updated_at": "2021-11-11T224250.167Z"}, "time_extracted": "2021-12-04T144710.585137Z"}
--------- end of 'get_next_page_token'. Returning next_page_token: {'stop_point': None, 'current_page': 12}
--------- end of 'get_url_params', returning params: {'page[size]': 2, 'sort': '-updated_at', 'page[number]': 12}
time=2021-12-04 074710 name=tap-krow level=INFO message=INFO METRIC: {'type': 'timer', 'metric': 'http_request_duration', 'value': 0.119964, 'tags': {'endpoint': '/organizations', 'http_status_code': 200, 'status': 'succeeded'}}
{"type": "RECORD", "stream": "organizations", "record": {"id": "10415ac7-058d-4811-bbb1-43a8488ad440", "name": "311 has grassroots, true", "created_at": "2021-10-26T201019.638Z", "updated_at": "2021-11-11T224250.157Z"}, "time_extracted": "2021-12-04T144710.712124Z"}
{"type": "RECORD", "stream": "organizations", "record": {"id": "71ba16a3-c4c6-4c16-9ea1-8b63bacee9ba", "name": "kobe", "created_at": "2021-06-24T193747.738Z", "updated_at": "2021-11-11T224009.532Z"}, "time_extracted": "2021-12-04T144710.712615Z"}
--------- end of 'get_next_page_token'. Returning next_page_token: None
time=2021-12-04 074710 name=tap-krow level=INFO message=INFO METRIC: {'type': 'counter', 'metric': 'record_count', 'value': 24, 'tags': {'stream': 'organizations'}}
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 474, in cli
    tap.sync_all()
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 343, in sync_all
    stream.sync()
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 984, in sync
    self._sync_records(context)
  File "/home/adam/.cache/pypoetry/virtualenvs/tap-krow-YeTW9i77-py3.8/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 958, in _sync_records
    self._write_state_message()
  File "/home/adam/.cache/pypoetry/virtualenvs/t…
```
adam_roderick
12/04/2021, 2:50 PM
adam_roderick
12/04/2021, 2:51 PM
adam_roderick
12/06/2021, 12:46 PM
`previous_token = {"stop_point": self.get_starting_timestamp(self.stream_state), "current_page": 1}`
then I do not see the circular reference error. Digging into the `get_starting_timestamp` function now
adam_roderick
12/09/2021, 11:47 AM
print(
o["value"]["bookmarks"]["organizations"]["partitions"][0]["context"]["partitions"][0]["context"]["partitions"][0]['context']['partitions']
)
adam_roderick
12/09/2021, 11:52 AM
`self.get_starting_timestamp(self.get_context_state(None))`
it looks like the SDK is trying to enable partitions. If I exclude that call, then the final state message looks as I expect (and there is no circular reference error)
{
  "type": "STATE",
  "value": {
    "bookmarks": {
      "organizations": {
        "replication_key": "updated_at",
        "replication_key_value": "2020-11-30T21:44:28.839Z"
      }
    }
  }
}
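For illustration, the repeating `partitions` / `context` / `partitions` chain printed earlier is the shape a self-referencing dict produces. The sketch below is an assumption about the mechanism, not a trace of the SDK's code: it stores a state dict as the `context` of its own partition, then serializes it with Python's built-in `json` module. The helper name `try_dump` is hypothetical.

```python
import json

# A stream state similar in shape to the bookmark above.
state = {"replication_key": "updated_at"}

# Assumption for illustration: the state dict ends up stored as the
# "context" of one of its own partitions, so it now contains itself.
state["partitions"] = [{"context": state}]

# The nested lookup can be repeated forever and always lands on the same dict.
nested = state["partitions"][0]["context"]["partitions"][0]["context"]


def try_dump(obj):
    """Serialize obj to JSON, returning the error text if it fails."""
    try:
        return json.dumps(obj)
    except ValueError as exc:
        return f"ValueError: {exc}"


result = try_dump(state)
print(result)
```

A state dict without the self-reference, like the STATE message above, serializes without error.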
adam_roderick
12/09/2021, 12:00 PM
`self.get_starting_timestamp(None)`
does not show the same circular reference behavior, treats state the way I expect (without partitions), and pulls in the replication key value from the state file
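As a rough sketch of what "pulls in the replication key value from the state file" amounts to for a non-partitioned stream, the helper below reads the bookmark out of a STATE message shaped like the one shown earlier. `bookmark_timestamp` is a hypothetical name and this is not the SDK's `get_starting_timestamp` implementation; it only assumes the bookmark layout from that message.

```python
import json
from datetime import datetime
from typing import Optional

STATE_MESSAGE = json.loads(
    """
    {
      "type": "STATE",
      "value": {
        "bookmarks": {
          "organizations": {
            "replication_key": "updated_at",
            "replication_key_value": "2020-11-30T21:44:28.839Z"
          }
        }
      }
    }
    """
)


def bookmark_timestamp(state_message: dict, stream: str) -> Optional[datetime]:
    """Return the stream's saved replication key value as a datetime, if any."""
    bookmarks = state_message.get("value", {}).get("bookmarks", {})
    raw = bookmarks.get(stream, {}).get("replication_key_value")
    if raw is None:
        return None  # no bookmark yet, e.g. on a first run
    # datetime.fromisoformat() on older Pythons rejects a trailing "Z",
    # so swap it for an explicit UTC offset first.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


start = bookmark_timestamp(STATE_MESSAGE, "organizations")
```

A stream with no bookmark returns `None`, which matches the `'stop_point': None` seen in the log output earlier in the thread.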