josh_lloyd
11/29/2021, 8:22 PM
`--log-level=debug`):
tap-pendo | extractor | time=2021-11-29 16:56:24 name=tap-pendo level=INFO message=Beginning incremental sync of 'guideEvents' with context: {'guideId': '<abcDEF>'}...
And then it will run for days (literally) without ever making another log.
At first, I assumed this had something to do with the recent release of the SDK (this commit in particular). But I can see other places in the logs where the error logs from the new `validate_response` method are being produced.
I don’t know exactly how to reproduce this because the API I’m using doesn’t produce this behavior on a predictable basis. All I know is that the tap is built for the Pendo API. Without any useful logs I don’t know where else to look to debug. Any thoughts?
aaronsteers
11/29/2021, 8:51 PM
edgar_ramirez_mondragon
11/29/2021, 9:24 PM
josh_lloyd
11/29/2021, 10:15 PM
josh_lloyd
11/29/2021, 10:17 PM
josh_lloyd
11/29/2021, 10:21 PM
aaronsteers
11/29/2021, 10:29 PM
aaronsteers
11/29/2021, 10:32 PM
`_request` and the decorator is added by overriding `request_decorator`: singer_sdk/streams/rest.py · main · Meltano / Meltano SDK for Singer Taps and Targets · GitLab
aaronsteers
11/29/2021, 10:33 PM
aaronsteers
11/29/2021, 10:35 PM
josh_lloyd
11/29/2021, 10:57 PM
tap-pendo | extractor | time=2021-11-29 21:18:49 name=backoff level=INFO message=Backing off _request(...) for 0.4s (singer_sdk.exceptions.RetriableAPIError: 502 Server Error: Bad Gateway for path: /aggregation)
My previous override was working just like this. Since that private method no longer exists I doubt this is causing the hang up, but just in case, I’ll remove it and try again.
If anything, the only significant difference I can see between your new backoff code and old override is that I’ve added the `giveup` param:
giveup=lambda e: e.response is not None and 400 <= e.response.status_code < 500,
but I’m not confident this explains the behavior.
edgar_ramirez_mondragon
11/29/2021, 11:09 PM
to prevent the silent failure or loop that's happening here. Quick thing would be to add some logging there.
josh_lloyd
12/03/2021, 6:12 PM
`requests` package does not implement a default timeout, so I overrode the `_request` function to include a timeout and added the timeout exception to the backoff decorator. Since doing that, I have not seen any more hangs in this location in the logs. It also sped up the average time to pipeline completion 🙂.
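A minimal sketch of the idea above: give the HTTP call a timeout so that a stalled connection raises an exception that can be logged and retried, rather than blocking indefinitely. It uses the standard library's `http.client` as a stand-in, not `requests` or the SDK's own code, so that it runs without extra packages. `open_silent_listener` and `fetch_with_timeout` are hypothetical names, and the plain retry loop only approximates what a backoff decorator would do.

```python
import http.client
import logging
import socket

logger = logging.getLogger("timeout_sketch")


def open_silent_listener():
    """Hypothetical helper: a local listening socket that never replies.

    It simulates an API endpoint that stalls after the request is sent.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen()
    return server


def fetch_with_timeout(host, port, path, timeout_s, max_tries=3):
    """Hypothetical helper: GET with a timeout, retrying when the timeout fires."""
    for attempt in range(1, max_tries + 1):
        conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
        try:
            conn.request("GET", path)
            return conn.getresponse().read()
        except (socket.timeout, TimeoutError):
            # Without a timeout, the read above would block with no log output.
            logger.warning(
                "Attempt %d/%d timed out after %.1fs", attempt, max_tries, timeout_s
            )
            if attempt == max_tries:
                raise  # retries exhausted: surface the timeout to the caller
        finally:
            conn.close()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    server = open_silent_listener()
    try:
        fetch_with_timeout(
            "127.0.0.1", server.getsockname()[1], "/aggregation", timeout_s=0.5
        )
    except (socket.timeout, TimeoutError):
        logger.error("Giving up after repeated timeouts")
    finally:
        server.close()
```

With `requests`, the analogous knob is the `timeout` argument on the request call. The timeout value and retry count here are arbitrary and would need tuning for a real API.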
Having done this, I wonder if it makes sense to add a default (but easily configurable) timeout within the SDK. Something high like 5 or 10 minutes of course, but at least in doing so developers would eventually get some log of what happened and then they can adjust the timeout limit to their liking. Just a thought.
aaronsteers
12/03/2021, 6:16 PM
edgar_ramirez_mondragon
12/03/2021, 6:38 PM
"I wonder if it makes sense to add a default (but easily configurable) timeout within the SDK. Something high like 5 or 10 minutes of course"
That sounds like a good idea