xiaozhou_wang
07/01/2025, 3:51 PM
- name: tap-github
  variant: meltanolabs
  config:
    flattening_enabled: false
    repositories:
    - XXXXXX
    - XXXXXX
    start_date: '2020-01-01'
  select:
  - commits.*
  - events.*
  - reviews.*
  - issues.*
  - pull_request_commits.*
  - pull_requests.*
It runs fine for about 40 minutes then hits the following error.
singer_sdk.exceptions.RetriableAPIError: 403 Client Error: b'{"message":"API rate limit exceeded for user ID 12345. If you reach out to GitHub Support for help, please include the request ID XXXXXXX and timestamp 2025-07-01 15:15:50 UTC.","documentation_url":"https://docs.github.com/rest/overview/rate-limits-for-the-rest-api","status":"403"}' (Reason: Forbidden) for path: /repos/XXXXXXX/pulls/12345/commits
I understand that this is due to a GitHub API rate limit being hit. My question: is there a way around this while still being able to pull the full history of data? tap-github doesn't seem to support an end_date parameter, and it also doesn't seem to support a limit. What I'm hoping for is for the task to run successfully and then for the bookmark to increment. That way, even if this requires multiple runs over many hours, it is possible to get through the full history chunk by chunk.
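For reference, a minimal sketch of how a client could decide when to pause instead of failing on the 403. It relies on the x-ratelimit-remaining and x-ratelimit-reset response headers that the GitHub REST API documents. The function name and the buffer argument are hypothetical illustrations; this is not how tap-github itself is implemented, and buffer is not claimed to match the tap's rate_limit_buffer setting.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str],
    buffer: int = 0,
    now: Optional[float] = None,
) -> float:
    """Return how long to sleep before the next request (0.0 if quota remains).

    `buffer` is a hypothetical safety margin: start waiting once the remaining
    quota drops to this value or below, rather than at exactly zero.
    """
    # Normalize header names so a plain dict works regardless of casing.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    reset_epoch = float(lowered.get("x-ratelimit-reset", "0"))

    if remaining > buffer:
        return 0.0

    current = time.time() if now is None else now
    # The reset header is a UTC epoch timestamp in seconds; never return a negative wait.
    return max(0.0, reset_epoch - current)
```

A caller would sleep for the returned number of seconds after each response and then continue, so a long backfill slows down at the limit instead of aborting.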
Not super clear what rate_limit_buffer does.
visch
07/01/2025, 4:39 PM
xiaozhou_wang
07/01/2025, 5:32 PM
visch
07/01/2025, 6:04 PM
additional_auth_tokens?
xiaozhou_wang
07/01/2025, 6:09 PM
visch
07/01/2025, 6:16 PM
Edgar Ramírez (Arch.dev)
07/01/2025, 6:16 PM
is_sorted = True in their stream classes to make interruptions safer.
xiaozhou_wang
07/01/2025, 6:23 PM
xiaozhou_wang
07/01/2025, 6:29 PM
since and until, which are timestamps to filter.
List Pull Requests uses sort and page size / page number.
List Reviews has neither (although typically there will be fewer of these than PRs).
xiaozhou_wang
07/01/2025, 6:35 PM
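To illustrate the since / until idea from the thread: GitHub's List commits endpoint accepts both parameters as ISO 8601 timestamps, so a full history could in principle be pulled as a series of bounded windows. Below is a rough sketch of generating those windows. The function name and window size are hypothetical, and nothing here is an existing tap-github option.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used for since/until


def date_windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[str, str]]:
    """Yield consecutive (since, until) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past `end`.
        upper = min(cursor + step, end)
        yield cursor.strftime(ISO_FORMAT), upper.strftime(ISO_FORMAT)
        cursor = upper


if __name__ == "__main__":
    first = datetime(2020, 1, 1, tzinfo=timezone.utc)
    last = datetime(2020, 3, 1, tzinfo=timezone.utc)
    for since, until in date_windows(first, last, timedelta(days=31)):
        print(since, until)
```

Each pair could be passed as the since and until query parameters of one bounded request series, with the bookmark advanced to until once that window finishes. As noted in the thread, this only helps for endpoints that expose such filters; List Pull Requests and List Reviews would need a different approach.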